LUC #10: Building Resilient Systems: Overload & Misuse Prevention
Proactive measures to prevent system misuse and resource overload
Welcome back to another edition of Level Up Coding’s newsletter.
In today’s issue:
Read time: 7 minutes
How do you prevent system misuse and resource overload?
Mass adoption is any system or application’s dream. But with that comes the risk of misuse and resource overload. Measures should be in place to ensure the quality of service across all users.
Last week, Twitter faced this exact problem. Their solution? Rate limiting, which restricts the number of requests a user or service can make to a system within a given time window.
While it's certainly a viable solution in many cases, it isn't the only one. Let's take a look at some alternatives that can be implemented in any system. Note that these solutions should be implemented defensively, ahead of time, to avoid scenarios where ad-hoc remedies are required.
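To make the rate limiting idea concrete, here is a minimal sketch of a token bucket, one common way to implement it. This is an illustration only, not how Twitter or any particular service does it; the class names, the `client_id` key, and the injectable `clock` parameter are all assumptions made for the example.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Allows bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, capacity: int, rate: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1  # spend one token on this request
            return True
        return False


class RateLimiter:
    """Keeps one bucket per client (hypothetical key: a user id or API key)."""

    def __init__(self, capacity: int, rate: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, client_id: str) -> bool:
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.rate, self.clock)
            self.buckets[client_id] = bucket
        return bucket.allow()
```

A request handler would call `limiter.allow(client_id)` and reject the request (for HTTP APIs, typically with a 429 status) when it returns `False`. The clock is injectable only so the behavior can be tested without real waiting.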
Throttling is a simple technique that slows down the processing of tasks in order to minimize resource consumption. It is often used in conjunction with quotas or rate limiting so that users aren't cut off from the service entirely; instead, the quality of service is lowered to a reasonable level.
This is a popular approach internet service providers take to minimize bandwidth congestion during peak traffic. Similarly, throttling requests on a server or API is also commonly done in software systems.
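As a rough sketch of throttling on the server side, the class below delays callers instead of rejecting them, so that tasks are spaced at least `min_interval` seconds apart. The `Throttle` name and its parameters are hypothetical, and the sketch is single-threaded; a production version would need locking and would usually throttle per user or per endpoint.

```python
import time
from typing import Callable


class Throttle:
    """Spaces calls at least `min_interval` seconds apart by delaying the caller."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = clock()  # earliest time the next call may proceed

    def wait(self) -> float:
        """Block until the next slot is free; return how long the caller was delayed."""
        now = self.clock()
        delay = max(0.0, self.next_allowed - now)
        if delay > 0:
            self.sleep(delay)
        # Reserve the slot after this one.
        self.next_allowed = max(now, self.next_allowed) + self.min_interval
        return delay
```

Calling `throttle.wait()` before processing each task keeps the service available to everyone while lowering the rate at which any burst is served.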
Authentication and Authorization
These are important security measures that minimize the risk of service misuse and denial-of-service (DoS) attacks. They also help identify and limit the access of bots and scraper accounts.
First, the requesting user or service is verified and identified using a username and password or more sophisticated methods such as 2FA. Once the requester has been identified, the system determines which resources they can access and their level of priority for the system’s resources (if applicable).
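The two steps can be sketched as follows: authentication checks a password against a salted hash, and authorization looks up what the identified user's role may do. The `User` type, the role names, and the `ROLE_PERMISSIONS` table are assumptions for illustration; a real system would normally rely on an established identity provider or framework rather than hand-rolled code.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical role table: which actions each role may perform.
ROLE_PERMISSIONS = {"viewer": {"read"}, "editor": {"read", "write"}}


@dataclass
class User:
    name: str
    role: str
    salt: bytes
    password_hash: bytes


def hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2 from the standard library; the iteration count is illustrative.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def make_user(name: str, password: str, role: str) -> User:
    salt = os.urandom(16)
    return User(name, role, salt, hash_password(password, salt))


def authenticate(users: Dict[str, User], name: str, password: str) -> Optional[User]:
    """Step 1: verify who the requester is."""
    user = users.get(name)
    if user is None:
        return None
    candidate = hash_password(password, user.salt)
    # Constant-time comparison avoids leaking information through timing.
    return user if hmac.compare_digest(candidate, user.password_hash) else None


def authorize(user: User, action: str) -> bool:
    """Step 2: decide what the identified requester may do."""
    return action in ROLE_PERMISSIONS.get(user.role, set())
```

Keeping the two checks separate means a request can fail for two distinct reasons: the requester is unknown, or the requester is known but not permitted.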
CAPTCHA aims to identify human requesters and deny access to bots. It does so by introducing human-solvable tests before granting access to the service or certain features. While this technique is a popular approach, its impact on the application’s accessibility is a notable consideration. Moreover, AI technology is making it increasingly difficult to distinguish between human requesters and bots.
Intrusion Detection and Prevention Systems
Specifically used to mitigate the risk of system attacks, this approach involves monitoring network traffic to identify malicious activity.
Intrusion Detection Systems (IDS) are used to alert and report on identified threats, whereas Intrusion Prevention Systems (IPS) aim to block them.
Beyond the requirement of identifying and blocking threats, some other solutions that prevent system overload are:
🔸 Load balancing: distribute requests across multiple servers.
🔸 Prioritization: ensure critical requests have priority access to system resources.
🔸 Circuit breaker pattern: stop calling a dependency that keeps failing, preventing retries that are likely to fail.
🔸 Concurrency limits: limit the number of connections that can be made to the system or the number of concurrently running tasks.
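Of the strategies listed above, the circuit breaker pattern is the least obvious from its one-line description, so here is a minimal sketch. After `failure_threshold` consecutive failures the breaker "opens" and skips calls for `reset_timeout` seconds, then lets a single trial call through. All names and default values are assumptions for the example, and the sketch is not thread-safe.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast instead of hitting the struggling dependency.
                raise CircuitOpenError("circuit open; call skipped")
            # Timeout elapsed: fall through and allow one trial call (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures in a row, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping calls to a downstream service as `breaker.call(fetch, url)` (where `fetch` is whatever function makes the request) gives the failing dependency time to recover instead of piling retries on top of it.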
Preventing system overload and misuse requires a full team effort to employ defensive engineering. The techniques mentioned above should be implemented carefully to ensure legitimate requests are not restricted. A combination of multiple strategies should be used to develop a full-system approach that suits your system’s unique use case.
How do you choose the right database? (recap)
The performance of your application can suffer if you choose the incorrect database type, and going back on a bad choice can be time-consuming and expensive.
There are several types of databases, each designed and optimized for specific use cases: relational, document, graph, columnar, key-value, and time-series, to name a few.
Considerations that should be made to choose the optimal database for your use case:
How structured is your data?
How often will the schema change?
What type of queries do you need to run?
How large is your dataset and do you expect it to grow?
How large is each record?
What is the nature of the operations you need to run? Is it read-heavy or write-heavy?
Which databases does your team have experience with?
Git branching strategies (recap)
Feature branching — Each new feature has its own branch. Once the changes are complete and merged in, the branch can be deleted.
Gitflow — Has branches for features, releases, and hotfixes, plus dedicated branches for production and development.
GitLab Flow — Branches are created for features, releases, and environments. The main branch is always production-ready.
GitHub Flow — Branches are created for new features, bug fixes, and experiments. The main branch is the only deployable branch and is kept production-ready.
Trunk-based — All branches other than the main branch are short-lived and merged within a defined timeframe (usually a day). Large features are built incrementally and hidden behind feature flags.
Boost the performance of your CI/CD pipelines (recap)
Hot tips for boosting the performance of your CI/CD pipelines
Start with identifying the bottlenecks and inefficiencies.
Identify processes that run in sequence and consider if they could instead run in parallel.
If possible, only run tests that relate to the set of changes.
Optimize your build process — check the efficiency of your build scripts, remove unnecessary dependencies, cache artifacts, and avoid unnecessary processes.
Ensure the infrastructure can support your pipeline to scale as needed.
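As one illustration of the tip about running only the tests related to a change, the sketch below maps changed files to test files using a naming convention. The convention (`src/foo.py` is covered by `tests/test_foo.py`) and the function name are assumptions; many build tools track real dependency graphs and do this far more accurately.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Set


def select_tests(changed_files: Iterable[str], known_tests: Set[str]) -> List[str]:
    """Return the test files that correspond to the changed files, by naming convention."""
    selected = set()
    for path in changed_files:
        p = PurePosixPath(path)
        if p.suffix != ".py":
            continue  # non-code changes (docs, config) select no tests in this sketch
        if p.name.startswith("test_"):
            candidate = str(p)  # a changed test file selects itself
        else:
            candidate = f"tests/test_{p.name}"  # assumed convention
        if candidate in known_tests:
            selected.add(candidate)
    return sorted(selected)
```

The list of changed files could come from something like `git diff --name-only`. A cautious pipeline would fall back to the full suite whenever a changed file has no matching test, since a purely convention-based mapping can miss indirect dependencies.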
That wraps up this week’s issue of Level Up Coding’s newsletter!
Join us again next week, when we’ll explore end-to-end encryption, communication strategies for microservices, and non-primitive data structures.