How Reserved Concurrency Can Save Your API from Crashing
Introduction
In modern cloud-native applications, AWS Lambda has become a cornerstone for building scalable, cost-effective, and event-driven architectures. Many teams use Lambda to power both synchronous workloads (like APIs) and asynchronous ones (like SQS-based processing). However, without proper configuration, a spike in one workload can unexpectedly impact the performance of another — especially when they share concurrency limits. This article walks you through a real-world scenario where an external system flooded an SQS queue, unintentionally bringing down a production API. We’ll explain how this happened, what could’ve prevented it, and how to protect your critical workloads using Reserved Concurrency and Max Concurrency settings in AWS Lambda.
Scenario: A Shared AWS Production Account
You’re running a modern serverless application architecture on AWS, and in your production account, you have two key workloads:
- Synchronous Workload: A customer-facing API built using API Gateway and AWS Lambda. It handles user requests in real-time.
- Asynchronous Workload: An SQS queue that triggers a Lambda function to process messages sent by an external system.
At normal traffic volumes, both systems coexist without issues. But this balance is fragile — especially when external integrations are involved.
The Problem: When One System Brings Down the Other
Suddenly, instead of the usual 5,000 messages per day, your SQS queue starts receiving 2,000 messages per minute — possibly due to a bug or retry storm from the external system.
Here’s what happens next:
- Lambda, connected to the SQS queue, scales aggressively to handle the sudden load.
- You haven’t defined a maximum concurrency limit for this SQS-triggered Lambda.
- Your AWS account hits the account-level Lambda concurrency limit (default: 1,000 concurrent executions).
- No other Lambda functions, including your production API Lambdas, can be invoked.
- Your web application crashes. Even though the API wasn’t at fault, it’s collateral damage from the SQS flood.
This is a classic noisy-neighbor problem in serverless architectures: one workload consumes a shared resource (account-level concurrency) and starves every other workload that depends on it.
The Solution: Reserved Concurrency & Max Concurrency
Reserved Concurrency for Critical APIs
Reserved concurrency guarantees that a set number of concurrent executions are always available for a specific Lambda function.
If your API Lambda has a reserved concurrency of 2, then:
- These 2 slots are dedicated exclusively to your API Lambda.
- Even if the rest of your account concurrency limit is exhausted, these 2 slots remain protected.
This means your API stays available, no matter what’s happening elsewhere.
```shell
aws lambda put-function-concurrency \
  --function-name <YourFunctionName> \
  --reserved-concurrent-executions 2
```
*Configuring reserved concurrency for a function incurs no additional charges.
*Reserving concurrency for a function reduces the concurrency pool available to other functions. For example, if you reserve 100 units of concurrency for function-a, other functions in your account must share the remaining 900 units, even if function-a doesn’t use all 100 reserved units.
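The arithmetic behind that footnote can be sketched in a few lines. The 1,000-unit default and the function name `function-a` come from the example above; everything else is illustrative:

```python
# Illustrative only: reserving concurrency partitions the account-level
# pool (default 1,000 concurrent executions per Region).
ACCOUNT_LIMIT = 1000

# Reservations are subtracted from the shared pool up front,
# whether or not the reserving function is actually running.
reservations = {"function-a": 100}

shared_pool = ACCOUNT_LIMIT - sum(reservations.values())
print(shared_pool)  # 900 units left for all unreserved functions
```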
Max Concurrency on SQS Lambda Event Source
To prevent unbounded scaling of SQS-triggered Lambdas, AWS lets you set a `MaximumConcurrency` value on the event source mapping.
```shell
aws lambda update-event-source-mapping \
  --uuid YourEventSourceMappingUUID \
  --scaling-config '{"MaximumConcurrency": 50}'
```
This caps how many concurrent invocations of the consumer function the queue can drive, helping you isolate failures and contain traffic spikes.
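A quick back-of-envelope check shows why a cap of 50 would have contained the spike from the scenario. The 1-second average processing time is an assumption for illustration; plug in your own measured value:

```python
# Does a capped SQS consumer keep up with the spike, using only 50
# of the account's 1,000 concurrency units?
inflow_per_min = 2000        # spike rate from the scenario above
max_concurrency = 50         # MaximumConcurrency on the event source mapping
seconds_per_message = 1.0    # ASSUMED average processing time per message

# Each concurrent worker handles (60 / seconds_per_message) messages/min.
throughput_per_min = max_concurrency * (60 / seconds_per_message)
backlog_growth_per_min = inflow_per_min - throughput_per_min

print(throughput_per_min)       # 3000.0 messages/min
print(backlog_growth_per_min)   # -1000.0 => the backlog drains
```

With these numbers the capped consumer still outpaces the inflow, so the queue drains while the remaining 950 units stay available to the API and everything else.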
Design for Isolation
- Use different AWS accounts or organizational units for highly critical and high-volume asynchronous workloads.
- Set alarms for sudden SQS message inflow rates or concurrency spikes using CloudWatch.
- Throttle untrusted or external SQS producers using API Gateway usage plans or event filters.
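For the alarm suggestion above, a simple way to pick a threshold is to work from the normal rate in the scenario (5,000 messages per day). The 10x multiplier here is an assumption; tune it to your tolerance for false alarms:

```python
# Hypothetical threshold for a CloudWatch alarm on per-minute SQS inflow:
# alert when inflow far exceeds the historical normal rate.
normal_per_day = 5000
normal_per_min = normal_per_day / (24 * 60)        # ~3.5 messages/minute

HEADROOM = 10  # ASSUMED multiplier before we call it a spike
alarm_threshold_per_min = round(normal_per_min * HEADROOM)
print(alarm_threshold_per_min)  # 35
```

At 2,000 messages per minute, the spike from the scenario would breach this threshold almost immediately, giving you time to react before the account concurrency pool is exhausted.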
If you find this helpful, don’t forget to follow for more real-world AWS use cases and cloud architecture tips!