We keep hearing that async/await and message queues make our applications faster, more responsive, and more scalable. Teams rewrite synchronous endpoints as async ones and feel better about the decision. And when that's not enough, someone adds a message queue, and suddenly the system feels fixed.
But what if it wasn't broken in the way you thought?
Every system that starts struggling under load goes through the same ritual. Traffic climbs, latency spikes, and someone in the room says: "let's make it async." The team nods. It sounds right. And for a while, it works — or at least, the dashboard stops screaming.
Then the queue backs up. Or the thread pool saturates. Or the database starts timing out. And the team discovers that async didn't fix the problem. It just changed where the problem lived.
This isn't a criticism of async/await or message queues. They're genuinely useful, especially message queues. But there's a persistent confusion in how teams reach for them, as a scaling tool rather than a decoupling tool, and it leads to systems that feel fast right up until they don't.
Imagine a restaurant where the kitchen is slow. Orders pile up, customers wait, tables sit occupied for too long.
Someone has an idea: hire a host to take orders at the door and give customers a buzzer. Now people don't stand in line at the counter — they sit down, browse their phones, and the kitchen calls them when food is ready. The entrance is clear. The restaurant feels more efficient.
But the kitchen is still slow. The same number of meals gets cooked per hour. The queue just moved from the door to the buzzer system, and now it's harder to see.
This is exactly what happens when you reach for async/await or a message queue as a scaling solution.
When you await an HTTP request or a database call in .NET, the current thread is released back to the thread pool while the I/O is in flight. Once it completes, the continuation resumes on a thread pool thread; in ASP.NET Core that may or may not be the thread that started the call.
    public async Task<Order> GetOrderAsync(int id)
    {
        // Thread is released here while the DB query runs
        return await _db.Orders.FindAsync(id);
    }
That's genuinely useful. In ASP.NET Core, a server handling thousands of concurrent I/O-bound requests with a small thread pool instead of spinning up thousands of threads is a better server. It's the right answer to thread pool exhaustion.
But notice what async doesn't do: it doesn't make the database query faster. It doesn't reduce the number of queries. If 1,000 requests come in simultaneously, the database still gets 1,000 queries — they just don't each hold a thread while waiting.
The thread is free. The work is not.
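A minimal sketch of that point, assuming a hypothetical `FakeDb` class that stands in for the real database and does nothing except count the queries it receives. The class names and the 10 ms delay are illustrative assumptions, not part of any real data-access library.

```csharp
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a database: it counts queries and
// simulates I/O latency without blocking a thread.
public sealed class FakeDb
{
    private int _queryCount;

    public int QueryCount => Volatile.Read(ref _queryCount);

    public async Task<int> FindOrderAsync(int id)
    {
        Interlocked.Increment(ref _queryCount);
        await Task.Delay(10); // no thread is held while the "query" runs
        return id;
    }
}

public static class AsyncLoadDemo
{
    // Fires `requests` concurrent async calls and returns how many
    // queries the database actually received.
    public static async Task<int> RunAsync(int requests)
    {
        var db = new FakeDb();
        var calls = Enumerable.Range(0, requests).Select(id => db.FindOrderAsync(id));
        await Task.WhenAll(calls);
        return db.QueryCount;
    }
}
```

Calling `AsyncLoadDemo.RunAsync(1000)` reports 1,000 queries. The awaits let a handful of threads serve every caller, but the database's workload is unchanged.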
The queue is a debt counter, not a buffer
Adding a message queue feels like a more decisive fix. The API returns instantly. Traffic spikes are absorbed. The team ships it and calls it done.
But every message still needs a consumer. The consumer still hits the database, calls the downstream service, runs the business logic. What changed is where the pressure accumulates.
Think of it like a debt. If your producer sends 1,000 messages per second and your consumer handles 200, you're accumulating 800 messages of debt every second. After ten minutes: 480,000 unprocessed messages. The producer side looks fine. The queue is quietly growing.
When will you notice? When redelivery storms start hitting your consumer. When messages expire. When the lag is hours, not seconds.
    xychart-beta
        title "Queue debt over time (producer: 1000/s, consumer: 200/s)"
        x-axis ["0m", "2m", "4m", "6m", "8m", "10m"]
        y-axis "Unprocessed messages" 0 --> 500000
        bar [0, 96000, 192000, 288000, 384000, 480000]
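The debt arithmetic fits in a few lines. This is a sketch that assumes constant producer and consumer rates (real traffic is burstier); the `QueueDebt` class and its method names are hypothetical, chosen only for illustration.

```csharp
using System;

public static class QueueDebt
{
    // Messages left unprocessed after `elapsed`, at constant rates.
    public static long Backlog(long producedPerSec, long consumedPerSec, TimeSpan elapsed)
    {
        // A consumer that keeps up accumulates no debt, so clamp at zero.
        long netPerSec = Math.Max(0, producedPerSec - consumedPerSec);
        return netPerSec * (long)elapsed.TotalSeconds;
    }

    // Time to clear a backlog at the consumer rate, assuming producers
    // have stopped sending entirely.
    public static TimeSpan DrainTime(long backlog, long consumedPerSec) =>
        TimeSpan.FromSeconds((double)backlog / consumedPerSec);
}
```

With the numbers above, `QueueDebt.Backlog(1000, 200, TimeSpan.FromMinutes(10))` gives 480,000, and `QueueDebt.DrainTime(480_000, 200)` gives 40 minutes. Ten minutes of debt takes forty minutes to repay, and that is only if new traffic stops completely.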
In my experience, most throughput and scalability problems aren't thread problems at all. They're resource contention problems, and no amount of concurrency management fixes a contention problem.
await doesn't skip the queue. Making the calling code async moves the problem one layer down. It doesn't remove it; it just makes it harder to see, because you've separated the producer from the consumer.
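To make the contention point concrete, here is a sketch in which a `SemaphoreSlim` stands in for a fixed-size connection pool. The `LimitedResource` and `ContentionDemo` names, the capacity, and the 5 ms delay are all assumptions for illustration; a real pool has its own limits and timeouts.

```csharp
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical resource that admits only `capacity` callers at once,
// the way a connection pool does.
public sealed class LimitedResource
{
    private readonly SemaphoreSlim _slots;
    private int _inFlight;
    private int _peak;

    public LimitedResource(int capacity) => _slots = new SemaphoreSlim(capacity);

    public int PeakInFlight => Volatile.Read(ref _peak);

    public async Task QueryAsync()
    {
        await _slots.WaitAsync(); // async callers still wait for a free slot
        try
        {
            int now = Interlocked.Increment(ref _inFlight);

            // Record the high-water mark with a compare-and-swap loop.
            int seen;
            while (now > (seen = Volatile.Read(ref _peak)) &&
                   Interlocked.CompareExchange(ref _peak, now, seen) != seen)
            {
            }

            await Task.Delay(5); // simulated query time
        }
        finally
        {
            Interlocked.Decrement(ref _inFlight);
            _slots.Release();
        }
    }
}

public static class ContentionDemo
{
    // Starts `callers` concurrent queries and returns the largest number
    // that were ever running against the resource at the same time.
    public static async Task<int> RunAsync(int callers, int capacity)
    {
        var resource = new LimitedResource(capacity);
        await Task.WhenAll(Enumerable.Range(0, callers).Select(_ => resource.QueryAsync()));
        return resource.PeakInFlight;
    }
}
```

However many callers you start, `ContentionDemo.RunAsync(200, 10)` never reports more than 10 in flight. Everyone else is parked on `WaitAsync`, which costs no threads but still costs time.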
None of this is a reason to avoid async/await. It's a reason to reach for it for the right reason: genuine thread pool pressure in high-concurrency I/O-bound workloads.
Message queues have their place too — decoupling services with different uptime requirements, absorbing traffic spikes without dropping requests, building retry logic for unreliable operations. These are real problems worth solving.
Just don't confuse those benefits with throughput. Decoupling and throughput are different properties. A well-decoupled system can still be slow. A fast system doesn't require a queue.
When throughput is the real problem, the levers are different:
    graph LR
        Q[Queue] -->|customer A–M| C1[Consumer 1]
        Q -->|customer N–Z| C2[Consumer 2]
        C1 --> DB1[(DB shard 1)]
        C2 --> DB2[(DB shard 2)]
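The routing in the diagram can be sketched as a small function. This assumes messages carry a customer name and that the split is by first letter, as drawn; `CustomerPartitioner` is a hypothetical helper, not an API of any particular broker.

```csharp
using System;

public static class CustomerPartitioner
{
    // Maps a customer name to a consumer index:
    // first letter A to M -> 0, N to Z -> 1.
    // Names that start with a digit or symbol sort before 'M' and land on 0.
    public static int PartitionFor(string customerName)
    {
        if (string.IsNullOrWhiteSpace(customerName))
            throw new ArgumentException("Customer name is required.", nameof(customerName));

        char first = char.ToUpperInvariant(customerName.TrimStart()[0]);
        return first <= 'M' ? 0 : 1;
    }
}
```

A range split like this is easy to reason about, but it can leave one consumer much busier than the other if customers cluster at one end of the alphabet. Hashing the customer ID is a common alternative when an even spread matters more than readable ranges.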
And measure the right thing: how consumer lag changes over time, not a single queue-depth reading. Queue depth is a snapshot; it tells you how much work is waiting right now. The trend in consumer lag tells you how fast you're falling behind. One is a count. The other is a velocity. You need the velocity to plan capacity.
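Turning samples into a velocity takes two readings and a clock. The sketch below uses hypothetical names (`LagTrend`, `GrowthPerSecond`, `ConsumersNeeded`) and assumes that consumer throughput scales linearly with consumer count, which only holds if the database or downstream service behind the consumers can absorb the extra load.

```csharp
using System;

public static class LagTrend
{
    // Net backlog growth per second between two depth samples.
    // A positive value means the consumers are falling behind.
    public static double GrowthPerSecond(long earlierDepth, long laterDepth, TimeSpan interval) =>
        (laterDepth - earlierDepth) / interval.TotalSeconds;

    // Consumers needed to keep pace with the producer, ASSUMING each
    // added consumer contributes the same throughput as the first.
    public static int ConsumersNeeded(double producedPerSec, double perConsumerPerSec) =>
        (int)Math.Ceiling(producedPerSec / perConsumerPerSec);
}
```

Using the earlier chart, depth readings of 0 and 96,000 taken two minutes apart give a growth of 800 messages per second, and a 1,000/s producer against consumers that each handle 200/s needs five of them just to stop the backlog from growing.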
async/await is a thread management tool. Message queues are a decoupling tool. Neither is a throughput tool on its own.
If your system is slow, profile before you refactor. Measure before you add infrastructure. The database index you're not adding, the N+1 query you haven't noticed, the downstream service with no timeout — these are more likely your problem than the absence of await.
Async shifts the work. Scale requires designing where the work goes.