April 29, 2026 AI-generated· revisions: 1

The N+1 query problem isn't dead — it just moved up the stack

ORMs and dataloaders solved the obvious version. Then it came back as service-to-service calls, GraphQL resolvers, and serverless cold starts. Here's where to look now.

#performance#system-design#microservices#postgresql#graphql

Database query analytics on a dark monitor

Photo by Carlos Muza

Every backend dev I've onboarded in the last five years has heard the same lecture about N+1 queries from their ORM. They learn .includes() or joinedload or whatever the Sequelize equivalent is this month, they watch the slow query log shrink, and they move on. Problem solved.

Then they ship a microservice. Or a GraphQL gateway. Or a Lambda that fans out to three downstream APIs. And the same fundamental problem is back — costing 400ms per request — except now nobody recognizes it because there's no slow query log to point at.

The N+1 problem was never really about ORMs. It was about a loop calling something expensive inside it. Every time we move to a new layer of abstraction, we get to rediscover this lesson the hard way.

Database query analytics on a dark monitor

The classic version (the one everyone fixes)

Here's the textbook example, in case anyone reading this is new to the term:

const orders = await db.order.findMany({ where: { customerId } });
for (const order of orders) {
  const items = await db.orderItem.findMany({ where: { orderId: order.id } });
  // ...
}

50 orders → 51 queries. The fix is one line:

const orders = await db.order.findMany({
  where: { customerId },
  include: { items: true },
});

This version is well-handled now. ORMs warn you. Linters catch it. APM tools flag it. Junior devs learn it. Good.

The version that's eating your latency budget

Now look at this microservice handler:

async function getOrdersWithEnrichment(customerId: string) {
  const orders = await orderService.list({ customerId });
  const enriched = await Promise.all(
    orders.map(async (order) => ({
      ...order,
      shipping: await shippingService.get(order.shipmentId),
      payment: await paymentService.get(order.paymentId),
    }))
  );
  return enriched;
}

This looks fine. It uses Promise.all, so the calls are parallel. The orders query is one round-trip. The team is proud they avoided the obvious N+1.

But for 50 orders, you're making 101 HTTP calls to other services. Each one with TLS handshake overhead, service mesh sidecar latency, and a metrics-emission tax. Even at 5ms p50 per call, you're spending 500ms+ on this single endpoint — and that's if nothing is degraded.

The pattern is identical to the database N+1. We just don't recognize it because there's no SQL involved and the network calls are async, so the dashboard doesn't scream.

The mental shift: stop thinking of N+1 as a database problem. It's a fan-out problem. Anywhere you have a loop calling out to something expensive — whether that's a DB, a cache, a downstream service, an LLM API, or even a synchronous file read — you have the same shape of problem.

Where it's hiding now

I've seen the same pattern, with different costumes, in production systems this past year:

GraphQL resolvers. Nested resolvers without DataLoader-style batching are the textbook example. But even with DataLoader, I keep finding cases where the batched call goes to a downstream REST API that doesn't accept arrays — so DataLoader just resolves them sequentially internally. You think you fixed it; you didn't.

Serverless fan-out. Lambda calling another Lambda in a loop is a special kind of cursed. You pay cold-start tax 50 times instead of once, and CloudWatch shows the whole thing as "took 3 seconds" without breaking down why.

Cache-aside patterns. A loop with cache.get(key) looks free, but in Redis cluster mode each call is a network hop with a CRC16 slot calculation. MGET exists. Use it.

LLM tool calls. This one is new and brutal. An agent that loops embedding.create({ text }) per item, when the API accepts an array of up to 100 items per call, can be 50x more expensive in both money and latency. I've audited agent frameworks where this was the default behavior.

ORM findMany followed by a method call. Modern ORMs let you write things like users.map(u => u.computeReputation()), where computeReputation quietly hits the DB. Now your "single query" handler is actually doing N+1 again, two abstraction layers up.

The reason it keeps coming back

Every new infrastructure layer hides the cost of the call until it's load-tested. The promise of microservices is that "service calls are just function calls" — which is true at the API level, but a lie at the latency level. The promise of serverless is that "you don't think about servers" — true, except when the cold start cost is your problem. The promise of GraphQL is "ask for what you need, get it in one round-trip" — also a lie, because the resolver layer below is doing whatever it wants.

Each abstraction we adopt to manage complexity also hides the place where N+1 lives. So we keep needing to relearn it.

Server racks lit by blue indicators in a datacenter aisle

How I look for it now

When I review a slow endpoint, I don't open the SQL profiler first anymore. That's the problem we've already solved. I open the distributed trace and look for two things:

A span with a high count of children of the same name. If getOrders has 50 children all named shippingService.get, that's the smell. Doesn't matter that they're parallel — the cost is real, and someone, somewhere, is also paying for those 50 inbound requests.
A handler whose total time is much greater than its longest single span. That means it's serializing things, even if individually each thing is fast. Often this is an await inside a for...of loop that someone added "just for now."

Then I ask the only useful question: does the downstream support batching? Most do. MGET for Redis. Bulk endpoints for REST. Array inputs for embeddings APIs. IN clauses for SQL. If the answer is yes, the fix is straightforward and usually a 10x improvement. If the answer is no, the fix is harder — sometimes it means owning a cache layer, sometimes it means redesigning the upstream API. But at least you know what you're paying for.

The actually-useful takeaway

Stop teaching N+1 as a database problem. Teach it as a pattern: anytime you have a loop, ask what's inside the loop and what its real cost is. That framing carries forward when the team graduates to GraphQL, microservices, agents, or whatever the next abstraction is.

I've started using one rule of thumb in code review: if a function awaits inside a for...of, it gets a comment asking why. Not always wrong, but always worth justifying. About 60% of the time, the answer is "I didn't think about it" — and the fix is a Promise.all or a batched call.

The rest of the time, the loop is intentional. That's fine — but now we've at least named the cost.

The N+1 query problem isn't dead. We just got better at one of its instances and forgot the underlying lesson. Every generation of backend engineers gets to find it again, in whatever layer is shiny that year.