Your Framework Doesn't Matter
or: How I Learned to Stop Worrying and Love the Framework
Last week I benchmarked four web frameworks and found that BlackSheep is 2x faster than FastAPI. A Rust-based server and JSON serializer pushed Python within striking distance of Go. Impressive numbers.
But I kept thinking — does any of this matter? Those benchmarks measured localhost throughput with no database and no network. That's not what users experience. A real API request crosses the internet, hits a framework, queries a database through an ORM, serializes the result, and travels back. How much of that time is actually the framework?
So I built a real app, deployed it, and measured every phase.
The App
A book catalog API. FastAPI + SQLAlchemy 2.0 (async) + asyncpg + Uvicorn — the standard Python stack that a developer following the FastAPI docs would use. No exotic dependencies, no optimization tricks.
Three tables: Publisher -> Author -> Book. Seeded with 4,215 real books from the Open Library API — Agatha Christie, Dostoevsky, Penguin Books, real data with real-world cardinality.
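For orientation, here is a minimal sketch of what that schema might look like in SQLAlchemy 2.0's typed declarative style. The column and relationship names are my guesses for illustration, not the actual models from the repository:

```python
# Hypothetical sketch of the three-table schema; names are illustrative.
from sqlalchemy import ForeignKey
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship


class Base(DeclarativeBase):
    pass


class Publisher(Base):
    __tablename__ = "publishers"
    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str]
    authors: Mapped[list["Author"]] = relationship(back_populates="publisher")


class Author(Base):
    __tablename__ = "authors"
    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str]
    publisher_id: Mapped[int] = mapped_column(ForeignKey("publishers.id"))
    publisher: Mapped["Publisher"] = relationship(back_populates="authors")
    books: Mapped[list["Book"]] = relationship(back_populates="author")


class Book(Base):
    __tablename__ = "books"
    id: Mapped[int] = mapped_column(primary_key=True)
    title: Mapped[str]
    author_id: Mapped[int] = mapped_column(ForeignKey("authors.id"))
    author: Mapped["Author"] = relationship(back_populates="books")
```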
Deployed to Fly.io on a shared-cpu-1x machine with 512MB RAM and Postgres 17, both in Amsterdam. The cheapest setup you'd use for a side project.
Four endpoints:
- `GET /api/health` — returns `{"status": "ok"}`. No database, no ORM, no serialization. Pure framework overhead.
- `GET /api/books/{id}` — single book with author details. 4 SQL queries via `selectinload`.
- `GET /api/books?page=1&per_page=100` — 100 books with full details. 5 queries, `selectinload`.
- `GET /api/books/n-plus-one?page=1&per_page=100` — same data as #3, but with the classic N+1 bug. 302 queries (2 + 100 x 3 individual SELECTs).
Endpoint #4 is the "what not to do" scenario. Same response, same data, but instead of letting SQLAlchemy batch the loads, each book triggers separate queries for its author, publisher, and sibling books.
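To make that concrete, here is a sketch of the two loading strategies, reusing the hypothetical models above. Function names and pagination details are illustrative; the real route handlers live in the repository:

```python
# Sketch only: Book, Author, Publisher as in the hypothetical models above.
from sqlalchemy import select
from sqlalchemy.orm import selectinload


# Optimized: selectinload batches related rows into a few extra SELECT ... IN queries.
async def list_books_optimized(session, page: int, per_page: int):
    stmt = (
        select(Book)
        .options(
            selectinload(Book.author).selectinload(Author.publisher),
            selectinload(Book.author).selectinload(Author.books),
        )
        .limit(per_page)
        .offset((page - 1) * per_page)
    )
    return (await session.scalars(stmt)).all()


# N+1: the loop issues three separate SELECTs per book, one round-trip each.
async def list_books_n_plus_one(session, page: int, per_page: int):
    books = (
        await session.scalars(
            select(Book).limit(per_page).offset((page - 1) * per_page)
        )
    ).all()
    results = []
    for book in books:
        author = await session.scalar(select(Author).where(Author.id == book.author_id))
        publisher = await session.scalar(
            select(Publisher).where(Publisher.id == author.publisher_id)
        )
        siblings = (
            await session.scalars(select(Book).where(Book.author_id == author.id))
        ).all()
        results.append((book, author, publisher, siblings))
    return results
```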
How I Measured
Every response carries timing headers measured with time.perf_counter(). The database layer uses SQLAlchemy's before_cursor_execute / after_cursor_execute events to split ORM overhead from raw driver time. A contextvars.ContextVar stores per-request timings so nothing leaks between concurrent requests.
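A minimal sketch of that plumbing, assuming a FastAPI HTTP middleware and an async engine. The header names, DSN, and variable names are illustrative, not necessarily what the repository uses:

```python
# Illustrative sketch of the per-request timing plumbing.
import time
from contextvars import ContextVar

from fastapi import FastAPI, Request
from sqlalchemy import event
from sqlalchemy.ext.asyncio import create_async_engine

engine = create_async_engine("postgresql+asyncpg://app:app@localhost/books")  # hypothetical DSN
app = FastAPI()

# Per-request accumulator; each concurrent request sees its own dict.
db_timings: ContextVar[dict | None] = ContextVar("db_timings", default=None)


@event.listens_for(engine.sync_engine, "before_cursor_execute")
def _before_cursor(conn, cursor, statement, parameters, context, executemany):
    context._query_start = time.perf_counter()


@event.listens_for(engine.sync_engine, "after_cursor_execute")
def _after_cursor(conn, cursor, statement, parameters, context, executemany):
    timings = db_timings.get()
    if timings is not None:  # ignore queries outside a request (e.g. seeding)
        timings["driver_s"] += time.perf_counter() - context._query_start
        timings["queries"] += 1


@app.middleware("http")
async def timing_middleware(request: Request, call_next):
    db_timings.set({"driver_s": 0.0, "queries": 0})
    start = time.perf_counter()
    response = await call_next(request)
    timings = db_timings.get() or {}
    response.headers["X-Server-Time"] = f"{time.perf_counter() - start:.6f}"
    response.headers["X-DB-Driver-Time"] = f"{timings.get('driver_s', 0.0):.6f}"
    response.headers["X-DB-Queries"] = str(timings.get("queries", 0))
    return response
```

In the full version, the ORM's share is whatever the session call took minus this driver time; the sketch only records the driver side and the query count.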
The client measures total round-trip time. Network = client total - server total.
I ran 200 requests per endpoint from Turkey to Amsterdam (~57ms baseline RTT), with 30 warmup requests discarded. All numbers below are medians.
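The client side is a plain loop. A sketch of it, assuming the `X-Server-Time` header from the middleware sketch above and `httpx` as the HTTP client:

```python
# Illustrative client-side measurement loop.
import statistics
import time

import httpx

WARMUP = 30
SAMPLES = 200


def measure(url: str) -> dict:
    totals_ms, network_ms = [], []
    with httpx.Client(timeout=10.0) as client:
        for i in range(WARMUP + SAMPLES):
            start = time.perf_counter()
            response = client.get(url)
            total = (time.perf_counter() - start) * 1000
            server = float(response.headers["X-Server-Time"]) * 1000
            if i < WARMUP:
                continue  # discard warmup requests
            totals_ms.append(total)
            network_ms.append(total - server)
    return {
        "total_ms": statistics.median(totals_ms),
        "network_ms": statistics.median(network_ms),
    }
```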
Where Does Server Time Go?
Let's start with what happens inside the server — no network, just the work Python does.
Hover over any segment for percentages. Bars don't sum to exactly 100% — a small residual (1-3%) falls between the timed sections.
The health check tells the story immediately. When there's no database, the framework is the server — 82% of 0.3ms. But the moment you add real work, it disappears. For the optimized 100-book query, the DB driver and ORM together account for 82% of server time. Serialization is 10%. The framework — FastAPI's routing, middleware, dependency injection — is 2-3%. For the single book endpoint, it's 4%.
The N+1 scenario is brutal. Same data, same response, but 302 queries instead of 5. Server time goes from 30ms to 492ms — a 16x increase — because each of those 302 queries pays a round-trip to Postgres and an ORM hydration cost.
But this is still only the server's perspective. What does the user actually experience?
Now Zoom Out
Same four endpoints, but now we include what happens before and after the server: DNS, TCP, TLS, request travel, response travel — all lumped together as "Network."
Pick a distance to see how it changes the picture:
Hover over any segment for percentages. Server timings are constant — only network changes.
There it is. The health check — where the framework has nothing to do except route and respond — is 99% network. The server finishes in 0.3ms. The user waits 70ms.
For a single book lookup, 83% of what the user waits for is the network. The entire server — framework, ORM, database, serialization, JSON encoding — is the remaining 17%. The framework specifically is 0.7%.
For 100 books with proper queries, network is 69%. The server does more work (30ms vs 12ms), but the user still spends most of their time waiting for packets to cross the internet.
These numbers default to my setup — I live in Ankara, Turkey, and my closest Fly.io region is Amsterdam. Try the presets above to see how distance changes the picture. Even in the best case — same building, 5ms — network is still 30% of a single book lookup. And most SaaS products aren't running multi-region deployments with edge nodes. They have one server in one region.
The N+1 scenario flips everything. Network drops to 13% — not because the network got faster, but because the server got so slow (492ms) that it dwarfs the network time. This is the only scenario where server-side code meaningfully impacts user experience. And the cause isn't the framework — it's 302 queries instead of 5.
Framework Overhead Across All Scenarios
| Scenario | Total | Framework | Framework % |
|---|---|---|---|
| Health check (no DB) | 69.6ms | 0.2ms | 0.4% |
| Single book | 68.8ms | 0.5ms | 0.7% |
| 100 books (optimized) | 97.0ms | 0.7ms | 0.8% |
| 100 books (N+1) | 613.2ms | 1.3ms | 0.2% |
The health check is the best case for the framework — no database, no ORM, no serialization. The server does almost nothing. And still, framework overhead is 0.2ms out of a 70ms request. FastAPI's routing, middleware, dependency injection, and ASGI handling cost 0.2-1.3ms across all scenarios. That's the thing benchmarks compare when they say "FastAPI vs BlackSheep" or "Python vs Go." The thing that accounts for less than 1% of what users experience.
In my previous benchmark, BlackSheep was 2x faster than FastAPI. That 2x difference applies to 0.7% of the total response time. Switching frameworks would save roughly 0.25ms on a 69ms request.
Putting Traffic in Perspective
Let's say your API gets 1 million requests per day. That sounds like a lot. Averaged over a day, it's about 12 requests per second.
| Daily Requests | Avg req/s | Peak req/s (3x avg) |
|---|---|---|
| 100,000 | 1.2 | 3.5 |
| 1,000,000 | 11.6 | 35 |
| 10,000,000 | 115.7 | 347 |
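If you want to sanity-check the table or plug in your own traffic, the conversion is one line of arithmetic (the 3x peak factor is just the rough multiplier used above):

```python
# Daily request volume to average and assumed 3x-average peak throughput.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

for daily in (100_000, 1_000_000, 10_000_000):
    avg_rps = daily / SECONDS_PER_DAY
    print(f"{daily:>11,} req/day -> {avg_rps:7.1f} avg req/s, {avg_rps * 3:7.1f} peak req/s")
```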
Levels.fyi — a site with 1-2 million monthly uniques and over $1M ARR — runs one of its most trafficked services on a single Node.js instance serving 60K requests per hour. That's 17 req/s. FastAPI handles 46,000 req/s on a single worker in my benchmarks. You have roughly 2,700x headroom.
In 2016, Stack Overflow served 209 million HTTP requests per day — about 2,400 req/s average — on 9 web servers. Nick Craver said they'd unintentionally tested running on a single server, and it worked.
Framework throughput differences don't matter when your actual traffic is three orders of magnitude below capacity.
What I Didn't Measure
This is a sequential measurement from a single client — no concurrent load. Under concurrency, connection pooling, async scheduling, and GIL contention could change the server-side breakdown. The "Framework" bucket lumps together Uvicorn, Starlette, and FastAPI — I didn't separate them. "Network" lumps DNS, TLS, TCP, and raw packet travel. Response sizes are pre-compression (the real responses would be smaller over gzip).
At scale, a faster framework means fewer servers — that's real cost savings. But "at scale" means hundreds of thousands of requests per second, not millions per day. And long before you get there, you'll have optimized your queries, added caching, moved to handwritten SQL, and maybe even forked your runtime — Facebook built their own Python before they worried about framework overhead.
All measurements: 200 samples each, medians, from Turkey to Amsterdam. The raw data is in the repository.
What I Learned
Deploy closer to your users. For well-written queries, 69-83% of response time is packets crossing the internet. No framework optimization changes this. If your server is in Amsterdam and your users are in Ankara, they're waiting 57ms before your code even runs. Move the server, or put a cache at the edge.
Fix your queries, not your framework. The N+1 bug turned a 97ms response into a 613ms one — 6.3x slower — and framework overhead was still only 0.2%. Switching from FastAPI to BlackSheep would save 0.25ms. Fixing the N+1 bug saves 516ms. Profile your queries. Add selectinload. Use EXPLAIN ANALYZE. That's where the seconds are.
Pick your framework for everything except speed. Framework benchmarks compare the one component that doesn't matter (0.2-0.8% of total time) under conditions that don't exist (localhost, no database, no network). Pick for developer experience, documentation, ecosystem, and hiring. The framework that lets you ship faster is the fast framework.
If you want to see what actually makes a website fast in practice, Wes Bos has a great breakdown. Hint: it's not the framework.
Benchmarking is hard. I'm sure I got something wrong, missed an important variable, or made an assumption that doesn't hold. All the code, measurement scripts, and raw timing data are in the repository — please try to break it. If you find a flaw in the methodology, a timing error, or a scenario that would change the conclusions, I genuinely want to hear about it.