One average hides the run that breaks your throughput
A raw HTTP check on one product URL sits around 0.46–0.62 s from your machine. That number is useful only as a baseline. A browser route through residential exits runs p50 around 1.7 s on clean pages; p95 can land above 8 s on a bad stretch. Some blocked pages come back fast, so the average stays presentable while the run falls behind.
Latency changes more than one request's speed. It determines how many workers you need, how long sockets stay open, how many retries overlap, how much memory is tied up in browser contexts, and how much bandwidth gets spent before an accepted result lands. A fast 403 or a fast challenge page makes the scraper look quick while it's failing quietly.
curl and browser routes are different measurements
curl fetches one document. A browser route through the same proxy may touch the main document, scripts, consent code, API calls, preflights, and third-party requests that have nothing to do with the data. Comparing curl timing directly to Playwright timing produces noise, not a benchmark.
For raw HTTP routes, log curl's time_connect, time_appconnect, time_starttransfer, and time_total. For browser routes, log main document timing, total request count, timeout count, and accepted rows — same target, same endpoint group, same headers and blocking rules. Mixing the two shapes makes route comparison impossible.
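A minimal sketch of logging those four curl timings from Python, assuming curl is on the PATH and a Unix-style `/dev/null`. The helper names (`build_command`, `parse_timings`, `measure`) are illustrative, and recording `http_code` alongside the timings is an addition here so that a fast 403 is not mistaken for a fast success.

```python
import subprocess

# curl --write-out variables named in the text; curl reports them in seconds.
FIELDS = ("time_connect", "time_appconnect", "time_starttransfer", "time_total")
WRITE_OUT = " ".join(f"{name}=%{{{name}}}" for name in FIELDS) + " http_code=%{http_code}"


def build_command(url, proxy=None):
    """Build a curl command that discards the body and prints only the timings."""
    cmd = ["curl", "--silent", "--output", "/dev/null", "--write-out", WRITE_OUT]
    if proxy:
        cmd += ["--proxy", proxy]
    return cmd + [url]


def parse_timings(line):
    """Turn 'time_connect=0.041 ... http_code=200' into a dict of numbers."""
    result = {}
    for pair in line.split():
        key, _, value = pair.partition("=")
        result[key] = int(value) if key == "http_code" else float(value)
    return result


def measure(url, proxy=None):
    """Run one curl request and return its timings (performs a network call)."""
    done = subprocess.run(
        build_command(url, proxy), capture_output=True, text=True, timeout=60
    )
    return parse_timings(done.stdout)
```

Calling `measure(url, proxy=...)` once per route with the same headers gives rows that can be compared across raw HTTP routes; browser routes still need their own log shape.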
p95 sets the worker count, not p50
The throughput math is straightforward: workers divided by request time. Ten workers at 500 ms can approach 20 requests per second before other limits. Ten workers at 2 s approach 5. Real jobs add parsing, queue overhead, browser startup, and target pacing, so the number is never exact — but it is enough to see why latency becomes an infrastructure decision.
When request time doubles, concurrency must double to hold the same scrape rate. p50 tells you the typical case. p95 tells you how many worker slots the tail requests are consuming at any given moment. A route with acceptable p50 and bad p95 behaves like a much smaller pool in practice.
| Workers | Request time | Approx. max throughput |
|---|---|---|
| 10 | 500 ms | ~20 req/s |
| 10 | 2 s | ~5 req/s |
| 10 | 500 ms p50 / 8 s p95 | tail slots consumed; effective ~3–6 req/s |
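
The arithmetic behind the table can be sketched in a few lines. This assumes every worker is always busy and ignores parsing, queue overhead, and pacing, so it gives an upper bound rather than a prediction. The function names and the latency list are hypothetical, chosen only to show how a long tail drags the mean request time.

```python
import math


def max_throughput(workers, request_time_s):
    """Upper bound on requests per second: workers divided by request time."""
    return workers / request_time_s


def workers_needed(target_rps, request_time_s):
    """Concurrency required to hold a target rate at a given request time."""
    return math.ceil(target_rps * request_time_s)


def throughput_from_samples(workers, latencies_s):
    """Same bound, using the mean of observed latencies instead of one number."""
    mean = sum(latencies_s) / len(latencies_s)
    return workers / mean


print(max_throughput(10, 0.5))   # 20.0
print(max_throughput(10, 2.0))   # 5.0
print(workers_needed(20, 0.5))   # 10
print(workers_needed(20, 1.0))   # 20: request time doubled, so did the workers

# Made-up samples with a median of 0.5 s and a long tail (not measured data).
tail_heavy = [0.4, 0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
              0.6, 0.8, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0, 8.0, 12.0]
print(throughput_from_samples(10, tail_heavy))
```

The last call is the useful one in practice: feed it real latencies per route and the gap between the p50-based estimate and the mean-based one shows how many worker slots the tail is holding.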
Adding workers early makes bad p95 worse
On a route with high tail latency, adding workers increases retry overlap and exit pressure without producing more accepted rows. Retries pile up behind each other, the route looks busier on the provider meter, and the accepted-row count barely moves. Fix p95 first, by switching routes or changing session mode, and only then scale workers.
The signal to watch is accepted rows per unit time, not response count. A route reporting 200 responses per minute with 60% rejections yields 80 accepted rows per minute; a route reporting 80 responses per minute with 5% rejections yields 76. The useful output is nearly identical, but the first route spends 2.5 times the requests, and the bandwidth behind them, to get there.
Datacenter and residential fail in different shapes
Datacenter exits tend toward lower, steadier latency if the target allows that traffic. A nearby datacenter route may run cleanly until a small batch triggers a 403 wall. Residential exits run on consumer networks: inventory changes, and an exit that was fast before lunch can be slow after. For either type, a single average hides that variance.
The route split in a given job matters more than the provider headline. A residential route targeted in-country may be slower than a datacenter route but produce cleaner pages. A second residential route from a farther region may have acceptable p50 and lose the run on p95. That is not a problem with the proxy type; it is information about that route on that target.
The provider meter counts more than your app does
App-side payload counters measure saved HTML and accepted rows. The provider meter counts failed requests, redirects, retries, browser assets, and tunnel overhead. The gap is real cost. On Proxynade, the dashboard network log shows host, outcome, latency, and byte totals per request; usage logs export as CSV. Comparing those numbers against your app's accepted-row count shows exactly what the failures are costing in bandwidth.
Volume Residential is $0.89/GB and Premium Residential is $5.00/GB, billed per transferred byte. A route with high retry rates and bad p95 can multiply your effective cost per accepted row by 3–5x over a clean route on the same target.
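A rough sketch of the cost-per-accepted-row calculation, using the two per-GB prices quoted above. The byte and row counts are hypothetical, and treating a gigabyte as 10^9 bytes is an assumption; check how the provider meter actually defines it.

```python
BYTES_PER_GB = 1_000_000_000  # assumption: decimal gigabytes

VOLUME_RESIDENTIAL_PER_GB = 0.89   # prices as quoted in the text
PREMIUM_RESIDENTIAL_PER_GB = 5.00


def cost_per_accepted_row(provider_bytes, price_per_gb, accepted_rows):
    """Dollars spent per accepted row, based on provider-metered bytes."""
    if accepted_rows == 0:
        return float("inf")  # bytes were spent and nothing usable came back
    return provider_bytes / BYTES_PER_GB * price_per_gb / accepted_rows


# Hypothetical runs: same target, same 10,000 accepted rows.
clean = cost_per_accepted_row(2 * BYTES_PER_GB, VOLUME_RESIDENTIAL_PER_GB, 10_000)
retry_heavy = cost_per_accepted_row(8 * BYTES_PER_GB, VOLUME_RESIDENTIAL_PER_GB, 10_000)

print(f"clean route:       ${clean:.6f} per row")
print(f"retry-heavy route: ${retry_heavy:.6f} per row")
print(f"multiplier:        {retry_heavy / clean:.1f}x")
```

The input that matters is `provider_bytes` from the provider export, not the app-side payload counter, since only the provider figure includes the failed and retried traffic.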
The measurement checklist that stays next to the job config
Run one host, one endpoint group, one datacenter route, and one residential route with the same headers, session mode, and blocking rules. Log p50 and p95 latency, retry count, timeout count, accepted rows, and provider bytes. If accepted rows stop rising while p95 climbs, the route is bad for that target.
The note stays next to the job config because three weeks later the worker count that was set deliberately looks like a mistake.
- Log p50 and p95, not just averages.
- Separate proxy errors (407, tunnel failures) from target errors (403, challenge pages).
- Track accepted rows per gigabyte transferred, not request count.
- Set worker count from p95 observations, not p50.
- Export provider usage CSV before adjusting concurrency.
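
The checklist can be turned into a small per-route summary. This is a sketch under stated assumptions: the record fields (`latency_s`, `status`, `tunnel_failed`, `challenge`, `timed_out`, `accepted`) are a hypothetical log shape, the percentile uses the simple nearest-rank method, and a gigabyte is taken as 10^9 bytes.

```python
import math
from collections import Counter


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def classify(record):
    """Separate proxy-side failures from target-side failures."""
    if record.get("tunnel_failed") or record.get("status") == 407:
        return "proxy_error"
    if record.get("status") == 403 or record.get("challenge"):
        return "target_error"
    if record.get("timed_out"):
        return "timeout"
    return "accepted" if record.get("accepted") else "other"


def summarize(records, provider_bytes):
    """p50/p95 latency, outcome counts, and accepted rows per GB for one route."""
    latencies = [r["latency_s"] for r in records]
    outcomes = Counter(classify(r) for r in records)
    gigabytes = provider_bytes / 1_000_000_000  # assumption: decimal GB
    return {
        "p50_s": percentile(latencies, 50),
        "p95_s": percentile(latencies, 95),
        "outcomes": dict(outcomes),
        "accepted_rows_per_gb": outcomes["accepted"] / gigabytes if gigabytes else 0.0,
    }
```

Running `summarize` once per route, with `provider_bytes` taken from the provider usage export, gives one comparable row per route before any change to concurrency.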
Proxy latency FAQ
Why does proxy latency affect scrape speed more than CPU? Workers sit idle while waiting on responses. When request time doubles, you need twice the concurrency just to hold the same throughput. CPU is rarely the bottleneck; open sockets and timeout slots are.
What latency numbers should I measure? Log p50 and p95 by route, endpoint group, and status code. p50 is the median and tells you the typical case; p95 tells you how much concurrency the tail consumes. Averages flatten the bad stretches.
Why does the app byte counter show less than the provider meter? App counters usually measure saved HTML and accepted rows. Provider meters count failed requests, redirects, retries, browser assets, and tunnel overhead. The gap is real cost.
When does adding more workers hurt throughput? When p95 is already high, adding workers increases retry overlap and exit pressure without producing more accepted rows. Fix p95 first, then scale workers.
How do datacenter and residential latency profiles differ? Datacenter exits tend toward lower, steadier latency but hit harder blocks on some targets. Residential exits have more variance — an exit that was fast before lunch can be slow after. Neither is universally better; measure against your target.
What is the right metric for route quality? Accepted rows per gigabyte transferred, beside p95 latency and retry count. Response count and p50 alone hide routes that look acceptable but waste bandwidth on failures.