Scraping notes

How to choose proxies for scraping in 2026

Let the target fail the cheap option first. Logs that track provider-metered bytes tell you when to step up.

Datacenter, residential, and static ISP · Small pilot first · Updated 2026-06-12

Start cheap, then believe the bodies

Pick the route after the target shows its hand, not before. Start with a small pilot: normal headers, sane pacing, logs that capture body size, status, retry count, route type, and provider-metered bytes. If the HTML is stable and the failures are boring, there is no prize for moving to residential.
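A minimal sketch of what one pilot log row could look like, assuming Python. The class name PilotRow and its field names are illustrative and do not come from any provider SDK; the point is only that every request writes the same key=value fields so rows can be compared later.

```python
from dataclasses import dataclass, asdict


@dataclass
class PilotRow:
    """One request in the pilot. All names here are hypothetical."""
    site: str
    route: str            # e.g. "datacenter", "vol-res"
    status: int           # final HTTP status the app saw
    app_body_bytes: int   # size of the body the app kept
    provider_bytes: int   # bytes the provider metered for this request
    retries: int
    note: str = ""

    def to_line(self) -> str:
        # key=value pairs in field order, easy to grep and easy to parse back
        return " ".join(f"{k}={v}" for k, v in asdict(self).items())


row = PilotRow("retail-es", "vol-res", 200, 184_000, 913_000, 2, "empty_grid")
print(row.to_line())
```

Keeping app bytes and provider bytes in the same row is the design choice that matters: the gap between them is visible per request instead of only in a monthly total.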

Datacenter is the first thing to try on easy public pages. News archives, public directories, simple catalogs, and feeds often accept hosting ASNs without objection. It fails when the target hates datacenter ranges, when body sizes go short, or when a clean 200 is really an empty shell.
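The "clean 200 that is really an empty shell" case can be caught with a check along these lines. This is a sketch under assumptions: the baseline size, the 0.5 ratio, and the marker strings are placeholders you would set per page type after looking at known-good bodies.

```python
def classify_response(
    status: int,
    body: str,
    baseline_bytes: int,
    required_markers: tuple[str, ...],
    min_ratio: float = 0.5,  # assumed threshold, tune per page type
) -> str:
    """Label a response as ok, a hard failure, or a soft failure."""
    if status != 200:
        return "hard_fail"
    # A 200 far smaller than a known-good body is suspicious on its own.
    if len(body.encode("utf-8")) < baseline_bytes * min_ratio:
        return "short_body"
    # Full-size bodies can still be shells: check for the blocks you need.
    if any(marker not in body for marker in required_markers):
        return "missing_marker"
    return "ok"


print(classify_response(200, "<html></html>", 40_000, ("price",)))  # short_body
```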

Volume Residential, at $0.89/GB billed per transferred byte, is the first serious residential test for normal commercial pages: product listings, reviews, real estate, jobs, travel results, and geo-shaped pages. It is not magic. Test the country and session mode you will actually use before scaling.

Premium Residential at $5.00/GB billed per transferred byte is for the ugly middle, not for everything. The failure mode it fixes usually looks soft: short bodies, missing price blocks, blank grids, challenge pages, weird redirects, or a retry rate that appears tolerable in your app but awful in the provider dashboard. Use it on the page slice that proves it needs better IPs. Running a whole scrape on Premium because one page type is hostile is how bills get large.
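The arithmetic behind "Premium on the slice, not the run" is short enough to sketch. The two rates are the ones quoted above; the 100 GB run and 15 GB hard slice are made-up numbers for illustration only.

```python
VOLUME_RATE = 0.89   # $/GB, Volume Residential
PREMIUM_RATE = 5.00  # $/GB, Premium Residential


def run_cost(total_gb: float, hard_slice_gb: float = 0.0,
             premium_everywhere: bool = False) -> float:
    """Cost of a run in dollars, given provider-metered GB."""
    if premium_everywhere:
        return total_gb * PREMIUM_RATE
    easy_gb = total_gb - hard_slice_gb
    return easy_gb * VOLUME_RATE + hard_slice_gb * PREMIUM_RATE


# Hypothetical run: 100 GB metered, of which 15 GB is the hostile page type.
print(f"{run_cost(100, premium_everywhere=True):.2f}")  # 500.00
print(f"{run_cost(100, hard_slice_gb=15):.2f}")         # 150.65
```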

Static ISP proxies are a different tool entirely. They are for accounts, carts, dashboards, long sessions, and workflows where the same IP has to persist. That is the right framing for Static ISP: it buys identity. It is not the right tool for pulling millions of anonymous pages.

Provider-metered bytes versus app bytes

The row that matters from a pilot is boring and a little ugly:

  site=retail-es route=vol-res session=sticky-20m status=200 app_body=184kb provider_delta=913kb retries=2 note=first_body_challenge_second_empty_grid

That row tells you more than a success-rate chart.

The app saved one useful body. The provider counted retries, redirects, CONNECT overhead, TLS setup, challenge pages, scripts, images, and browser junk. Apps lie by omission — they count the thing they wanted, not the junk they paid to reach.
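To put a number on that gap, the example row can be parsed and the two byte fields divided. A small sketch, assuming the key=value layout shown in the row and sizes written with a "kb" suffix; the helper names are hypothetical.

```python
LINE = (
    "site=retail-es route=vol-res session=sticky-20m status=200 "
    "app_body=184kb provider_delta=913kb retries=2 "
    "note=first_body_challenge_second_empty_grid"
)


def parse_row(line: str) -> dict[str, str]:
    # Split on whitespace, then on the first "=" of each pair.
    return dict(pair.split("=", 1) for pair in line.split())


def kb(value: str) -> int:
    # "184kb" -> 184
    return int(value.lower().removesuffix("kb"))


row = parse_row(LINE)
overhead = kb(row["provider_delta"]) / kb(row["app_body"])
print(f"{overhead:.2f}x billed per useful byte")  # 4.96x billed per useful byte
```

In this row roughly a fifth of the billed bytes ended up as a body the app kept.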

The Proxynade dashboard network log shows host, outcome, latency, and byte totals per request. Export it as CSV from usage logs and compare the provider column against your app's parsed-body column. A large gap points at either excessive retries or unrestricted asset loading.
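A sketch of that comparison, assuming the export has one row per request. The column names "host" and "bytes" are placeholders, not the documented export schema, so check the real CSV header before using anything like this. The app-side totals are whatever your scraper logged per host.

```python
import csv
import io
from collections import defaultdict


def provider_bytes_by_host(csv_text: str, host_col: str = "host",
                           bytes_col: str = "bytes") -> dict[str, int]:
    """Sum provider-metered bytes per host from an exported usage log."""
    totals: dict[str, int] = defaultdict(int)
    for record in csv.DictReader(io.StringIO(csv_text)):
        totals[record[host_col]] += int(record[bytes_col])
    return dict(totals)


def waste_ratio(provider: dict[str, int], app: dict[str, int]) -> dict[str, float]:
    """Provider bytes divided by app bytes, for hosts the app actually parsed."""
    return {h: provider[h] / app[h] for h in provider if app.get(h)}


# Hypothetical export and app totals, for illustration only.
sample = "host,bytes\nshop.example,500\nshop.example,300\ncdn.example,200\n"
totals = provider_bytes_by_host(sample)
print(waste_ratio(totals, {"shop.example": 200}))  # {'shop.example': 4.0}
```

Hosts that show up in the provider totals but never in the app totals (cdn.example here) are usually the asset and tracker traffic worth blocking.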

What a real escalation looks like

One retail scrape looked fine on datacenter until product pages came back. Category pages loaded. Search pages loaded. Product pages returned 200, but a third of the bodies were too short and missing price blocks. Volume Residential fixed most pages after blocking images, fonts, and tracking scripts. Premium Residential only made sense for stock and delivery pages that changed by postcode. Static ISP did not belong in that run at all.

That pattern holds across most targets: datacenter for cheap public collection, Volume Residential for the first serious residential test, Premium Residential for soft-blocked slices, and Static ISP for accounts or long sessions. Most bad proxy bills come from mixing those jobs and calling it being thorough.
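The escalation pattern can be written down as a small rule, which helps keep it per page slice rather than per run. This is only a sketch: the route labels and the 10% soft-failure threshold are assumptions, not figures from the pilot.

```python
LADDER = ["datacenter", "volume-residential", "premium-residential"]


def choose_route(current: str, soft_fail_rate: float,
                 needs_persistent_ip: bool = False,
                 max_soft_fail: float = 0.10) -> str:
    """Pick the route for one page slice based on its pilot results."""
    if needs_persistent_ip:
        # Accounts, carts, long sessions: a different tool, not a higher rung.
        return "static-isp"
    if soft_fail_rate <= max_soft_fail:
        return current  # failures are boring, stay cheap
    idx = LADDER.index(current)
    return LADDER[min(idx + 1, len(LADDER) - 1)]


# Product pages with a third of bodies short on datacenter step up once.
print(choose_route("datacenter", soft_fail_rate=0.33))  # volume-residential
```

Call it once per page type (category, search, product, stock) so one hostile slice does not drag the whole run up a tier.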

Block waste before escalating the plan

Before stepping from Volume to Premium, check whether the bandwidth delta comes from unblocked assets rather than the target's actual defense. Blocking images, fonts, and tracking scripts from a proxy-routed browser can cut provider-metered bytes by 60–80% on a page-rendered scrape, which changes the cost comparison entirely.
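One way to do that blocking, sketched with Playwright's Python sync API (page.route with route.abort and route.continue_). The proxy address and the tracker host list are placeholders, and the blocked resource types are an assumption about what your parser can live without. Playwright is imported inside the function so the decision logic can be tested without a browser installed.

```python
from urllib.parse import urlparse

BLOCKED_TYPES = {"image", "font", "media"}
TRACKER_HOSTS = {"tracker.example"}  # hypothetical; use your own list


def should_block(resource_type: str, url: str) -> bool:
    """Decide whether a request is waste for a parse-only scrape."""
    if resource_type in BLOCKED_TYPES:
        return True
    host = urlparse(url).hostname or ""
    is_tracker = any(host == t or host.endswith("." + t) for t in TRACKER_HOSTS)
    return resource_type == "script" and is_tracker


def fetch_html(url: str, proxy_server: str) -> str:
    """Load a page through the proxy with waste requests aborted."""
    from playwright.sync_api import sync_playwright  # requires playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(proxy={"server": proxy_server})
        page = browser.new_page()
        page.route(
            "**/*",
            lambda route: route.abort()
            if should_block(route.request.resource_type, route.request.url)
            else route.continue_(),
        )
        page.goto(url)
        html = page.content()
        browser.close()
        return html
```

Run the same URLs with and without the route handler and compare provider-metered bytes before deciding the target needs a higher tier.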

| Proxy type | Good fit | Poor fit |
| --- | --- | --- |
| Datacenter | Public archives, feeds, simple catalogs, speed-priority tests | Targets that ASN-filter hosting ranges, account workflows |
| Volume Residential ($0.89/GB) | Commercial pages, geo-shaped content, first residential pilot | High-defense checkout flows, pages needing a fixed exit |
| Premium Residential ($5.00/GB) | Soft-blocked slices, challenge-heavy pages, postcode-specific pricing | Whole-run default when only a slice is hard |
| Static ISP | Account sessions, carts, multi-step dashboards, long-lived exits | Anonymous bulk collection, anonymous page-level scraping |

Proxy choice FAQ

When is datacenter good enough for scraping?
When the target accepts hosting ASNs, bodies are full-size, and the failure rate is low on a small pilot run. News archives, public directories, and simple catalogs often clear this bar.

When should I move from datacenter to residential?
When clean 200 responses come back with short or structurally incomplete bodies, or when the retry rate in the provider dashboard is significantly higher than what your app counts.

What is the difference between Volume and Premium Residential on a metered plan?
Volume Residential is $0.89/GB billed per transferred byte. Premium Residential is $5.00/GB billed the same way. Use Premium only on the page slice that proves it needs better IPs, not for the whole run.

Why does provider-metered bandwidth exceed what my app counted?
The provider meters every byte transferred through the proxy: redirect chains, TLS setup, CONNECT overhead, challenge pages, retries, images, fonts, and scripts. Your app typically counts only the final response body it parsed.

When should I use static ISP proxies instead of rotating residential?
When the workflow requires the same IP across multiple requests: account sessions, cart operations, or multi-step dashboards. Static ISP proxies buy identity, not anonymity.

Pilot checklist

  • 100 URLs minimum across the actual page types you will scrape at scale.
  • Log status code, body size, provider-metered bytes, and retry count per request.
  • Check for short bodies on nominally successful 200 responses before declaring success.
  • Compare provider dashboard byte total against app-level byte total to find hidden waste.
  • Block images, fonts, and tracking scripts before escalating the plan tier.