§ Use Case — Scraping

Web scraping at industrial scale.

195M+ rotating IPs, zero block lists, sub-second concurrency. Mix residential for hard targets and datacenter for bulk crawls — one gateway, one auth, every endpoint your stack needs.

§ Pillars

Why operators run this stack on Rift.

01

195M+ IP rotation pool

Per-request rotation across ethically sourced residential IPs, plus 1M+ datacenter IPs for high-volume bulk work.

02

Drop-in protocols

HTTP, HTTPS and SOCKS5 endpoints. Works out of the box with Puppeteer, Playwright, Scrapy, Selenium and cURL.

03

Unmetered concurrency

No connection caps. Spin up 10k parallel workers without throttling — pay only for the bandwidth you actually use.

Recommended product

Rotating Residential

Residential + Datacenter

Residential wins on hard targets (e-commerce, classifieds, social, SERPs). For high-volume bulk crawling of static sites, mix in Datacenter at $1/GB to cut costs by 60%+.

  • City + ASN targeting
  • Sticky sessions up to 30 min
  • Zero-log, ethically sourced
  • HTTP / HTTPS / SOCKS5

Drop-in snippet

PYTHON
# Scrapy settings.py — rotating residential
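import os

# HttpProxyMiddleware reads the proxy from these environment variables
# when the middleware is initialised, so set them before the crawl starts.
os.environ["http_proxy"]  = "http://user:[email protected]:7777"
os.environ["https_proxy"] = "http://user:[email protected]:7777"

# Enable Scrapy's built-in proxy middleware.
DOWNLOADER_MIDDLEWARES = {
    "scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware": 100,
}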

Works with Puppeteer, Playwright, Selenium, Scrapy, cURL and every commercial scraping platform.

§ The scraping playbook

How teams scrape the open web on Rift

Modern web scraping is a routing problem disguised as a parsing problem. The hardest targets — search engines, social platforms, e-commerce catalogs, classifieds, travel inventory — invest more in IP-reputation and behavioural fingerprinting than they do in HTML. Rift Proxy gives crawl engineers a single gateway with four proxy types behind it, so the same code base can pull from a 195M+ residential pool for high-defence targets and from a 1M+ datacenter pool for bulk static crawls without rewriting a single request.

Choosing the right proxy type per target

Pick residential when the target runs Cloudflare Bot Management, DataDome, PerimeterX, Akamai Bot Manager or HUMAN. Residential IPs come from real consumer ISPs in 195 countries with city, ZIP and ASN targeting, so the request looks identical to a browser sitting in that exact neighbourhood. Pick datacenter when the target is a static catalog, an open API, a sitemap walk, or any workload where speed and cost beat stealth — at $1/GB datacenter cuts your bill by more than half on bulk jobs. For mobile-only experiences (in-app APIs, region-locked apps, carrier-billed flows) layer in mobile bandwidth so the request originates from a real 4G/5G carrier subnet.
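
The selection rule above can be written down as a small routing function. This is a minimal sketch, not part of any Rift SDK: the Target class, the choose_pool function and the pool labels "mobile", "residential" and "datacenter" are hypothetical names introduced for illustration, and the vendor list simply mirrors the products named in the paragraph.

```python
from dataclasses import dataclass
from typing import Optional

# Anti-bot vendors named in the text; matching is case-insensitive.
HARD_TARGET_VENDORS = {
    "cloudflare bot management",
    "datadome",
    "perimeterx",
    "akamai bot manager",
    "human",
}


@dataclass(frozen=True)
class Target:
    """Hypothetical description of a crawl target."""
    host: str
    bot_protection: Optional[str] = None  # vendor name, if known
    mobile_only: bool = False             # in-app API, carrier-billed flow, etc.


def choose_pool(target: Target) -> str:
    """Return a hypothetical pool label for the target."""
    if target.mobile_only:
        return "mobile"
    vendor = (target.bot_protection or "").strip().lower()
    if vendor in HARD_TARGET_VENDORS:
        return "residential"
    # Static catalogs, open APIs and sitemap walks: cost and speed beat stealth.
    return "datacenter"
```

A crawler could call choose_pool once per target when a job is scheduled and pick its proxy credentials from the result, so the request code itself stays the same across pools.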

Rotation, sticky sessions and concurrency

Two rotation modes cover almost every workload. Per-request rotation gives every outbound call a fresh IP — ideal for breadth-first crawls, SERP scraping, price monitoring and any job where each request is independent. Sticky sessions lock an IP for up to 30 minutes by appending a session token to the proxy username — required for paginated tracking, multi-step checkouts, login-gated content and anything that depends on a stable cookie jar. Concurrency is unmetered: spin up 10k parallel workers in Scrapy, Playwright clusters or your own Go/Rust scrapers and only pay for the bandwidth you actually consume. There are no connection caps, no per-IP rate limits imposed by us, and no overage tiers.
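
The two modes and the worker fan-out might look roughly like this in Python, using only the standard library. It is a sketch under stated assumptions: the "-session-<token>" suffix is inferred from the username example elsewhere on this page, and sticky_username, make_fetcher and crawl are hypothetical helper names. The proxy URL passed to make_fetcher is whatever gateway address and credentials your account uses.

```python
import uuid
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional


def sticky_username(user: str, session_id: Optional[str] = None) -> str:
    """Append a session token so the gateway keeps the same exit IP.

    Assumption: the token is appended as "-session-<token>". Using the plain
    username (no token) corresponds to per-request rotation.
    """
    token = session_id or uuid.uuid4().hex[:8]
    return f"{user}-session-{token}"


def make_fetcher(proxy_url: str, timeout: float = 30.0) -> Callable[[str], bytes]:
    """Build a fetch(url) function that sends every request through proxy_url."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    )

    def fetch(url: str) -> bytes:
        with opener.open(url, timeout=timeout) as resp:
            return resp.read()

    return fetch


def crawl(urls: Iterable[str], fetch: Callable[[str], bytes],
          max_workers: int = 32) -> List[bytes]:
    """Run fetch over urls in parallel threads, preserving input order.

    An exception raised by any fetch propagates when its result is collected.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))
```

For a breadth-first crawl, one fetcher built with the plain username can serve all workers. For a multi-step flow, build one fetcher per sticky_username so each flow keeps its own IP and cookie jar.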

Stack compatibility — Scrapy, Playwright, Puppeteer, Selenium

Rift exposes plain HTTP, HTTPS and SOCKS5 endpoints, so any language with an HTTP client works without an SDK. Scrapy users wire the gateway into HttpProxyMiddleware; Playwright and Puppeteer users pass it via the --proxy-server launch flag or the per-context proxy option; Selenium users configure it through Selenium Wire or Chrome options; requests, httpx, aiohttp, node-fetch, got and axios all accept it as a standard proxy URL, directly or through a proxy agent. Chromium's --proxy-server flag carries only the host and port, so browser-based stacks supply the username and password separately, through Playwright's proxy option or Puppeteer's page.authenticate(). Geo and session targeting are encoded in the username (for example user-country-de-city-berlin-session-abc), so switching geos or sessions never requires a config reload.
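
As an illustration of the username encoding, the helpers below assemble a targeted username and turn it into the shapes that requests and Playwright expect. This is a sketch, not an official client: the function names are hypothetical, the key order (country, city, session) is copied from the example above, and the handling of multi-word city names is not covered because the page does not specify it. The host gateway.example.com in the usage note is a placeholder, not the real gateway address.

```python
from typing import Dict, Optional
from urllib.parse import quote


def targeted_username(user: str, country: Optional[str] = None,
                      city: Optional[str] = None,
                      session: Optional[str] = None) -> str:
    """Encode geo and session targeting into the proxy username."""
    parts = [user]
    if country:
        parts += ["country", country.lower()]
    if city:
        parts += ["city", city.lower()]
    if session:
        parts += ["session", session]
    return "-".join(parts)


def proxy_url(username: str, password: str, host: str, port: int,
              scheme: str = "http") -> str:
    """Build a standard proxy URL, percent-encoding the credentials."""
    return (f"{scheme}://{quote(username, safe='')}:"
            f"{quote(password, safe='')}@{host}:{port}")


def requests_proxies(url: str) -> Dict[str, str]:
    """Mapping accepted by the `proxies=` argument of requests."""
    return {"http": url, "https": url}


def playwright_proxy(username: str, password: str,
                     host: str, port: int) -> Dict[str, str]:
    """Dict accepted by the `proxy=` argument of Playwright's launch()."""
    return {"server": f"http://{host}:{port}",
            "username": username, "password": password}
```

With these in place, a call such as requests.get(url, proxies=requests_proxies(proxy_url(targeted_username("user", country="DE", city="Berlin", session="abc"), "pass", "gateway.example.com", 7777))) routes through a Berlin session, and changing the geo only means building a different username.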

Compliance, ethics and operational notes

The residential pool is sourced exclusively through opted-in SDK partnerships with revenue-share to the device owner — no malware, no hidden installs, no botnet supply. Rift maintains zero request logs (only aggregate bandwidth counters for billing), supports IP whitelisting and revocable per-key credentials from the console, and is operated under a clear acceptable-use policy that prohibits scraping personal data, CSAM, account takeover, and any activity that violates a target site's lawful terms. For enterprise crawl operations with committed-use volume, contact [email protected] for dedicated sub-pools and discounted per-TB pricing.

Open the rift. Tonight.

Spin up a key in 60 seconds. Pay-as-you-go, no commitment, no logs.