HttpCrawler

HttpCrawler(base_url='', *, timeout=TIMEOUT, auth=None, params=None, headers=None, cookies=None, follow_redirects=True, max_redirects=MAX_REDIRECTS, proxy=None)

Asynchronous HTTP crawler built on top of httpx.

Provides a lightweight wrapper around httpx.AsyncClient with a constrained, result-oriented fetch API.

The crawler manages the lifecycle of the underlying client through an asynchronous context manager, and exposes request execution through the FetchResult objects returned by fetch.
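The lifecycle pattern described above can be illustrated with a minimal sketch. It is not the crawler's actual source: StubClient is a hypothetical stand-in for httpx.AsyncClient (it mirrors only the real client's aclose() method and is_closed attribute), and LifecycleSketch is an invented name.

```python
import asyncio


class StubClient:
    """Hypothetical stand-in for httpx.AsyncClient; it only tracks open/closed state."""

    def __init__(self) -> None:
        self.is_closed = False

    async def aclose(self) -> None:
        self.is_closed = True


class LifecycleSketch:
    """Creates the client on entry and closes it on exit (assumed behaviour)."""

    def __init__(self, client_factory=StubClient) -> None:
        self._client_factory = client_factory
        self.client = None

    async def __aenter__(self) -> "LifecycleSketch":
        # The client is created only when the context is entered.
        self.client = self._client_factory()
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit and on exceptions, so the client is always released.
        await self.client.aclose()


async def demo():
    async with LifecycleSketch() as crawler:
        open_inside = not crawler.client.is_closed
    return open_inside, crawler.client.is_closed
```

Under these assumptions, the client is open inside the `async with` block and closed once the block exits, whether or not the body raised.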

fetch(url, *, params=None, headers=None, cookies=None, auth=None, follow_redirects=True, timeout=None)

Prepares an HTTP GET request.

Returns a FetchResult wrapper that defers request execution until awaited or otherwise resolved.

Arguments passed to fetch override the corresponding defaults set on the HttpCrawler constructor.
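One way to picture deferred execution and per-request overrides is the sketch below. It is an illustration under stated assumptions, not the library's implementation: DeferredResult and SketchCrawler are hypothetical names, fake_send stands in for the real httpx call, and merging request headers over default headers is an assumed policy.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

# Hypothetical transport signature; the real crawler would call httpx here.
Sender = Callable[[str, Dict[str, Any]], Awaitable[str]]


class DeferredResult:
    """Holds a prepared request and runs it only when awaited."""

    def __init__(self, send: Sender, url: str, options: Dict[str, Any]) -> None:
        self._send = send
        self._url = url
        self._options = options

    def __await__(self):
        # Nothing is sent until the result is awaited.
        return self._send(self._url, self._options).__await__()


class SketchCrawler:
    def __init__(self, send: Sender, *, timeout: float = 10.0, headers=None) -> None:
        self._send = send
        self._defaults = {"timeout": timeout, "headers": dict(headers or {})}

    def fetch(self, url: str, *, headers=None, timeout=None) -> DeferredResult:
        options = dict(self._defaults)
        if timeout is not None:
            options["timeout"] = timeout  # the per-request value wins
        # Assumed policy: request headers are layered over the defaults.
        options["headers"] = {**self._defaults["headers"], **(headers or {})}
        return DeferredResult(self._send, url, options)


calls = []


async def fake_send(url: str, options: Dict[str, Any]) -> str:
    calls.append((url, options))
    return f"GET {url}"


async def main():
    crawler = SketchCrawler(fake_send, timeout=10.0, headers={"User-Agent": "demo"})
    pending = crawler.fetch("/items", timeout=2.0, headers={"Accept": "text/html"})
    sent_before_await = len(calls)  # fetch() only prepared the request
    body = await pending            # the request runs here
    return sent_before_await, body
```

Calling fetch in this sketch is synchronous and cheap; the stand-in transport is invoked only when the returned object is awaited, with the merged options.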

rotate_ip(host, password) async

Requests a new Tor exit IP address.

Sends a NEWNYM signal to the Tor control port associated with the given host.

The operation is executed in a worker thread to avoid blocking the event loop.
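The following sketch shows one way such a rotation could work; the crawler's own implementation may differ, for example by using a Tor controller library. It assumes password authentication, Tor's conventional control port 9051, and plain sockets. The port parameter and the _send_newnym helper are hypothetical additions that are not part of the documented signature.

```python
import asyncio
import socket


def _send_newnym(host: str, password: str, port: int = 9051, timeout: float = 10.0) -> None:
    """Blocking helper: authenticate to a Tor control port, then send SIGNAL NEWNYM."""
    # Escape backslashes and quotes so the password is a valid quoted string.
    quoted = password.replace("\\", "\\\\").replace('"', '\\"')
    commands = (f'AUTHENTICATE "{quoted}"', "SIGNAL NEWNYM")
    with socket.create_connection((host, port), timeout=timeout) as sock, \
            sock.makefile("rb") as reader:
        for command in commands:
            sock.sendall(command.encode("ascii") + b"\r\n")
            reply = reader.readline().decode("ascii", "replace").strip()
            # The control protocol answers with a 250 status line on success.
            if not reply.startswith("250"):
                raise RuntimeError(f"Tor control port rejected {command.split()[0]}: {reply}")


async def rotate_ip(host: str, password: str, port: int = 9051) -> None:
    # asyncio.to_thread (Python 3.9+) runs the blocking socket exchange in a
    # worker thread, so the event loop stays free for other tasks.
    await asyncio.to_thread(_send_newnym, host, password, port)
```

With a Tor instance whose control port is enabled, `await rotate_ip("127.0.0.1", "my-password")` would request a new circuit. Tor rate-limits NEWNYM, so a new exit address is not guaranteed immediately after every call.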