Scraper Rate Limiter and Politeness Layer

Author: FindPrompts

Implement an adaptive rate limiter that keeps a crawler polite and unblocked.

6/11/2026

Prompt

## CONTEXT
The developer's crawler is either too slow or getting throttled. They need a smart rate-limiting layer that adapts to server responses, spreads load, and keeps the crawl polite while still being efficient.

## ROLE
Act as a distributed-systems engineer specializing in adaptive rate control, backoff algorithms, and server-friendly crawling.

## RESPONSE GUIDELINES
- Recommend per-host rate limits, not just global ones.
- Provide an adaptive algorithm that reacts to server responses (latency, status codes, and headers).
- Include jitter to avoid synchronized bursts.
- Show how to back off on errors and rate-limit signals such as HTTP 429 or 503.
- Keep the design simple and testable.

## TASK CRITERIA

### Rate Control
- Enforce per-host requests-per-second limits.
- Add randomized jitter to spread requests.
- Support a token-bucket or leaky-bucket model.
- Make limits configurable per domain.
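
A minimal sketch of the rate-control criteria above, assuming a token-bucket model with one bucket per host. The names `TokenBucket`, `HostRateLimiter`, `overrides`, and `jitter` are hypothetical, chosen only for illustration. The limiter returns a delay rather than sleeping, so it can be tested with a fake clock.

```python
import random
import time


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity` tokens."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = now

    def reserve(self, now):
        """Take one token; return the seconds to wait before using it."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        self.tokens -= 1.0  # may go negative: that is a queued reservation
        return 0.0 if self.tokens >= 0 else -self.tokens / self.rate


class HostRateLimiter:
    """One bucket per host, with optional per-domain rate overrides."""

    def __init__(self, default_rate=1.0, burst=1.0, jitter=0.25, overrides=None):
        self.default_rate = default_rate  # requests per second
        self.burst = burst                # bucket capacity
        self.jitter = jitter              # fraction of the request spacing
        self.overrides = overrides or {}  # host -> requests per second
        self.buckets = {}

    def delay_for(self, host, now=None):
        now = time.monotonic() if now is None else now
        bucket = self.buckets.get(host)
        if bucket is None:
            rate = self.overrides.get(host, self.default_rate)
            bucket = self.buckets[host] = TokenBucket(rate, self.burst, now)
        base = bucket.reserve(now)
        spacing = 1.0 / bucket.rate
        # Jitter only ever adds delay, so the configured rate is never exceeded.
        return base + random.uniform(0.0, self.jitter * spacing)
```

A caller would sleep for `limiter.delay_for(host)` seconds before sending each request. This sketch is not thread-safe; a shared limiter would need a lock around `delay_for`.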

### Adaptive Behavior
- Slow down on rising latency or error rates.
- Speed up cautiously when responses are healthy.
- Honor Retry-After and rate-limit response headers.
- Cap the maximum aggressiveness.
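
One way to meet the adaptive criteria above is additive-increase, multiplicative-decrease (AIMD): raise the rate by a small step on healthy responses and cut it sharply on trouble. The sketch below is an assumption, not a prescribed design. `AdaptiveRate`, `parse_retry_after`, and all thresholds are hypothetical names and values.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, now_utc=None):
    """Return the Retry-After delay in seconds, or None if unparseable."""
    value = value.strip()
    if value.isdigit():                      # delay-seconds form, e.g. "120"
        return float(value)
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:                  # treat a naive date as UTC
        when = when.replace(tzinfo=timezone.utc)
    now_utc = now_utc or datetime.now(timezone.utc)
    return max(0.0, (when - now_utc).total_seconds())


class AdaptiveRate:
    """AIMD controller for one host's requests-per-second rate."""

    def __init__(self, initial=1.0, floor=0.1, ceiling=5.0,
                 step=0.1, cut=0.5, slow_latency=2.0):
        self.rate = initial
        self.floor = floor                # never slower than this
        self.ceiling = ceiling            # cap on aggressiveness
        self.step = step                  # additive increase
        self.cut = cut                    # multiplicative decrease factor
        self.slow_latency = slow_latency  # seconds; slower counts as trouble
        self.paused_until = 0.0           # no requests before this time

    def observe(self, status, latency, now, retry_after=None):
        """Update the rate from one response and return the new rate."""
        throttled = status in (429, 503)
        if throttled or status >= 500 or latency > self.slow_latency:
            self.rate = max(self.floor, self.rate * self.cut)
        else:
            self.rate = min(self.ceiling, self.rate + self.step)
        if throttled and retry_after is not None:
            # Honor the server's requested pause; never shorten an existing one.
            self.paused_until = max(self.paused_until, now + retry_after)
        return self.rate
```

The asymmetry is deliberate: recovery is slow and cautious, while the reaction to throttling is immediate.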

### Backoff
- Use exponential backoff with a ceiling.
- Distinguish transient from permanent errors.
- Reset backoff after sustained success.
- Give up gracefully after repeated failure.
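
The backoff criteria above can be sketched as exponential backoff with a ceiling and full jitter. The `Backoff` class, the `TRANSIENT` status set, and the default values are illustrative assumptions. Which status codes count as transient is a policy decision for each crawler.

```python
import random

# Assumed set of retryable statuses; everything else is treated as permanent.
TRANSIENT = {408, 425, 429, 500, 502, 503, 504}


def is_transient(status):
    return status in TRANSIENT


class Backoff:
    """Tracks consecutive failures and computes the next retry delay."""

    def __init__(self, base=1.0, ceiling=60.0, max_attempts=5,
                 reset_after=10, rng=random.random):
        self.base = base                  # first delay bound, in seconds
        self.ceiling = ceiling            # largest delay bound, in seconds
        self.max_attempts = max_attempts  # give up after this many failures
        self.reset_after = reset_after    # successes needed to reset
        self.rng = rng                    # injectable for deterministic tests
        self.failures = 0
        self.successes = 0

    def on_success(self):
        self.successes += 1
        if self.successes >= self.reset_after:
            self.failures = 0  # sustained success: start from base again

    def on_failure(self, status):
        """Return seconds to wait before retrying, or None to give up."""
        self.successes = 0
        if not is_transient(status):
            return None  # permanent error: retrying will not help
        self.failures += 1
        if self.failures > self.max_attempts:
            return None
        bound = min(self.ceiling, self.base * 2 ** (self.failures - 1))
        # Full jitter: a random delay in [0, bound) keeps workers from
        # retrying in lockstep.
        return self.rng() * bound
```

A `None` result signals the caller to drop or park the URL instead of retrying.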

### Concurrency
- Limit concurrent connections per host.
- Coordinate limits across worker threads.
- Avoid thundering-herd retries.
- Prioritize fresh or important URLs.
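
For a thread-based crawler, the concurrency criteria above might look like the following sketch: a per-host semaphore shared by all workers, plus a priority queue for the URL frontier. `HostGate` and `Frontier` are hypothetical names, and an async or multi-process crawler would need different primitives.

```python
import heapq
import itertools
import threading


class HostGate:
    """Caps concurrent connections per host across worker threads."""

    def __init__(self, max_per_host=2):
        self.max_per_host = max_per_host
        self._lock = threading.Lock()
        self._sems = {}

    def _sem(self, host):
        with self._lock:  # guard lazy creation of each host's semaphore
            if host not in self._sems:
                self._sems[host] = threading.BoundedSemaphore(self.max_per_host)
            return self._sems[host]

    def try_acquire(self, host):
        """Return True if a connection slot was taken, False if host is full."""
        return self._sem(host).acquire(blocking=False)

    def release(self, host):
        self._sem(host).release()


class Frontier:
    """Thread-safe URL queue; a lower priority number is served first."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order
        self._lock = threading.Lock()

    def push(self, url, priority):
        with self._lock:
            heapq.heappush(self._heap, (priority, next(self._order), url))

    def pop(self):
        """Return the most important URL, or None when the queue is empty."""
        with self._lock:
            return heapq.heappop(self._heap)[2] if self._heap else None
```

A worker that fails `try_acquire` can push the URL back and pick another host instead of blocking, which keeps one slow host from stalling the pool.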

### Observability
- Expose current rate and queue depth.
- Log throttle events and reasons.
- Track success, retry, and block counts.
- Alert when a host blocks the crawler.
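
A small sketch of the observability criteria above, using only the standard library. `HostStats`, the logger name, and the choice to treat 403 and 429 as block signals are assumptions. A real deployment would likely export these numbers to its own metrics system.

```python
import logging
from collections import Counter, defaultdict

log = logging.getLogger("crawler.ratelimit")  # hypothetical logger name


class HostStats:
    """Per-host counters plus a simple consecutive-block alert."""

    def __init__(self, block_threshold=3):
        self.block_threshold = block_threshold
        self.counts = defaultdict(Counter)
        self.consecutive_blocks = Counter()

    def record(self, host, status, retried=False):
        """Record one response; return True if the host looks blocked."""
        counts = self.counts[host]
        if retried:
            counts["retry"] += 1
        if status in (403, 429):  # assumed block signals
            counts["block"] += 1
            self.consecutive_blocks[host] += 1
            log.warning("throttle host=%s status=%s", host, status)
        else:
            self.consecutive_blocks[host] = 0
            if 200 <= status < 400:
                counts["success"] += 1
        return self.consecutive_blocks[host] >= self.block_threshold

    def snapshot(self, host, current_rate, queue_depth):
        """Return a dict suitable for a metrics endpoint or a log line."""
        return {"host": host, "rate": current_rate,
                "queue_depth": queue_depth, **self.counts[host]}
```

The boolean returned by `record` is the alert hook: the caller decides whether to page someone, pause the host, or both.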

## ASK THE USER FOR
- The target hosts and any known limits.
- Their concurrency model (threads, async, workers).
- Acceptable total crawl duration.
- Their preferred language or framework.
