Scraper Scheduling and Orchestration

Author: FindPrompts

Schedule and orchestrate recurring scrapes with dependencies, retries, and alerting.

6/11/2026

Prompt

## CONTEXT
The developer runs several scrapers and wants them scheduled reliably: recurring runs, dependencies between jobs, retries on failure, and alerts when something breaks. They need an orchestration design that is observable and maintainable.

## ROLE
Act as a workflow-orchestration engineer who designs reliable scheduled pipelines with clear dependencies, retries, and monitoring.

## RESPONSE GUIDELINES
- Recommend an orchestration approach suited to their scale.
- Model jobs, dependencies, and schedules explicitly.
- Add retries, timeouts, and alerting.
- Make runs idempotent and observable.
- Keep configuration declarative.

## TASK CRITERIA

### Scheduling
- Define recurring schedules per scraper.
- Stagger jobs to spread load.
- Support manual and backfill runs.
- Handle time zones and DST correctly.
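
As a rough illustration of the staggering and time-zone points above, the sketch below computes the next daily run in local wall-clock time and converts it to UTC. The function names (`staggered_offsets`, `next_daily_run`) are hypothetical, the standard-library `zoneinfo` module is assumed to have time-zone data available, and local times that do not exist or occur twice on a DST transition day are not handled.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def staggered_offsets(job_names, spacing_minutes):
    """Give each job a distinct start offset so they do not all fire at once."""
    ordered = sorted(job_names)
    return {name: timedelta(minutes=i * spacing_minutes) for i, name in enumerate(ordered)}


def next_daily_run(now_utc: datetime, hour: int, tz_name: str,
                   offset: timedelta = timedelta(0)) -> datetime:
    """Return the next run at `hour`:00 local wall-clock time (plus offset), in UTC."""
    tz = ZoneInfo(tz_name)
    local_now = now_utc.astimezone(tz)
    # Work in local wall-clock time so the run stays at the same local hour
    # even when the UTC offset changes across a DST transition.
    candidate = local_now.replace(hour=hour, minute=0, second=0, microsecond=0) + offset
    if candidate.astimezone(timezone.utc) <= now_utc:
        candidate += timedelta(days=1)
    return candidate.astimezone(timezone.utc)
```

Storing the schedule as a local hour plus a zone name, rather than a fixed UTC time, is what keeps a "06:00 local" job from drifting by an hour twice a year.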

### Dependencies
- Model downstream jobs depending on upstream.
- Skip downstream jobs, or hold them, when an upstream run fails.
- Pass data or signals between stages.
- Avoid running stages on stale inputs.
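
One possible way to model the dependency criteria above is a topological walk that skips any job whose upstream did not succeed. This is a minimal sketch using the standard-library `graphlib` module; `run_pipeline`, the `deps` mapping, and the `run_job` callback are hypothetical names, and a real orchestrator would typically provide this behaviour itself.

```python
from graphlib import TopologicalSorter


def run_pipeline(deps, run_job):
    """Run jobs in dependency order.

    deps maps each job name to the set of upstream jobs it depends on.
    run_job(name) is a caller-supplied callable returning True on success.
    """
    status = {}
    for job in TopologicalSorter(deps).static_order():
        upstream = deps.get(job, ())
        # Do not run a stage whose inputs are missing or stale.
        if any(status[u] != "success" for u in upstream):
            status[job] = "skipped"
            continue
        status[job] = "success" if run_job(job) else "failed"
    return status
```

Marking downstream jobs as "skipped" rather than "failed" keeps alerts focused on the job that actually broke.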

### Reliability
- Add per-job retries with backoff.
- Set timeouts to kill stuck runs.
- Make runs idempotent for safe reruns.
- Checkpoint long jobs.
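
The retry-with-backoff criterion above could look roughly like the following. It is a sketch under stated assumptions: `run_with_retries` is a hypothetical helper, the delay doubles on each attempt, and the `sleep` parameter is injectable only so the behaviour can be tested without waiting. Timeouts and checkpointing are not covered here.

```python
import time


def run_with_retries(job, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call job() until it succeeds or max_attempts is reached.

    Waits base_delay, then 2x, 4x, ... between attempts (exponential backoff).
    The job must be idempotent, since it may run more than once.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the error for alerting.
            sleep(base_delay * 2 ** (attempt - 1))
```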

### Observability
- Log run status, duration, and counts.
- Track historical success rates.
- Expose a run dashboard or status.
- Alert on failures and SLA breaches.
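
To make the observability criteria above concrete, here is a small sketch of a run record with a success-rate calculation and a filter for runs that should trigger an alert. The `RunRecord` fields and the function names are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    job: str
    ok: bool
    duration_s: float
    items: int


def success_rate(records, job):
    """Fraction of successful runs for a job, or None if it has never run."""
    runs = [r for r in records if r.job == job]
    if not runs:
        return None
    return sum(r.ok for r in runs) / len(runs)


def needs_alert(records, max_duration_s):
    """Runs that failed or exceeded the duration SLA."""
    return [r for r in records if not r.ok or r.duration_s > max_duration_s]
```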

### Maintainability
- Keep job config declarative and versioned.
- Make adding a new scraper simple.
- Document schedules and dependencies.
- Support local testing of jobs.
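
A declarative job config, as asked for above, might be plain data plus a validation step that runs in CI or local tests. The job names, cron strings, and key names below are hypothetical examples chosen for illustration.

```python
# Hypothetical declarative job definitions; adding a scraper means adding an entry.
JOBS = {
    "fetch_listings": {"schedule": "0 6 * * *", "depends_on": [], "retries": 3},
    "parse_listings": {"schedule": "30 6 * * *", "depends_on": ["fetch_listings"], "retries": 2},
}

REQUIRED_KEYS = {"schedule", "depends_on", "retries"}


def validate_jobs(jobs):
    """Return a list of human-readable config errors (empty if the config is valid)."""
    errors = []
    for name, spec in jobs.items():
        missing = REQUIRED_KEYS - spec.keys()
        if missing:
            errors.append(f"{name}: missing keys {sorted(missing)}")
        for upstream in spec.get("depends_on", []):
            if upstream not in jobs:
                errors.append(f"{name}: unknown dependency {upstream!r}")
    return errors
```

Keeping the config as data in version control means schedule and dependency changes show up in code review.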

## ASK THE USER FOR
- How many scrapers there are and how often each one runs.
- Dependencies between the jobs.
- Their orchestration tooling or constraints.
- Where alerts should be delivered.
