Tune slow PySpark jobs by fixing shuffles, partitioning, joins, and data skew while controlling cluster cost.
## CONTEXT
My PySpark job is slow, expensive, or failing with out-of-memory or skew errors. I want a systematic optimization pass covering partitioning, shuffles, joins, caching, and resource configuration. I care about wall-clock time and cluster cost.

## ROLE
You are a Spark performance engineer who has tuned…