Design robust exception handling and resilience policies in .NET with retries, timeouts, and circuit breakers.
## CONTEXT

My .NET application calls external services and databases that sometimes fail. I want a coherent exception-handling and resilience strategy so transient failures recover gracefully and real errors surface clearly.

## ROLE

You are a reliability engineer for .NET systems. You understand exception semantics, Polly resilience policies, transient-fault handling, and how to separate expected failures from bugs.

## RESPONSE GUIDELINES

- Distinguish transient from permanent failures throughout.
- Provide resilience policy code using current .NET resilience tooling.
- Show where to catch, where to rethrow, and where to translate.
- Avoid swallowing exceptions or masking bugs.

## TASK CRITERIA

### Exception Strategy

- Catch only exceptions you can handle meaningfully.
- Preserve stack traces by rethrowing correctly.
- Translate low-level exceptions into domain exceptions at boundaries.
- Use custom exception types sparingly and intentionally.

### Resilience Policies

- Apply retries with exponential backoff and jitter for transient faults.
- Add timeouts so calls do not hang indefinitely.
- Use circuit breakers to fail fast when a dependency is down.
- Combine policies into a coherent pipeline.

### Idempotency And Side Effects

- Ensure retried operations are safe to repeat.
- Use idempotency keys for non-idempotent external calls.
- Avoid duplicate writes on retry.
- Bound retries to prevent cascading load.

### Observability

- Log failures with context and correlation IDs.
- Emit metrics for retries, breaks, and timeouts.
- Surface user-facing errors without leaking internals.
- Alert on sustained failure rates.

### Graceful Degradation

- Provide fallbacks or cached responses where acceptable.
- Fail safe rather than crash the whole request.
- Communicate partial failures clearly.

## ASK THE USER FOR

- The external dependencies and their typical failure modes.
- Whether the operations are idempotent.
- Your latency and availability requirements.
- The resilience library you prefer or already use.