Patterns for exception handling across application layers: mapping, retry, severity, logging. Does NOT handle: exception hierarchies (use exception-builder), circuit breakers (use resilience-patterns), API errors (use api-design), logging config (use logging-builder). Use when designing error strategies, implementing retry logic, or mapping exceptions between layers. Recognizes: "error-handling", "exception handling", "retry pattern", "error strategy", "exception mapping", "error boundaries", "backoff pattern", "error propagation"
npx skillscat add 101mare/skill-library/error-handling Install via the SkillsCat registry.
Error Handling Patterns
Exception handling strategies across application layers.
Layer Strategy
Each layer has a specific error handling responsibility:
| Infrastructure Layer (adapters, I/O) | Service Layer (business logic) | API/CLI Layer (entry points) |
|---|---|---|
| Catch: external errors | Catch: infra errors | Catch: all AppError |
| Translate: to domain | Translate: to domain | Translate: to response |
| Propagate: domain exceptions | Propagate: up | Log: at boundary |
| Retry: transient | Retry: if idempotent | Return: status code |

Infrastructure: Catch and Translate

    import psycopg2

    # ConnectionError, DataIntegrityError and ResourceNotFound are the
    # application's domain exceptions, not the built-ins.
    class DatabaseAdapter:
        def get_user(self, user_id: str) -> User:
            try:
                # self._conn is assumed to return a single row or None
                row = self._conn.execute("SELECT ...", (user_id,))
            except psycopg2.OperationalError as e:
                raise ConnectionError("database", "unreachable") from e
            except psycopg2.IntegrityError as e:
                raise DataIntegrityError(str(e)) from e
            if row is None:
                raise ResourceNotFound("user", user_id)
            return User.from_row(row)

Service: Business Logic Errors

    class UserService:
        def create_user(self, data: UserInput) -> User:
            if self.repo.exists(data.email):
                raise DuplicateError(f"User {data.email} already exists")
            try:
                user = User.from_input(data)
                self.repo.save(user)
                return user
            except ConnectionError:
                raise  # Let infra errors propagate

CLI/API: Catch All, Log, Exit

    def main() -> int:
        try:
            config = load_config()
            run(config)
            return 0
        except ConfigError as e:
            logger.error("Configuration error: %s", e)
            return 2
        except AppError as e:
            logger.error("Error: %s", e)
            return 1
        except KeyboardInterrupt:
            logger.info("Interrupted by user")
            return 130
        except Exception:
            logger.exception("Unexpected error")
            return 3

Exception Mapping at Boundaries

    # Map infrastructure exceptions to domain exceptions
    EXCEPTION_MAP = {
        ConnectionRefusedError: lambda e: ConnectionError("service", str(e)),
        TimeoutError: lambda e: ServiceTimeoutError(str(e)),
        FileNotFoundError: lambda e: ResourceNotFound("file", str(e)),
    }

    def map_exception(exc: Exception) -> AppError:
        """Translate infrastructure exceptions to domain exceptions."""
        for exc_type, factory in EXCEPTION_MAP.items():
            if isinstance(exc, exc_type):
                return factory(exc)
        return AppError(f"Unexpected: {exc}")

Retry with Exponential Backoff

    import time
    import logging
    from typing import TypeVar, Callable

    logger = logging.getLogger(__name__)
    T = TypeVar("T")

    def retry_with_backoff(
        func: Callable[[], T],
        max_attempts: int = 3,
        base_delay: float = 1.0,
        max_delay: float = 30.0,
        retryable: tuple[type[Exception], ...] = (ConnectionError, TimeoutError),
    ) -> T:
        """Retry with exponential backoff for transient failures."""
        for attempt in range(1, max_attempts + 1):
            try:
                return func()
            except retryable as e:
                if attempt == max_attempts:
                    logger.error(
                        "All %d attempts failed: %s",
                        max_attempts, e,
                    )
                    raise
                delay = min(base_delay * (2 ** (attempt - 1)), max_delay)
                logger.warning(
                    "Attempt %d/%d failed: %s. Retrying in %.1fs",
                    attempt, max_attempts, e, delay,
                )
                time.sleep(delay)
        raise RuntimeError("Unreachable")  # Type checker satisfaction

Usage:

    result = retry_with_backoff(
        lambda: api_client.complete(prompt),
        max_attempts=4,
        retryable=(ConnectionError, TimeoutError),
    )

Severity Matrix
| Error Type | Retry? | Log Level | User Action |
|---|---|---|---|
| Connection timeout | Yes | WARNING | Wait |
| Authentication failure | No | ERROR | Fix credentials |
| Invalid input | No | WARNING | Fix input |
| Resource not found | No | INFO | Check request |
| Rate limited | Yes (with backoff) | WARNING | Wait |
| Server error (500) | Yes | ERROR | Wait or escalate |
| Config invalid | No | CRITICAL | Fix config, restart |
| Out of memory | No | CRITICAL | Reduce load |
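
The matrix above can also be kept as data so that retry and logging decisions are looked up rather than scattered through handlers. The sketch below is one possible encoding, not a prescribed one: `ErrorPolicy`, `SEVERITY_MATRIX`, `policy_for`, the string keys, and the conservative default for unknown error types are all assumptions made for illustration.

```python
import logging
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorPolicy:
    """One row of the severity matrix (hypothetical helper type)."""
    retry: bool
    log_level: int
    user_action: str


# Keys are illustrative labels; the values mirror the table rows.
SEVERITY_MATRIX = {
    "connection_timeout": ErrorPolicy(True, logging.WARNING, "Wait"),
    "authentication_failure": ErrorPolicy(False, logging.ERROR, "Fix credentials"),
    "invalid_input": ErrorPolicy(False, logging.WARNING, "Fix input"),
    "resource_not_found": ErrorPolicy(False, logging.INFO, "Check request"),
    "rate_limited": ErrorPolicy(True, logging.WARNING, "Wait"),
    "server_error": ErrorPolicy(True, logging.ERROR, "Wait or escalate"),
    "config_invalid": ErrorPolicy(False, logging.CRITICAL, "Fix config, restart"),
    "out_of_memory": ErrorPolicy(False, logging.CRITICAL, "Reduce load"),
}

# Assumed default for anything not in the table: do not retry, log as ERROR.
DEFAULT_POLICY = ErrorPolicy(False, logging.ERROR, "Escalate")


def policy_for(error_type: str) -> ErrorPolicy:
    """Look up the policy for an error type, falling back to the default."""
    return SEVERITY_MATRIX.get(error_type, DEFAULT_POLICY)
```

A caller could then write `logger.log(policy_for("rate_limited").log_level, ...)` and consult `.retry` before deciding whether to retry at all.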
Logging Exceptions Correctly

    # GOOD: logger.exception() includes full traceback
    # (handled here at the boundary, so no re-raise)
    try:
        result = process(data)
    except ProcessingError:
        logger.exception("Processing failed for item %s", item_id)

    # GOOD: exc_info=True attaches the traceback at any log level, but only
    # while an exception is being handled; elsewhere pass the exception itself
    logger.error("Operation failed", exc_info=True)

    # BAD: Loses traceback
    try:
        result = process(data)
    except ProcessingError as e:
        logger.error(f"Failed: {e}")  # No traceback!

    # BAD: Log then re-raise at same level (duplicate entries)
    try:
        result = process(data)
    except ProcessingError:
        logger.error("Failed")  # Logged here...
        raise  # ...and logged again at the boundary

Rule: Log at the handling boundary only. If you re-raise, don't log (the catcher will log).
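
The rule can be checked with a small self-contained script. This is a sketch under stated assumptions: `AppError` here is a stand-in for the real domain base class, and `ListHandler`, `infrastructure_call`, `service_call`, and `boundary` are hypothetical names. The lower layers translate or propagate without logging, and only the boundary calls `logger.exception()`, so exactly one record is produced and it still carries the original cause.

```python
import logging


class AppError(Exception):
    """Stand-in for the application's domain base exception (assumption)."""


class ListHandler(logging.Handler):
    """Collects log records in memory so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.records: list[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


logger = logging.getLogger("boundary_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demo output out of the root logger
handler = ListHandler()
logger.addHandler(handler)


def infrastructure_call() -> None:
    try:
        raise TimeoutError("socket timed out")
    except TimeoutError as e:
        # Translate and chain, but do not log here.
        raise AppError("service unavailable") from e


def service_call() -> None:
    # Propagate upward without catching or logging.
    infrastructure_call()


def boundary() -> int:
    try:
        service_call()
        return 0
    except AppError:
        # The only place that logs; the traceback includes the chained cause.
        logger.exception("Request failed")
        return 1


exit_code = boundary()
```

Adding a `logger.error(...)` call inside `infrastructure_call` would produce a second record for the same failure, which is the duplication the rule is meant to prevent.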
Graceful Degradation

    def process_with_fallback(document: Document) -> Result:
        """Try primary method, fall back to simpler approach."""
        try:
            return full_processing(document)
        except ExternalServiceError:
            logger.warning("Full processing unavailable, using fallback")
            return basic_processing(document)
        except ProcessingError:
            logger.warning("Processing failed, returning partial result")
            return Result(status="partial", data=document.metadata)

Anti-Patterns
| Anti-Pattern | Problem | Fix |
|---|---|---|
| `except Exception: pass` | Silent failures | Catch specific, log, handle |
| `except:` (bare) | Catches SystemExit, KeyboardInterrupt | Specify exception type |
| Exception as flow control | Slow, unclear intent | Use return values, Optional |
| Catch-log-reraise at every layer | Duplicate log entries | Log only at handling boundary |
| Retry non-idempotent operations | Data duplication | Only retry safe operations |
| Infinite retry | Hangs forever | Always set max_attempts |
| Generic error messages | Impossible to debug | Include context (IDs, types) |
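
To make the "Infinite retry" row concrete: once `max_attempts` and `max_delay` are set, the worst-case time spent sleeping is bounded and can be computed ahead of time. The sketch below reuses the delay formula from `retry_with_backoff`; `backoff_schedule` itself is a hypothetical helper, and it assumes no sleep follows the final failed attempt, as in that function.

```python
def backoff_schedule(
    max_attempts: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
) -> list[float]:
    """Return the delays slept between attempts (one fewer than max_attempts)."""
    return [
        min(base_delay * (2 ** (attempt - 1)), max_delay)
        for attempt in range(1, max_attempts)
    ]


# Seven attempts: the sixth delay would be 32s but is capped at max_delay.
schedule = backoff_schedule(max_attempts=7)
worst_case_wait = sum(schedule)  # 1 + 2 + 4 + 8 + 16 + 30 = 61 seconds
```

The bound covers sleep time only; the duration of each call attempt comes on top of it.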
Checklist
- Each layer has clear error handling responsibility
- Infrastructure exceptions translated to domain exceptions
- `from e` on all exception chaining
- Retry only for transient, idempotent failures
- Exponential backoff with max delay
- Logging at handling boundary only (not at every re-raise)
- CLI/API catches AppError and translates to exit code/HTTP status
- No bare `except:` anywhere
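
Two checklist items, chaining with `from e` and translating `AppError` to an exit code, can be sketched together. The classes below are minimal stand-ins for the real hierarchy, and `read_settings`, `to_exit_code`, and the `exit_code` class attribute are assumptions made for this example. The numeric codes follow the `main()` example earlier in this document.

```python
class AppError(Exception):
    """Stand-in domain base exception (assumption)."""
    exit_code = 1


class ConfigError(AppError):
    """Stand-in configuration error (assumption)."""
    exit_code = 2


def read_settings(path: str) -> str:
    """Read a settings file, translating the I/O error to a domain error."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError as e:
        # "from e" keeps the original exception reachable via __cause__.
        raise ConfigError(f"settings file missing: {path}") from e


def to_exit_code(exc: BaseException) -> int:
    """Translate an exception into a process exit code at the boundary."""
    if isinstance(exc, KeyboardInterrupt):
        return 130
    if isinstance(exc, AppError):
        return exc.exit_code
    return 3  # unexpected error
```

Because the code lives on the exception class, adding a new `AppError` subclass does not require another `except` branch at the entry point.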