# Celery gotchas

## Async Reliability with Celery

### Task Loss Prevention Between Web Process and Broker

- **Enable broker confirmation**: Configure `confirm_publish` on RabbitMQ so tasks are actually committed to the broker before the `delay()` call returns.
- **Pass data references, not data values**: Use S3 URLs or database IDs instead of large Python objects as task arguments, to avoid oversized messages that can crash workers.
- **Implement database-sourced task recovery**: Use Celery Beat with periodic tasks that check the database and re-queue missed work (e.g., verification emails) for automatic recovery.
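As a sketch, publisher confirms can be enabled through kombu's broker transport options (the broker URL below is a placeholder):

```python
# celeryconfig.py -- illustrative sketch for a RabbitMQ (py-amqp) broker
broker_url = "amqp://guest:guest@localhost:5672//"

# Ask RabbitMQ for publisher confirms so .delay()/.apply_async() only
# return after the broker has accepted the message.
broker_transport_options = {"confirm_publish": True}
```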
### Task Loss Prevention Between Broker and Worker

- **Set `task_acks_late = True`**: Tasks remain on the broker until the worker acknowledges completion, enabling redelivery if a worker crashes.
- **Use `transaction.on_commit()`**: Only queue tasks after the database transaction commits, to avoid race conditions where a task runs before its data is saved.
- **Make tasks idempotent**: Use ORM methods like `get_or_create()` and `update_or_create()` so tasks can be safely retried multiple times.
- **Wrap tasks in `transaction.atomic()`**: Ensure database changes are rolled back if a task is interrupted.
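The `on_commit` and idempotency advice combine into a pattern like this sketch (it assumes a Django project; the `Signup` model and `send_verification_email` task are hypothetical names, not from the original code):

```python
# Sketch only: assumes Django; Signup and send_verification_email
# are illustrative names.
from django.db import transaction
from myapp.models import Signup
from myapp.tasks import send_verification_email

def register(email):
    with transaction.atomic():
        # Idempotent write: safe if this code path is retried.
        signup, _ = Signup.objects.get_or_create(email=email)
        # Queue only after COMMIT, so the worker can always see the row.
        transaction.on_commit(lambda: send_verification_email.delay(signup.pk))
```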
### Worker Reliability Configuration

- **Set `task_reject_on_worker_lost = True`**: Enable task redelivery even when workers die from memory errors or SIGKILL.
- **Handle all exceptions properly**: Treat task exceptions with the same care as 500 errors in web views, using Celery's retry functionality for intermittent failures.
- **Use RabbitMQ over Redis/SQS**: RabbitMQ's connection-based redelivery is more reliable than visibility-timeout mechanisms.
### Deployment Safety

- **Empty queues before changing task signatures**: Ensure no old tasks remain when modifying function parameters.
- **Avoid ETA/countdown tasks beyond a few seconds**: These live in worker memory and complicate deployments.
- **Use graceful shutdown (SIGTERM)**: Avoid SIGKILL during deploys to prevent tasks being dropped.
### Alternative Approaches

- **Use dedicated workflow tools for complex orchestration**: Consider Prefect, Temporal, or Airflow instead of Celery Canvas for complex workflows.
- **Implement proper monitoring and alerting**: Set up observability tools specifically for Celery task execution.
- **Configure task time limits and expiration**: Prevent clogged queues and outdated notifications.
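One way to express both limits, as a sketch (the values, the `app` instance, and the `send_notification` task are illustrative):

```python
# Illustrative values; tune to your workload.
app.conf.task_soft_time_limit = 90   # raises SoftTimeLimitExceeded in the task
app.conf.task_time_limit = 120       # hard kill after two minutes

# Drop a queued notification if no worker picks it up within 5 minutes,
# so users never receive stale alerts.
send_notification.apply_async(args=[user_id], expires=300)
```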
## Celery Canvas Best Practices

### Workflow Patterns Demonstrated

- **Single sequential task (`all_in_one`)**: Processes all work in one task, iterating through items sequentially; simplest, but not parallel.
- **Parallel tasks with join**: Queues multiple tasks simultaneously, but demonstrates why you should never call `result.get()` within a task (it raises a `RuntimeError`).
- **Chord pattern**: Uses `chord` to run parallel tasks and execute a callback after all complete; recommended for fan-out/fan-in workflows.
- **Starmap for parameter mapping**: Uses `starmap` to efficiently map function calls over parameter tuples.
- **Fine-grained parallelism**: Shows how to break work into smaller parallel tasks for maximum throughput.
### Performance and Scalability Insights

- **Database-intensive work**: Parallel task execution isn't always better; database query optimization often outperforms adding more concurrent tasks for DB-heavy operations.
- **Task granularity matters**: Breaking work into smaller tasks enables better parallelism but creates more overhead.
- **Concurrency configuration**: Use appropriate worker concurrency settings (the examples show `-c 8`) based on your workload.
### Canvas Primitives Usage

- **Chord**: Best for fan-out/fan-in patterns where you need to collect results after parallel execution.
- **Starmap**: Efficient for mapping a function over multiple parameter sets.
- **Group**: For simple parallel execution without result collection.
- **Chain**: For sequential task dependencies.
### Setup and Configuration Best Practices

- **Use RabbitMQ as broker**: The examples use RabbitMQ via Docker for reliable message delivery.
- **Environment isolation**: Uses `direnv` for clean Python virtual environment management.
- **Mac-specific configuration**: Sets `OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES` for fork-safety compatibility on macOS.
- **Proper worker scaling**: Configure concurrency based on workload characteristics.
### Performance Testing Approach

- **Comparative benchmarking**: The repository demonstrates multiple approaches to the same problem for performance comparison.
- **Real-world simulation**: Uses a voter registration scenario with 10,000 records to test scalability.
- **Timing analysis**: Encourages comparing timestamps to measure actual performance differences.
## Key Takeaway

The repository emphasizes that Canvas primitives should be used judiciously: parallel execution isn't always faster, especially for database-intensive operations. Choose between sequential, parallel, and Canvas-based approaches based on the specific nature of your workload and on performance testing results.




