
r/django
Most Celery tutorials cover the basics, but they rarely mention what can go wrong when publishing a message.
A common pattern I've seen across teams: a task gets queued, something silently fails on the publishing side, and the debugging session starts with no traces and no clear recovery process.
After running into these issues repeatedly, I mapped out six stages of reliability for Celery/RabbitMQ setups:
- Best Effort: fire-and-forget, at-most-once delivery, tasks can vanish silently
- Transactional Boundary: wrapping commands in atomic transactions to prevent out-of-sync data
- Publishing on Commit: using `delay_on_commit` so tasks aren't queued before the transaction succeeds
- Publisher Confirms: getting actual confirmation that the broker received and persisted the message
- Outbox Pattern: persisting intent to the database first, dispatching later, giving you at-least-once delivery
- Clusters and Quorum Queues: replication strategies and where classic queues can still lose messages
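
To make the Outbox Pattern stage a bit more concrete, here is a minimal sketch of the idea. It deliberately uses only `sqlite3` from the standard library instead of Django and Celery so it runs anywhere. The table names, `init_db`, `place_order`, `relay_outbox` and the `publish` callable are all hypothetical and only there for illustration, not taken from the write-up.

```python
import json
import sqlite3


def init_db(conn):
    # Hypothetical schema: one business table plus an outbox table.
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY, task TEXT NOT NULL, payload TEXT NOT NULL, "
        "sent INTEGER NOT NULL DEFAULT 0)"
    )


def place_order(conn, item):
    # One transaction: the order row and the intent to run a task
    # are committed together or rolled back together.
    with conn:
        cur = conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        order_id = cur.lastrowid
        conn.execute(
            "INSERT INTO outbox (task, payload) VALUES (?, ?)",
            ("send_confirmation", json.dumps({"order_id": order_id})),
        )
    return order_id


def relay_outbox(conn, publish):
    # Dispatch pending rows oldest first. A row is marked as sent only
    # after publish() returns, so a crash in between leads to a duplicate
    # publish on the next run (at-least-once), never to a lost task.
    rows = conn.execute(
        "SELECT id, task, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    dispatched = 0
    for row_id, task, payload in rows:
        try:
            publish(task, json.loads(payload))
        except Exception:
            break  # leave this row and later ones pending for the next run
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
        dispatched += 1
    return dispatched
```

In a real setup, `publish` would presumably wrap whatever actually talks to the broker, and `relay_outbox` would run on a schedule or in a small worker loop. Because a row can be published more than once, the consumer has to tolerate duplicates.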
Full write-up here if useful: https://vladogir.substack.com/p/your-background-tasks-are-silently?r=157avd
Next, I plan to cover the consumer side: idempotency, monitoring and observability.
What do you think I missed?
u/Fragrant_Brush_4161 — 3 days ago