Messaging: At-Least-Once Delivery

One of the guarantees behind messaging is guaranteed delivery. Most of us don't give any additional thought to the subject. Until we get into production. Delivery may be guaranteed, but have we ever considered that a message may be delivered more than once?

When I first started seriously investigating messaging solutions a number of years ago, one of the things that baffled me was how to ensure exactly-once delivery of messages. Virtually all messaging solutions provided at-least-once delivery. How could I prevent messages from being delivered more than once? And what about failure and failover of the message queuing infrastructure itself? Some of the cheaper or open-source messaging solutions that I investigated had some sort of quasi-failover, usually depending either upon a shared drive, which was itself a single point of failure, or upon some asynchronous gossip protocol to let other peers know which messages had been delivered successfully.

Now we live in a more cloudy environment, with cloud-based infrastructure starting to occupy the minds of development and operations teams alike. Many of the messaging solutions, such as Windows Azure Queues and Amazon's SQS, continue to offer guaranteed delivery, but like their traditional, non-cloud counterparts, they only offer it in the form of at-least-once delivery.

In the end, we have no way to prevent the infrastructure from delivering a message more than once. What can we do? As I have outlined in a previous post, we can apply a few idempotency patterns to make our application robust against the same message being delivered more than once.
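
To make the idea concrete, here is a minimal sketch of one such pattern: the consumer remembers the IDs of messages it has already handled and skips any repeat. It is not taken from the previous post. The `Message` and `IdempotentConsumer` names, the `message_id` field, and the in-memory set are all assumptions made for illustration, and the sketch assumes the sender stamps every message with a unique ID.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    # Hypothetical message shape; assumes the sender assigns a unique ID.
    message_id: str
    account: str
    amount: int


@dataclass
class IdempotentConsumer:
    # In-memory state for illustration only; a real consumer would keep
    # both of these in durable storage.
    balances: dict = field(default_factory=dict)
    processed_ids: set = field(default_factory=set)

    def handle(self, msg: Message) -> bool:
        """Apply the message once; return False if it was a duplicate."""
        if msg.message_id in self.processed_ids:
            # Already applied: acknowledge it again, but change nothing.
            return False
        self.balances[msg.account] = self.balances.get(msg.account, 0) + msg.amount
        self.processed_ids.add(msg.message_id)
        return True


# Simulate the infrastructure delivering the same message twice.
consumer = IdempotentConsumer()
deposit = Message(message_id="m-1", account="alice", amount=100)
results = [consumer.handle(deposit), consumer.handle(deposit)]
```

The second delivery is recognized and ignored, so the balance is credited only once. In a real system, the state change and the record of the processed ID would need to be committed together (for example, in the same database transaction); otherwise a crash between the two steps reopens the window for a duplicate.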

One of the primary reasons, if not the primary reason, that messaging systems offer at-least-once delivery rather than exactly-once delivery is that we run into the limitations expressed in the CAP theorem: the "nodes" of a distributed system cannot all share a globally consistent snapshot while also remaining available and partition tolerant. There is no unfailing "global brain".