Event Sourcing: Backup and Archiving Strategies

One question that comes up now and again concerns archiving and backup strategies for an event store. The main worry is this: if we keep every event ever generated, won't we run out of disk space? That's a great question, and one that deserves a bit more consideration.

First of all, one of the primary reasons for keeping everything is the incredible business value offered by the events. Imagine being able to replay the events and view them in a different way in order to extract new and interesting information. For example, we might want to know how long the average customer took to re-add a particular item to their cart after removing it during November and December. The business stakeholders didn't know they wanted to see that kind of data. But because we've got the events, we can easily create a report and run our events through it to populate it with data. In short, space is cheap, and data is valuable.
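As a rough illustration of that cart report, here is a minimal sketch of a projection that replays events in order. The event names (`ItemAdded`, `ItemRemoved`), the `CartEvent` fields, and the function name are all hypothetical; a real event store would define its own shapes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CartEvent:
    kind: str          # "ItemAdded" or "ItemRemoved" (hypothetical names)
    customer_id: str
    item_id: str
    at: datetime


def average_readd_delay(events, item_id, months=(11, 12)):
    """Replay events in order and average the remove -> re-add delay for one item."""
    removed_at = {}    # customer_id -> time of the removal still awaiting a re-add
    delays = []
    for e in events:
        # Ignore other items and events outside the months of interest.
        if e.item_id != item_id or e.at.month not in months:
            continue
        if e.kind == "ItemRemoved":
            removed_at[e.customer_id] = e.at
        elif e.kind == "ItemAdded" and e.customer_id in removed_at:
            delays.append(e.at - removed_at.pop(e.customer_id))
    if not delays:
        return None    # nobody re-added the item in the period
    return sum(delays, timedelta()) / len(delays)
```

Nothing in the stored events had to anticipate this question; the report is just a new function run over old data.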

There are numerous other benefits to having an event store and keeping all of the events, but here we want to consider more fully how we can back up our event store and then archive it, if necessary.

Backup

The great thing about an event store is that it's append only. This means that all events, once committed, are effectively immutable. And immutable data is incredibly easy to back up and replicate. In order to perform a backup, just grab every event that has been generated since your last backup. In essence, every backup becomes an incremental backup.
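The idea of copying only the events appended since the last backup can be sketched as follows. This is a minimal sketch, not a production tool: it assumes each event is a dict carrying a monotonically increasing `seq` number, that `events` is already ordered by `seq`, and that a small checkpoint file records the last sequence number backed up. The function name and file layout are made up for illustration.

```python
import json
from pathlib import Path


def backup_new_events(events, backup_dir, checkpoint_file="checkpoint.txt"):
    """Write events newer than the checkpoint to a new segment file; return the count."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    checkpoint = backup_dir / checkpoint_file
    last_seq = int(checkpoint.read_text()) if checkpoint.exists() else 0

    # Committed events never change, so only the tail of the log is new.
    new_events = [e for e in events if e["seq"] > last_seq]
    if not new_events:
        return 0

    first, last = new_events[0]["seq"], new_events[-1]["seq"]
    segment = backup_dir / f"events-{first:012d}-{last:012d}.jsonl"
    segment.write_text("".join(json.dumps(e) + "\n" for e in new_events))

    # Advance the checkpoint only after the segment has been written.
    checkpoint.write_text(str(last))
    return len(new_events)
```

Because earlier segments are never rewritten, restoring is a matter of reading the segment files back in sequence order.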

Archiving

The concept of archiving is about taking events from expensive, fast storage and moving them to inexpensive but highly replicated slow storage. In other words, the events remain online and available, but the latency is much higher. Because these archived events have been replicated across inexpensive, slower storage, it is likely, depending upon how many replicas you have, that they take up more total storage than before.

As a rule of thumb, you can "archive" anything prior to the last snapshot plus one hour. That gives all your view models plenty of time to catch up with the latest version of the aggregate, so that we don't have to load events prior to the snapshot to determine whether conflicts exist. (This concept alone deserves another post entirely.)
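One possible reading of that rule of thumb, sketched in code: events covered by the last snapshot become eligible for archiving once the snapshot is at least an hour old. That interpretation is an assumption, as are the per-aggregate version numbers and the function name used here.

```python
from datetime import datetime, timedelta

GRACE = timedelta(hours=1)  # the "one hour" from the rule of thumb


def archivable_versions(event_versions, snapshot_version, snapshot_taken_at, now, grace=GRACE):
    """Return the event versions that may move to slow storage (assumed reading of the rule)."""
    if now - snapshot_taken_at < grace:
        return []  # view models may still be catching up; keep everything hot
    # Only events already folded into the snapshot are candidates.
    return [v for v in event_versions if v <= snapshot_version]
```

With a snapshot at version 3 taken two hours ago, versions 1 through 3 would be eligible, while anything after the snapshot stays on fast storage.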

The critical part in all of this is that the events remain queryable and available even when they're archived.
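To make "queryable even when archived" concrete, here is a small in-memory sketch of a two-tier store. The class and method names are hypothetical, and the two dicts merely stand in for fast and slow storage; the point is only that a read spans both tiers transparently.

```python
class TieredEventStore:
    """Toy store: 'hot' stands in for fast storage, 'cold' for the archive."""

    def __init__(self):
        self.hot = {}   # stream_id -> list of (version, event)
        self.cold = {}  # same shape, archived events

    def append(self, stream_id, event):
        """Append to the hot tier; versions keep counting across both tiers."""
        archived = len(self.cold.get(stream_id, []))
        live = self.hot.setdefault(stream_id, [])
        version = archived + len(live) + 1
        live.append((version, event))
        return version

    def archive(self, stream_id, up_to_version):
        """Move events with version <= up_to_version from hot to cold."""
        live = self.hot.get(stream_id, [])
        move = [(v, e) for v, e in live if v <= up_to_version]
        self.cold.setdefault(stream_id, []).extend(move)
        self.hot[stream_id] = [(v, e) for v, e in live if v > up_to_version]

    def read(self, stream_id, from_version=1):
        """Yield events in version order, reading the archive first, then hot storage."""
        for tier in (self.cold, self.hot):
            for version, event in tier.get(stream_id, []):
                if version >= from_version:
                    yield version, event
```

A caller that starts from a recent version only touches the hot tier in practice, so the higher latency of the archive is paid only by full replays.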