Event Sourcing and CAP Requirements

One of the great things about CQRS is that, because all read responsibilities have been removed from the service layer, the mechanism for storage can be vastly different. We can continue to use a relational database but we also have the option to utilize alternative storage engines, such a document database or an object database.

From an application layer perspective, when we employ CQRS and utilize the event sourcing pattern we only require lookups based upon a unique key. We don't need to perform complex selects or joins or anything like that. This means that we only require a simple API from our persistent storage. Furthermore, because the event store when using event sourcing is append only, it eases some of our non-functional requirements.

The CAP Theorem states that between consistency, availability, and partition tolerance (which can generally be considered "scalability") we can only have two. A huge advantage of CQRS is that we are able to relax the consistency requirements of our read model in order to gain high availability and scalability. But what about the write model? We have very different requirements for our write model. Greg Young has already written a little bit about this, but I wanted to go into depth about the actual storage engines we could leverage for the write model for CQRS+event sourcing.

Inside the service layer (the transactional part of the system) we have exactly two requirements: consistency and high availability. We don't need partition tolerance (scalability). Wait a second…we don't need scalability?! Exactly. While we may want scalability for the system, we don't need the storage engine to provide us with scalability. When applying CQRS, we've broken up our system into business services and business components and we're (hopefully) utilizing asynchronous, one-way messaging, we've already partitioned our system. We've taken care of the partition tolerance at the application level. So we've done the work to break apart our system into many smaller pieces instead of one monolithic system.

Consistency and Availability

In the last few years there has been a surge in the number of storage engines, both relational and otherwise. During this time I have been looking into many of them, but it wasn't until I realized my exact requirements that I was able to narrow down the field.

While many popular document-oriented databases such as CouchDB and MongoDB are very, very appealing, they aren't able to provide the transactional guarantees (think ACID and synchronous replication) that we require for our domain model/transactional write model. Instead they are more BASE-oriented and focused. They relax consistency to provide availability and partition tolerance. Again, we're already partitioning our system ours manually along business service boundaries so we don't need that.

[UPDATE: The transactional guarantee I'm referring to is more of a distributed transaction between a message queue and our storage engine. I have written a post about how to work around this "deficiency" which enables us to utilize other storage engines.]

What options do I have? Well, there are a bunch. If you have identified any, please comment so I can append to this list:

Using and Abusing Relational Databases

For most every project we could actually use a relational database for the event store. That's crazy talk! If we've come all this way to break out from underneath the RDBMS mentality, why go running back? Well, we're not actually running back. We've come full circle, but we know the place for the first time. Instead of using an RDBMS by default, we've examined our exact requirements and now we're making informed, intelligent decisions about our needs versus what an RDBMS can offer us.

First we would like tool support. Check. Next we want support in our programming language of choice. Check. Solid vendor that won't disappear tomorrow? Check. What about lots of community help and support? Check. Free? Check.

The reason that so many are abandoning the traditional RDBMS product is to escape the impedance mismatch problem of object relational mapping and to gain the partition tolerance perspective at the expense of consistency. As mentioned previously, we don't have to worry about relationships when utilizing event sourcing. Further, we want consistency because we've already partitioned our system.

Don't Scare Your Operations Team

One of the other great things about using a relational database is that it won't scare your operations team (if you have one). There is strength in familiarity. Because many of the popular relational database products have a wealth of paid and community support, we have resource for when problems occur. With alternative storage engines you may be left on your own—and that's the last place you want to be when your system is having trouble in production.

Conclusion

So why not implement a simple transactional event store on top of a few database tables? Relational DBs can easily handle 1,000 writes/second—and that's per DB. We're allowed to have multiple instances. How many of your systems need 1,000 writes/second? Twitter does about 200 tweets/second. During the last presidential inauguration they spiked to about 5,000 tweets/second. We haven't even talked about sharding yet, e.g. vertically partitioning your domain such that users whose ID ends with 1-2 go to DB1, 3-4 go to DB2, etc.

If you have serious performance requirements and are concerned about latency and have really high (fast) SLAs on your system, you can easily swap out the relational event store and either write your own or utilize one of the above-mentioned storage engines, or vertically partition.

One last thought: Depending upon your domain, you may still be able to utilize MongoDB or some other eventually consistence event store for your transactional store if losing messages on occasion is okay in your particular domain. This decision can be made on a per business component-basis, so you can mix and match depending upon your specific requirements.