Monthly Archives: February 2010

Event Sourcing and CAP Requirements

One of the great things about CQRS is that, because all read responsibilities have been removed from the service layer, the mechanism for storage can be vastly different.  We can continue to use a relational database but we also have the option to utilize alternative storage engines, such a document database or an object database.

From an application layer perspective, when we employ CQRS and utilize the event sourcing pattern we only require lookups based upon a unique key.  We don’t need to perform complex selects or joins or anything like that.  This means that we only require a simple API from our persistent storage.  Furthermore, because the event store when using event sourcing is append only, it eases some of our non-functional requirements.

The CAP Theorem states that between consistency, availability, and partition tolerance (which can generally be considered “scalability”) we can only have two.  A huge advantage of CQRS is that we are able to relax the consistency requirements of our read model in order to gain high availability and scalability.  But what about the write model?  We have very different requirements for our write model.  Greg Young has already written a little bit about this, but I wanted to go into depth about the actual storage engines we could leverage for the write model for CQRS+event sourcing.

Inside the service layer (the transactional part of the system) we have exactly two requirements: consistency and high availability.  We don’t need partition tolerance (scalability).  Wait a second…we don’t need scalability?!  Exactly.  While we may want scalability for the system, we don’t need the storage engine to provide us with scalability.  When applying CQRS, we’ve broken up our system into business services and business components and we’re (hopefully) utilizing asynchronous, one-way messaging, we’ve already partitioned our system.  We’ve taken care of the partition tolerance at the application level.  So we’ve done the work to break apart our system into many smaller pieces instead of one monolithic system.

Consistency and Availability

In the last few years there has been a surge in the number of storage engines, both relational and otherwise.  During this time I have been looking into many of them, but it wasn’t until I realized my exact requirements that I was able to narrow down the field.

While many popular document-oriented databases such as CouchDB and MongoDB are very, very appealing, they aren’t able to provide the transactional guarantees (think ACID and synchronous replication) that we require for our domain model/transactional write model.  Instead they are more BASE-oriented and focused.  They relax consistency to provide availability and partition tolerance.  Again, we’re already partitioning our system ours manually along business service boundaries so we don’t need that.

[UPDATE: The transactional guarantee I’m referring to is more of a distributed transaction between a message queue and our storage engine.  I have written a post about how to work around this “deficiency” which enables us to utilize other storage engines.]

What options do I have?  Well, there are a bunch.  If you have identified any, please comment so I can append to this list:

Using and Abusing Relational Databases

For most every project we could actually use a relational database for the event store.  That’s crazy talk!  If we’ve come all this way to break out from underneath the RDBMS mentality, why go running back?  Well, we’re not actually running back.  We’ve come full circle, but we know the place for the first time.  Instead of using an RDBMS by default, we’ve examined our exact requirements and now we’re making informed, intelligent decisions about our needs versus what an RDBMS can offer us.

First we would like tool support.  Check.  Next we want support in our programming language of choice.  Check.  Solid vendor that won’t disappear tomorrow?  Check.  What about lots of community help and support?  Check.  Free?  Check.

The reason that so many are abandoning the traditional RDBMS product is to escape the impedance mismatch problem of object relational mapping and to gain the partition tolerance perspective at the expense of consistency.  As mentioned previously, we don’t have to worry about relationships when utilizing event sourcing.  Further, we want consistency because we’ve already partitioned our system.

Don’t Scare Your Operations Team

One of the other great things about using a relational database is that it won’t scare your operations team (if you have one).  There is strength in familiarity.  Because many of the popular relational database products have a wealth of paid and community support, we have resource for when problems occur.  With alternative storage engines you may be left on your own—and that’s the last place you want to be when your system is having trouble in production.

Conclusion

So why not implement a simple transactional event store on top of a few database tables?  Relational DBs can easily handle 1,000 writes/second—and that’s per DB.  We’re allowed to have multiple instances.  How many of your systems need 1,000 writes/second?  Twitter does about 200 tweets/second.  During the last presidential inauguration they spiked to about 5,000 tweets/second.  We haven’t even talked about sharding yet, e.g. vertically partitioning your domain such that users whose ID ends with 1-2 go to DB1, 3-4 go to DB2, etc.

If you have serious performance requirements and are concerned about latency and have really high (fast) SLAs on your system, you can easily swap out the relational event store and either write your own or utilize one of the above-mentioned storage engines, or vertically partition.

One last thought: Depending upon your domain, you may still be able to utilize MongoDB or some other eventually consistence event store for your transactional store if losing messages on occasion is okay in your particular domain.  This decision can be made on a per business component-basis, so you can mix and match depending upon your specific requirements.

Event Sourcing and the Event Pipeline

Greg recently posted on the benefits of using event sourcing instead of traditional state-based persistence mechanisms.  It’s a great article and definitely worth reading multiple times.  But there was one paragraph in particular that caught my attention that gave me one of those great ah-ha moments.

One of those benefits is that we can avoid having to use a 2pc [two-phase commit] transaction between the data model and the message queue (if we are using one)… the reason for this is that the event storage itself is also a queue, we could be trailing the event storage to place the items on the queue (or directly use the event storage as a queue).

The reason this is so important is because it gives tremendous insight into the actual implementation of Greg’s architecture.  In an article from a few days ago, he said the following:

Firstly when one uses Event Sourcing with CQRS one can avoid the need for 2pc between the data store and the message queue that you are publishing messages to. This is because the Event Store can be used as both an event store and a queue. Conceptually the Event Store is an infinitely appending file. One can just have a read location that gets updated (chasing the end of the file) and presto you have a queue as well. A side benefit from this is that you also end up with only one disk write for both your storage and your durable queue, this may not initially seem like a big win but when you are trying to build performant systems this can help a lot!

I should mention that I understand the event store and its benefits, but the thing that I hadn’t considered is that we would publish the committed events in a separate transaction from the commit to the event store.

What’s really cool about this is that we can have another process watching the event store which can be publishing committed transactions to the bus so that our read model, or other services, can become aware of those updates.

There are a few additional insights that I’m digging out of the above paragraphs in light of what Greg has said in some of his early material.  I’ll be publishing more soon.

CQRS: An Introduction for Beginners

I had the chance to present CQRS to the local .NET user group this week.  The speaker had a last minute conflict so I volunteered to step up and speak.  I only had a few hours to prepare.  So what did I do?  I had previously watched several of Udi’s CQRS presentations—one in Australia the other in London.  I really like Udi’s style and the way he communicates his message, so I took (stole) the most basic elements and watered them down for a 30-minute presentation.  I take no credit for this—everything in my presentation can be attributed to both Udi Dahan and Greg Young.

I will most likely be presenting at an upcoming code camp as well.  If so, I’ll record that too, but this time I’ll be able to give things a bit more personal spin.

http://vimeo.com/moogaloop.swf?clip_id=9573973&server=vimeo.com&show_title=1&show_byline=1&show_portrait=0&color=&fullscreen=1

CQRS – An Overview for Beginners from Jonathan Oliver on Vimeo.

CQRS: An Introduction for Beginnershttp://static.slidesharecdn.com/swf/ssplayer2.swf?doc=cqrs-100219063528-phpapp02&stripped_title=cqrs-an-introduction-for-beginners

Domain Models, Aggregate Roots, and Lookup Tables

A problem that we experienced recently was that of dealing with and wrapping lookup tables.  By definition, a domain model is supposed to be isolated from the outside world.  It shouldn’t have any concept of data access or persistence.  Because a domain model is isolated and we don’t want to inject services or objects into our domain object, how do we go about querying lookup tables?

In our particular domain, we have about a dozen industry-mandated lookup tables that we query 30+ times in various ways.  Each lookup depends upon the previous one.  Normally a transaction script processing pattern would be ideal for this type of operation, but we have significant and complex logic surrounding each lookup and this logic varies depending upon shifting industry standards.  Complex logic that changes often…sounds like a great place for a domain model.  Except for the lookup tables.

The strongest possibility was to wrap the lookup tables in a domain service and use double dispatch by submitting the appropriate service to the aggregate.  Ever since Udi’s SOA course, I’ve lost my taste for double dispatch on domain objects.  Where does that leave us?

We were stuck until we re-read Udi’s Domain Events Salvation post.  The answer was to raise a domain event indicating that a certain step had been completed.  Then, another object, perhaps a domain service, would subscribe to that event and perform the appropriate lookup.  Once the lookup was complete it would raise another event (LookupPerformedEvent) which would contain the results of the lookup.  Finally, another event handler would subscribe the LookupPerformedEvent and route that information back into the domain.

The beauty of this mechanism is that the domain is blissfully unaware of the external service being provided by the lookup.  It simply knows that it gets the data that it needs and continues operating.  Furthermore, the domain has no concept of physical distribution or location of the lookup operations.  The lookup tables and lookups could be performed in process or they could be performed on a remote system where the various LookupPerformedEvent messages would then be routed back to the domain through the appropriate message handler.

Aggregate Roots and Shared Data

When working on a new domain model a few days ago, we ran into an interesting issue.  What happens when an aggregate has shared data?  You know, data that is common or shared between aggregate roots?  This is a question that also recently came up on the Domain Driven Design Yahoo Group, we had generally the same question but independent of the Yahoo Group.

For purposes of example, we will use the canonical sales/order model.  Let’s suppose that there is a business rule which states that unconfirmed accounts may only place a maximum of two orders which combined total no more than $100 in order to limit fraud.  It’s a simple enough business requirement, right?  But there is a small challenge, this particular business requirement seems to span multiple aggregate roots which each contain additional business rules surrounding the acceptance of an order.

So what do you do?  First of all, from a technical perspective, it can be solved several different ways.  But the reason that we’re working with a domain model is because we want the benefits of a domain model.  In order to get those benefits we have to follow the rules of a domain model.

The discussion on the Yahoo Group surrounded that of using double-dispatch techniques in order to retrieve the data needed by the aggregate root.  Ever since attending Udi’s SOA course in Austin last December, I generally dislike using double-dispatch techniques within the domain model.

Another simple technique is to have a shared entity object between the aggregates.  While it works and NHibernate will let you do this, it’s generally a bad idea because an aggregate is a unit of consistency.  It is atomic—all or nothing.  Having a shared entity violates that atomicity and autonomy of an aggregate.  While technically speaking we can do this, we probably shouldn’t as it violates one of the fundamental principles of DDD.

It may sound silly but at this point we were a little bit stuck.  When using event sourcing to rebuild your Order aggregate, it has no concept of the combined order total of all orders for a customer.  It only cares about itself and not other orders.  So how do we enforce the business requirement?  We don’t like the idea of double dispatch and we don’t want to have some kind of shared entity.  What can we do?

The answer is to introduce another aggregate—an OrderTotals aggregate.  When the PlaceOrderCommand is dispatched to system, it is handled by two aggregates within the service layer.  The first aggregate, OrderTotals, ensures that the command can be processed.  If not, it returns a fault indicating why.  If we receive a fault, we stop processing.  Otherwise, the Order aggregate receives the command and processes it.  When finished, it publishes the OrderAccepted event (to which the OrderTotals aggregate subscribes) and the transaction commits.

All in all, it’s fairly simple.  This idea applies beyond situations where event sourcing is being used to rebuild the domain entities by replaying the events.  It addresses the fundamental issue of how you deal with rules that span multiple aggregate roots.

CQRS Presentation Tonight

I will be giving a free presentation tonight at 6:00 PM on CQRS.  I’m going to give some historical background on CQRS and I will show how it resolves a lot of the pain and friction that developers experience when building systems.  Further, I’m going to talk about how it can be used to scale a system out to 10,000+ transactions per second.  Again, the event is free and there will be pizza provided.

Details:

February 17 at 6:00 PM

350 E 1175 S, Provo, UT

Send me a tweet if you have questions.