13 November 2013

CQRS Revisited

So, I have a project coming up that could really benefit from Messaging, CQRS, and Event Sourcing. In my first attempt at some of these things, I was going into a brownfield scenario that caused me to have to make a lot of unfavorable trade-offs. What follows is an attempt to work out the pieces for this new project in a greenfield scenario.

Things I'm settled on:

Command service

An MVC action receives the posted JSON and deserializes it into the appropriate .NET command. I carefully chose MVC action / JSON for its wide applicability. Almost any platform can send an HTTP POST.
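The real endpoint is an MVC action deserializing into .NET command classes, but the shape of the idea can be sketched in Python. The command class, registry, and field names below are all hypothetical:

```python
import json

# Hypothetical command class; the real project would have one .NET class
# per command, deserialized by the MVC action.
class DeactivateInventoryItem:
    def __init__(self, item_id, reason):
        self.item_id = item_id
        self.reason = reason

# Registry mapping the type name posted by the client to a command class.
COMMAND_TYPES = {"DeactivateInventoryItem": DeactivateInventoryItem}

def deserialize_command(body):
    """Turn a posted JSON envelope into a typed command instance."""
    envelope = json.loads(body)
    command_type = COMMAND_TYPES[envelope["type"]]
    return command_type(**envelope["payload"])
```

Any client that can issue an HTTP POST with a JSON body in this shape can drive the command service, which is the point of choosing that transport.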

I am considering hosting this as a Web API project in a Windows service to mitigate IIS configuration / maintenance. All the domain aggregate logic will live here. In the future, this could be partitioned by installing it on other servers and using a hashing function on the client, or by some other partitioning scheme.

So, once the command arrives at the command service, it needs to be passed to the appropriate handler. I have written a convention-based message deliverer for this purpose and to make handler maintenance less of a chore. I call it LocalBus. Essentially, a handler only has to implement an interface (IMessageHandler). When instantiated, LocalBus looks for any class that implements this interface. Then it looks for any void Handles(? message) methods on the class, creates cached delegates, and maps them to message types. After that, all you have to do is call LocalBus.Send(message) for single-handler messages or LocalBus.Publish(message) for multi-handler messages. It uses a thread and queue per handler to prevent concurrency problems and preserve ordering.
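LocalBus itself is .NET (interface scan plus cached delegates), but the mechanics translate to a small Python sketch. Here the handler convention is a method name prefix plus a parameter annotation, standing in for the IMessageHandler / Handles(? message) convention; all names are mine:

```python
import inspect
import queue
import threading

class LocalBus:
    """Convention-based in-process dispatcher, sketching the LocalBus idea:
    each handler gets its own queue + worker thread, so delivery to one
    handler is serialized (no concurrency problems, ordering preserved)."""

    def __init__(self, handlers):
        self._routes = {}  # message type -> [(handler queue, bound method)]
        for handler in handlers:
            q = queue.Queue()
            threading.Thread(target=self._drain, args=(q,), daemon=True).start()
            # Convention: any method whose name starts with "handle" and
            # takes one annotated message parameter subscribes the handler
            # to that message type.
            for name, method in inspect.getmembers(handler, inspect.ismethod):
                if not name.startswith("handle"):
                    continue
                params = list(inspect.signature(method).parameters.values())
                if len(params) != 1:
                    continue
                msg_type = params[0].annotation
                if msg_type is inspect.Parameter.empty:
                    continue
                self._routes.setdefault(msg_type, []).append((q, method))

    def _drain(self, q):
        while True:
            method, message = q.get()
            method(message)
            q.task_done()

    def send(self, message):
        """Single-handler delivery; throws unless exactly one handler."""
        routes = self._routes.get(type(message), [])
        if len(routes) != 1:
            raise RuntimeError(
                f"expected exactly 1 handler for {type(message).__name__}, "
                f"found {len(routes)}")
        q, method = routes[0]
        q.put((method, message))

    def publish(self, message):
        """Multi-handler delivery; zero handlers is fine."""
        for q, method in self._routes.get(type(message), []):
            q.put((method, message))
```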

For the command side, I will only be using LocalBus.Send, obviously. Send throws if there is not exactly one handler for a message.

The command handler will load the aggregate and call the appropriate method with the appropriate parameters for the command. At this point the domain can throw an error, which will get caught by the command service and be returned to the client. No events are saved in this case. If there is no error while running the aggregate method, then the handler will save the events, and the command service returns nothing.
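That flow can be sketched end to end with Python stand-ins (the aggregate, repository, and command names here are hypothetical, not the real .NET types): load the aggregate, run the domain method, and persist events only when no error was thrown.

```python
class InventoryItem:
    """Toy aggregate: raises a domain error on an invalid operation."""

    def __init__(self, item_id):
        self.id = item_id
        self.active = True
        self.uncommitted_events = []

    def deactivate(self, reason):
        if not self.active:
            raise ValueError("item is already deactivated")  # domain error
        self.active = False
        self.uncommitted_events.append(("ItemDeactivated", self.id, reason))

class InMemoryRepository:
    def __init__(self):
        self.saved_events = []
        self._items = {}

    def get_by_id(self, item_id):
        return self._items.setdefault(item_id, InventoryItem(item_id))

    def save(self, aggregate):
        # Persist and clear the aggregate's uncommitted events.
        self.saved_events.extend(aggregate.uncommitted_events)
        aggregate.uncommitted_events = []

class DeactivateInventoryItemHandler:
    def __init__(self, repository):
        self._repository = repository

    def handle(self, command):
        item = self._repository.get_by_id(command["item_id"])
        # If this throws, the command service catches the error and returns
        # it to the client; save() is never reached, so no events are saved.
        item.deactivate(command["reason"])
        self._repository.save(item)
```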

As for the domain infrastructure, I am starting with CommonDomain. I'm using the interfaces almost as-is, but there are several things I am doing differently with base classes. One optimization that I started using with my last project was to have the aggregate's state as a separate class and have all the Apply methods on that state object. So I went ahead and built that into the AggregateBase. This also makes snapshot generation automatic (just return the state object). I've made even more changes to SagaBase, but more on that later.
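The "state as a separate class" idea can be shown in a few lines. This is a Python sketch loosely modeled on CommonDomain's AggregateBase, not its actual API; the Apply methods live on the state object, so a snapshot is just that object:

```python
class AggregateBase:
    """Aggregate shell: routes events to Apply methods on a state object."""

    def __init__(self, state):
        self._state = state
        self.uncommitted_events = []
        self.version = 0

    def raise_event(self, event):
        self._apply(event)
        self.uncommitted_events.append(event)

    def _apply(self, event):
        # Convention: state.apply_<EventTypeName>(event).
        getattr(self._state, "apply_" + type(event).__name__)(event)
        self.version += 1

    def load_from_history(self, events):
        for event in events:
            self._apply(event)

    def snapshot(self):
        return self._state  # snapshot generation is automatic

# Hypothetical example aggregate.
class OrderShipped:
    pass

class OrderState:
    def __init__(self):
        self.shipped = False

    def apply_OrderShipped(self, event):
        self.shipped = True

class Order(AggregateBase):
    def __init__(self):
        super().__init__(OrderState())

    def ship(self):
        self.raise_event(OrderShipped())
```

Because all state mutation goes through the state object's Apply methods, replaying history and restoring from a snapshot exercise exactly the same code path.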

Event Sourcing

This project can derive a lot of benefit from event sourcing. I am looking at GetEventStore.com for the event store. It has a built-in subscription queue for event listeners (denormalizers, process managers, and external integrators). Its performance also looks to be quite good. The client is interested in a fail-over configuration, which it supports.

I plan on creating a template Windows service for listeners that I can just reuse for different purposes. It will be responsible for remembering its queue position by writing it to a file. So it can be shut down and restarted to pick up where it left off. This also allows the queue position to be reset to the beginning (handy for denormalizers) by changing the file.
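The checkpoint part of that template is simple enough to sketch. This is an illustrative Python version (the file name and atomic-rename detail are my own choices, not something the post specifies):

```python
import os

class Checkpoint:
    """Persists the last processed stream position to a file, so a
    listener service can resume after a restart, or be reset to the
    beginning by changing (or deleting) the file."""

    def __init__(self, path):
        self._path = path

    def read(self):
        if not os.path.exists(self._path):
            return 0  # no file yet: start from the beginning of the stream
        with open(self._path) as f:
            return int(f.read().strip())

    def write(self, position):
        # Write-then-rename so a crash mid-write can't leave a torn file.
        tmp = self._path + ".tmp"
        with open(tmp, "w") as f:
            f.write(str(position))
        os.replace(tmp, self._path)
```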

Denormalizers

I'm thinking of having two denormalizers initially, each in its own Windows service. One is for operational data, that is, data which is used by views to produce commands. The other is for reporting data. That way, I can make different deployment choices for these functions. For example, I can give the operational denormalizer higher priority, since the timeliness of its data updates is more important. It gives me choices, anyway.

I will also use the LocalBus here to take the messages received from the event store and locally publish them to all interested handlers inside the denormalizer's app domain. They will then perform whatever steps are needed to update their read models. I will probably have to set up the writes to notify when the data is actually written to disk in order to update the stream position.
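The ordering concern in that last sentence is worth pinning down: the read model write must complete before the stream position advances, or a crash in between loses an event. A minimal Python sketch with in-memory stand-ins (the real loop would sit over the event store subscription and LocalBus):

```python
class InMemoryCheckpoint:
    """Stand-in for the file-backed stream position."""

    def __init__(self):
        self.position = 0

    def read(self):
        return self.position

    def write(self, position):
        self.position = position

def deliver(events, handle, checkpoint):
    """Deliver (position, event) pairs past the stored position.

    The checkpoint advances only AFTER the handler has completed its
    read model write, so a crash between the two causes redelivery of
    that one event rather than loss of it."""
    for position, event in events:
        if position <= checkpoint.read():
            continue  # already processed before a restart
        handle(event)               # update the read model (write to disk)
        checkpoint.write(position)  # only then move the stream position
```

The trade-off is at-least-once delivery: after a crash, the last event may be handled twice, so the read model updates should be idempotent.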

Process Managers

In the case where there is logic that needs to be executed in response to events, I will have another listener service (which subscribes to the event store). In that service, LocalBus will deliver to PM handlers, each of which will load and run its business process. The common example is a shipping workflow: the PM looks for OrderReceived and PaymentApplied before it sends the ShipOrder command for a given OrderId. For the most part, I'm modeling these process managers as aggregates in case there needs to be more logic than a simple state machine. I have some things I'm still working out here, which I will go over later.
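The shipping example reduces to a small state machine. The OrderReceived / PaymentApplied / ShipOrder names come from the text above; everything else in this Python sketch is mine:

```python
class OrderReceived:
    def __init__(self, order_id):
        self.order_id = order_id

class PaymentApplied:
    def __init__(self, order_id):
        self.order_id = order_id

class ShipOrder:
    def __init__(self, order_id):
        self.order_id = order_id

class ShippingProcess:
    """Process manager: waits for both OrderReceived and PaymentApplied
    for an order, then issues ShipOrder exactly once."""

    def __init__(self, order_id):
        self.order_id = order_id
        self.received = False
        self.paid = False
        self.commands = []  # commands to dispatch, e.g. via LocalBus

    def handle(self, message):
        if isinstance(message, OrderReceived):
            self.received = True
        elif isinstance(message, PaymentApplied):
            self.paid = True
        self._maybe_ship()

    def _maybe_ship(self):
        already_shipped = any(isinstance(c, ShipOrder) for c in self.commands)
        if self.received and self.paid and not already_shipped:
            self.commands.append(ShipOrder(self.order_id))
```

Note the events can arrive in either order, and a duplicate event does not produce a second ShipOrder, which matters under at-least-once delivery from the event store.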

External Integrators

For this project, I anticipate there being some external integrators who want to listen to the event stream and construct their own data models. There will have to be a new stream projection created and appropriate security set up for them. Then I can just give them a URL and the listener template that I have already been using for denormalizers and process managers.

UI

So far we're looking at WPF for internal UIs and MVC or WebForms for external ones (depending on developer familiarity). I would probably do HTML5 if possible. The inputs and outputs of the UI are pretty simple: it takes in read model data and user interaction, and converts them into commands to send to the command service. (Not that this translation is easy.)

Read Models / Read Layer

I'm looking at storing the operational models in CouchBase. I really like CouchBase for the way it caches things in memory, making reads and writes fast. Internal programs are likely to directly read from CouchBase for speed. However, I will eventually be setting up a Read Layer for other types of clients. This read layer will also likely be an MVC action. In order to generalize (not have to maintain) this read layer, I am considering having the action take the database and view as part of the URL, so getting the correct data is simply a matter of constructing the right URL. Security would still need to be maintained on the read layer.
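The "database and view in the URL" idea amounts to one generic handler plus an exposure whitelist. A hypothetical Python sketch (the whitelist, names, and `fetch` callback stand in for the real MVC action and CouchBase view query; security is exactly the part that can't be generalized away):

```python
# Whitelist of which views each database exposes through the read layer.
# Real security/authorization would sit on top of this.
ALLOWED_VIEWS = {
    "operational": {"open-orders", "customers"},
}

def handle_read(database, view, fetch):
    """Generic read-layer action: the database and view come straight
    from the URL, so adding a read model needs no new endpoint code.
    `fetch(database, view)` stands in for the actual view query."""
    if view not in ALLOWED_VIEWS.get(database, set()):
        raise PermissionError(f"{database}/{view} is not exposed")
    return fetch(database, view)
```

Getting the correct data is then simply a matter of constructing the right URL, e.g. `/read/operational/open-orders`.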

Eventually, I anticipate needing a SQL database for ad-hoc and BI purposes.

Now for the stuff I'm unsure about:

Process Manager Issues

My initial thought on saving the state of a PM is to just save the events that the process manager has seen. However, assuming I saved these in the event store, I would have to save to a separate event store instance, since they are copies of events that were already published by an aggregate stream. Or else, I would need to set up a partition to separate aggregate events from the PM events and have all listeners listen only to the aggregate partition. Or I could save state in a completely different way (a different database). None of these options seems very appealing to me.

Then there is the issue of timeout logic (e.g. if payment for the order is not received after 24 hrs, cancel the order). My initial thought is that I will have the PM handler listen for a timeout message and call the appropriate method on the PM. This part is no problem, since LocalBus can deliver an arbitrary message inside the AppDomain. One (solvable) problem I haven't yet worked out is how to position the timer in the logic. And there is the issue of storing the Timer's state, with the same implications as PM storage. This seems like a good case for storing current state, since there are only two possible states of the message (delivered or not), and timestamps can be recorded for both so nothing is lost. So then do I introduce another kind of database (downvote from admins), store to file (downvote from developers), etc.?
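That two-state timer record is small enough to sketch. A hypothetical Python version (field names mine; the storage question it would be persisted into is exactly what's still open):

```python
import time

class TimeoutMessage:
    """Timer state for a PM timeout: only two states (delivered or not),
    with timestamps recorded for both, so nothing is lost."""

    def __init__(self, process_id, due_at):
        self.process_id = process_id
        self.due_at = due_at
        self.scheduled_at = time.time()
        self.delivered_at = None  # None until delivered

    @property
    def delivered(self):
        return self.delivered_at is not None

    def try_deliver(self, now, deliver):
        """Deliver at most once, and only when due. `deliver` would hand
        the message to LocalBus for the PM handler. Returns True if the
        message was delivered on this call."""
        if self.delivered or now < self.due_at:
            return False
        deliver(self)
        self.delivered_at = now
        return True
```

On restart, the service would reload undelivered records and resume polling them, which is why the state has to live somewhere durable.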

I could hook into an external timer. But this is another configuration point (admin downvote). And I would have to host a comm stack of some kind on the PM service in order to receive a callback from the timer service, and secure it from other types of access. And then there's learning the ins and outs of the particular timer framework. Seems like overkill when the timer part would otherwise be pretty simple.

It appears that I am headed toward including some sort of embedded database with the PM service for PM and Timer state storage.

That's all I can think of for now.