21 August 2012

CQRS Experiences So Far

So, I'm always looking for new tech, methodologies, anything that can help me do things better. The latest approach I've gone for is CQRS; that is, Command-Query Responsibility Segregation.

The first hurdle is in figuring out exactly what CQRS is. As defined, it's pretty simple: Commands are on one object, Queries are on another. However, much of the information out there is intermixed with a lot of other architectural concepts and strategies that fit well with CQRS, like messaging and event sourcing. It gets pretty tough to see the big picture with so many concept details in the mix.
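
In code terms, the bare definition boils down to something like this. (A deliberately minimal illustration with made-up names, not code from the actual app.)

    using System;

    // Commands live on one object and only mutate state...
    public class CustomerCommands
    {
        public void ChangeAddress(Guid customerId, string newAddress)
        {
            // ...write-side logic goes here; nothing is returned.
        }
    }

    // ...while queries live on another object and only read state.
    public class CustomerQueries
    {
        public string GetAddress(Guid customerId)
        {
            // ...read-side logic goes here; nothing is mutated.
            return "123 Example St.";
        }
    }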

Eventually, I settled (perhaps wrongly) on a CQRS/ES/Messaging architecture for an enterprise web app. I say wrongly because the app is already very data-centric, and the users are accustomed to it being that way. However, the data-centric nature of the app had some major drawbacks, such as inscrutable business logic code, business logic being pushed to the UI (due to complex inter-relations and validation between fields on the record), and so forth.

The first hurdle I ran into was how to communicate commands and events. For this web app, a typical message bus (MSMQ or RabbitMQ) wasn't needed, at least not for now. Not only that, but this app is already a configuration nightmare that I didn't need to complicate further. I experimented with receiving commands through WCF (which I abandoned due to the difficulty of calling from Javascript), then .NET 4.5 Web API (which I abandoned because it's in beta and I ran into technical problems with it).

For commands, I settled on using vanilla MVC 3 controllers with a custom Model Binder to convert POST content into .NET command objects. (Custom because the MVC action argument is an interface, and the commands have neither a default constructor nor public setters, so the only way to instantiate the object is through reflection.) On the other side of the coin, I wrote a converter to turn all the .NET commands into Javascript objects so they are available to the client in a Javascript library, complete with conversion hints (like the type's .NET full name). The library also includes some basic shortcut methods to execute commands, so the client doesn't manually call $.ajax(...). Instead, it calls Command.Execute(myCommand) or Command.Validate(myCommand). I tried to use T4 to generate the library from the .NET command objects, but found it inadequate (it locks the DLL, among other things). Instead, I have a separate controller which generates and returns the Javascript library when called. All this sounds complicated, but it actually wasn't that time-consuming to develop. The time-consuming part was researching and evaluating WCF, Web API, and T4 before abandoning them.
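
For the curious, the heart of that binder looks roughly like this. This is a simplified sketch rather than the production code: it assumes the Javascript library posts the command's .NET full name in a hypothetical __type field, and that each command has a single constructor whose parameter names match the posted fields.

    using System;
    using System.Linq;
    using System.Web.Mvc;

    public class CommandModelBinder : IModelBinder
    {
        public object BindModel(ControllerContext controllerContext,
                                ModelBindingContext bindingContext)
        {
            var form = controllerContext.HttpContext.Request.Form;

            // The Javascript library posts a conversion hint containing the
            // command's .NET full name (the "__type" field name is hypothetical).
            var commandType = Type.GetType(form["__type"], throwOnError: true);

            // Commands have no default constructor and no public setters, so
            // reflecting over the declared constructor is the only way in.
            var ctor = commandType.GetConstructors().Single();
            var args = ctor.GetParameters()
                           .Select(p => Convert.ChangeType(form[p.Name], p.ParameterType))
                           .ToArray();

            return ctor.Invoke(args);
        }
    }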

For events, I also didn't need a message bus for now, so I settled on an in-memory bus of sorts. Rather than having event handler applications subscribe to the bus, I have an event publisher object that searches for event handlers using reflection (much the same way that my command dispatcher searches for command handlers) and adds their actions to the thread pool when an event is received.
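
Here is a rough sketch of that publisher, under the assumption that handlers implement a generic interface. (The IHandleEvent&lt;T&gt; name is mine for illustration, not necessarily what the app uses.)

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Reflection;
    using System.Threading;

    public interface IHandleEvent<in TEvent>
    {
        void Handle(TEvent @event);
    }

    public class EventPublisher
    {
        private readonly List<object> handlers;

        public EventPublisher(Assembly assembly)
        {
            // Reflection scan: instantiate every concrete type that
            // implements IHandleEvent<> for some event type.
            handlers = assembly.GetTypes()
                .Where(t => !t.IsAbstract && t.GetInterfaces().Any(IsHandlerInterface))
                .Select(t => Activator.CreateInstance(t))
                .ToList();
        }

        public void Publish(object @event)
        {
            foreach (var handler in handlers)
            {
                // Find the IHandleEvent<T> implementation matching this event.
                var match = handler.GetType().GetInterfaces()
                    .FirstOrDefault(i => IsHandlerInterface(i) &&
                        i.GetGenericArguments()[0].IsInstanceOfType(@event));
                if (match == null) continue;

                // Queue the handler's action on the thread pool.
                var h = handler;
                ThreadPool.QueueUserWorkItem(_ =>
                    match.GetMethod("Handle").Invoke(h, new[] { @event }));
            }
        }

        private static bool IsHandlerInterface(Type i)
        {
            return i.IsGenericType &&
                   i.GetGenericTypeDefinition() == typeof(IHandleEvent<>);
        }
    }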

I can't do full event sourcing just yet with this application since it has an existing database that is currently in use, and I'm not able to take 3 years to do a complete rework. Instead, I have to integrate the new stuff while retaining the existing functionality of whatever hasn't yet been replaced. What I did do was, in my domain repository, convert the database information (in DataRow format) into one big Created event so that my domain objects don't have any special "load" code paths. The constructor just takes events as arguments and replays them. Hopefully that will make moving to full event sourcing seamless on the domain side.
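
Sketched out, that load path looks something like this. (The Order aggregate, its event, and the repository are invented for illustration.)

    using System;
    using System.Collections.Generic;
    using System.Data;

    public class OrderCreated
    {
        public OrderCreated(Guid id, string customer)
        {
            Id = id;
            Customer = customer;
        }

        public Guid Id { get; private set; }
        public string Customer { get; private set; }
    }

    public class Order
    {
        public Guid Id { get; private set; }
        public string Customer { get; private set; }

        // The only way to build an aggregate is to replay events, whether
        // they come from a real event store later or from legacy tables today.
        public Order(IEnumerable<object> history)
        {
            foreach (var @event in history)
                Apply((dynamic)@event);
        }

        private void Apply(OrderCreated e)
        {
            Id = e.Id;
            Customer = e.Customer;
        }
    }

    public class OrderRepository
    {
        // The legacy row is translated into one big Created event, so the
        // aggregate never needs a special "load" code path.
        public Order GetById(DataRow row)
        {
            var created = new OrderCreated((Guid)row["Id"], (string)row["Customer"]);
            return new Order(new object[] { created });
        }
    }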

The read model is also implemented as an MVC controller, with a GET action for each read model. I tried to do a WCF Data Service (OData), but our database was just too non-standard to generate an Entity Data Model without a lot of manual XML manipulation. And if I was going to have to write the data classes by hand anyway, I didn't see the point of being constrained to whatever WCF Data Services requires/provides.
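
So each read model ends up as little more than a GET action that runs a hand-written query and returns JSON. Something like this hypothetical action:

    using System;
    using System.Collections.Generic;
    using System.Web.Mvc;

    public class ReadModelController : Controller
    {
        // One GET action per read model; this one is invented for illustration.
        public ActionResult OpenOrders(Guid customerId)
        {
            IEnumerable<object> results = QueryOpenOrders(customerId);
            return Json(results, JsonRequestBehavior.AllowGet);
        }

        private IEnumerable<object> QueryOpenOrders(Guid customerId)
        {
            // Stubbed out; the real app runs a hand-rolled database query here.
            return new object[0];
        }
    }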

The UI has its own ecosystem, since it uses HTML5 / Javascript and MVVM. But basically, its job is to take the user's choices and make a command out of them. The UI uses various read models to guide the user's choices (e.g. to populate a drop-down list).

So here is an end-to-end look at what happens in this system.
  • The UI calls various read models and ultimately generates a command and sends it to the command API (via a jQuery $.ajax POST with the Javascript command object as the 'data' parameter).
  • The command API constructs the .NET command object from the POST data (using a custom model binder), then sends the command to the dispatcher.
  • The dispatcher looks up the handler for the given command and then calls it with the command. (See the dispatcher sketch after this list.)
    • Note, the command also has a validate method which is called before the command handler is exercised. If the validate method returns an error, execution halts and the validation error is passed back to the client. Otherwise command execution proceeds.
    • Note, domain exceptions are captured here and returned to the client as command errors with the domain exception message.
  • The command handler loads the necessary domain aggregate and calls the appropriate method(s) on it.
  • The aggregate performs business logic and (perhaps) generates events.
  • The command handler polls the aggregate for events and then passes those events to the event dispatcher. (No event store to pass them to yet.)
  • The event dispatcher looks up the handlers for those events and puts them on the thread pool to run.
  • At this point, command execution returns to the client as successful since there were no domain exceptions along the way.
  • Example event handler execution process:
    • Database
      • The database event handler runs. It handles ALL events. It looks for an embedded SQL resource file that matches the event name. If no resource file is found, the event is ignored. (A sketch of this handler follows the list.)
      • If a matching resource is found, its contents are loaded as the query text.
      • The event's properties are extracted as name/value pairs and used as parameters to the query.
      • The query is executed.
      • Note: In this way, the only thing necessary to add a database integration for a new event is to add a new .sql embedded resource file to the solution with the event's name. The .sql file contains the query to run when the event is handled.
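
To make the validate-then-handle flow in the middle of that list concrete, here is a condensed sketch of the dispatcher. (The types are minimal stand-ins for whatever the real app defines.)

    using System;

    public interface ICommand
    {
        // Returns an error message, or null when the command is valid.
        string Validate();
    }

    public class DomainException : Exception
    {
        public DomainException(string message) : base(message) { }
    }

    public class CommandResult
    {
        public bool Succeeded { get; private set; }
        public string Error { get; private set; }

        public static CommandResult Success()
        {
            return new CommandResult { Succeeded = true };
        }

        public static CommandResult Fail(string error)
        {
            return new CommandResult { Error = error };
        }
    }

    public class CommandDispatcher
    {
        public CommandResult Dispatch(ICommand command)
        {
            // Validation runs before the handler; a failure short-circuits.
            var validationError = command.Validate();
            if (validationError != null)
                return CommandResult.Fail(validationError);

            try
            {
                // Handler lookup is reflection-based in the real app; stubbed here.
                HandlerFor(command).Invoke(command);
                return CommandResult.Success();
            }
            catch (DomainException ex)
            {
                // Domain exceptions come back as command errors with the message.
                return CommandResult.Fail(ex.Message);
            }
        }

        private Action<ICommand> HandlerFor(ICommand command)
        {
            // Stand-in for the reflection-based handler lookup.
            return c => { };
        }
    }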
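
And here is the convention-based database handler, sketched under the assumption that the .sql files are embedded in the handler's own assembly.

    using System;
    using System.Data.SqlClient;
    using System.IO;
    using System.Linq;
    using System.Reflection;

    public class DatabaseEventHandler
    {
        private readonly string connectionString;

        public DatabaseEventHandler(string connectionString)
        {
            this.connectionString = connectionString;
        }

        public void Handle(object @event)
        {
            var eventType = @event.GetType();
            var assembly = Assembly.GetExecutingAssembly();

            // Convention: look for an embedded resource named <EventName>.sql.
            var resourceName = assembly.GetManifestResourceNames()
                .FirstOrDefault(n => n.EndsWith(eventType.Name + ".sql"));
            if (resourceName == null)
                return; // No .sql file for this event, so it is ignored.

            string sql;
            using (var stream = assembly.GetManifestResourceStream(resourceName))
            using (var reader = new StreamReader(stream))
                sql = reader.ReadToEnd();

            using (var connection = new SqlConnection(connectionString))
            using (var command = new SqlCommand(sql, connection))
            {
                // Each event property becomes a query parameter of the same name.
                foreach (var property in eventType.GetProperties())
                    command.Parameters.AddWithValue(
                        "@" + property.Name,
                        property.GetValue(@event, null) ?? DBNull.Value);

                connection.Open();
                command.ExecuteNonQuery();
            }
        }
    }
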
So, as you may notice, my commands are NOT asynchronous, fire-and-forget, or even queued. They return to the UI either Success or Error with a message. My architectural choice was to be able to immediately get the command result back to the user, including domain errors. This design presently lacks concurrency handling, but the existing system also lacks it and doesn't suffer much for that. Once a command executes successfully, the event handlers do operate asynchronously. At first I was concerned that the database writes wouldn't happen fast enough to display to the user on the next page hit, but so far this hasn't been a problem.

So, here are a couple of properties I have noted about this design. My first observation is that there are a lot of steps to add a feature (not even counting tests):
    • Generate the command and event
    • Add what's necessary to the read model to support the UI
      • (maybe) create queries
      • (maybe) add to read model
    • Add what's necessary to the UI to generate the command
      • Views
      • Models
      • UI Logic
    • Add the aggregate method for the domain logic
      • Add any needed value or entity objects
      • Extend repositories as needed
    • Add the aggregate Apply method for the event
    • Add the command handler method
    • Add the event handler methods / resources for all the integration pieces
"A lot of steps" seems like a pure disadvantage, but the disconnectedness of each program area means that multiple people truly can work on the same feature independently. For a lone developer doing features in a vertical slice, this design seems like a lot of ceremony and context switching. But for a team, I think it chops up the work in a horizontal fashion rather cleanly. The separation points seem to be around commands, events, and the read model. 

Since I am the lone developer on this for the moment, I've found that this design breaks most tasks up to the point where they are somewhat boring, and I end up copying and pasting a lot (especially with regard to commands and events). That could just be the subject matter, though. The interesting parts for me have been in the development of the architecture. The domain holds some challenge at times, but mainly the challenge there is in defining what is needed for the workflow. Sagas can be pretty interesting.

Performance is pretty good in my estimation. With a debug build, I am able to get about 1000 ops/s. Were I to batch operations on the same aggregate instead of loading it from the database for each command, that would probably improve. But at the moment I don't see a need to do that.
