23 August 2012

CQRS - Lessons Learned

So I'm thinking about what I might do differently next time I write an enterprise web app of sufficient complexity. The CQRS principle is great, and brings a lot of clarity to the problems created by using domain objects for both reads and writes. However, many of the presented examples have extra architectural trappings that weren't needed in my case, especially a message bus.

Here's what I think my next one might look like.

JavaScript-based MVVM UI

No one reads my blog, but if you did, this would seem to contradict previous posts and the fact that I hate JavaScript. But hating JavaScript as a web developer proves about as useful as hating air as a human. I pretty much have to have it, so why not make my experience with it as positive as possible? I ultimately really enjoyed my JS-MVVM experiences, despite being out of my comfort zone (no Intellisense or type checking). The performance of the application was amazing.

My flavor of JS-MVVM for the last project was Kendo UI. It's still a relatively new framework, but it's fairly well-rounded for its age. Its biggest strength to me is that it's a single product for UI controls, MVVM, validation, etc., as opposed to integrating several products for the same purposes, like Knockout.js, jQuery UI, Chosen, etc. But what it has in breadth, it lacks in depth. The controls are great, but not as customizable as I would like in many cases. Documentation is lacking, but so far it has at least had enough hints to get me pointed in the right direction.

Messaging

I have really liked the clean separation that messages provide. In particular, I like commands and events. They break up user intent and domain activity into bite-sized chunks. This has helped guide me to focus on, analyze, and encapsulate the business value of a feature, more than just its technical requirements. These chunks are much easier to understand and react to than a traditional data-centric model, where a lot of interpretation code (based on which fields on the record have changed) has to run before you can get to the business value.
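
To make that concrete, here's roughly the shape of such a pair: a hypothetical command and its corresponding event (names made up for illustration):

    using System;

    // Command: captures the user's intent in one bite-sized chunk.
    public class DisableUser
    {
        public readonly Guid UserId;

        public DisableUser(Guid userId)
        {
            UserId = userId;
        }
    }

    // Event: records what the domain actually did in response.
    public class UserDisabled
    {
        public readonly Guid UserId;
        public readonly DateTime DisabledOn;

        public UserDisabled(Guid userId, DateTime disabledOn)
        {
            UserId = userId;
            DisabledOn = disabledOn;
        }
    }

An integrator reacting to UserDisabled doesn't have to diff record fields to figure out what happened; the event says it outright.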

Partial Synchronicity

In my last implementation, command executions were synchronous up until events were published; then the integrations (event handlers) ran asynchronously. If any domain errors were generated during command execution, the dispatcher would catch them and return the exception message to the client. (Bear in mind these are domain exceptions, like "An item cannot be its own parent, except in sci-fi.") If there were no errors, the dispatcher returned success.

I will likely do the same thing next time. I really like being able to give the user immediate feedback about their actions. A lot of CQRS literature espouses asynchronous commands, but I find it is much better to give the user immediate results of a command where possible. So I take this approach first, and I will adjust if I run into scenarios where immediate results aren't feasible.
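
For reference, that dispatch path can be sketched like this (names are my own invention, not from any framework; the handler lookup is elided):

    using System;

    public interface ICommand { }

    public class DomainException : Exception
    {
        public DomainException(string message) : base(message) { }
    }

    public class CommandResult
    {
        public bool Success;
        public string Error;
    }

    public class CommandDispatcher
    {
        // Stand-in for the real handler lookup (reflection, in my case).
        readonly Func<ICommand, Action<ICommand>> findHandler;

        public CommandDispatcher(Func<ICommand, Action<ICommand>> findHandler)
        {
            this.findHandler = findHandler;
        }

        public CommandResult Dispatch(ICommand command)
        {
            try
            {
                // Synchronous domain work; event handlers kicked off by the
                // domain run asynchronously after this returns.
                findHandler(command)(command);
                return new CommandResult { Success = true };
            }
            catch (DomainException ex)
            {
                // Domain errors go straight back to the client as the
                // command's error message.
                return new CommandResult { Success = false, Error = ex.Message };
            }
        }
    }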

MVC Actions as Command Handlers

In the last project, I used a separate Command Handler project, consisting of simple objects with Handles() methods for the commands they handled. The UI would post the command to an MVC action, which would then call a dispatcher to find the handler and give it the command to handle.

Next time, I will instead try to set up MVC controller actions as my command handlers and cut out the dispatcher. This would allow me to eliminate two bits of reflection code: a custom model binder and a command dispatcher (router). The dispatcher does do a few things for me: logging the command, executing the validate routine, and returning something to the client on error. But I can implement an abstract controller class to do those things.

I'm not entirely sure I will be able to make this work. I'm hoping the ActionSelectionAttribute will let me use overloaded methods as actions. Otherwise, each command will need a distinct URL that I will have to keep track of and give to the client; then all I've done is moved the command routing code up a level. (Not desirable.)

Update: Upon further reflection, I think this is a bad route to go as it mixes concerns between UI infrastructure and command handlers. I think I will keep doing what I was doing, but use the open command messages mentioned below so I don't have to instantiate them and set their properties with my custom reflection code. Instead, I can just call the default model binder to create the class after I figure out what it is from the type hint in the POST data.

If I did go this route, my command messages would have to be open objects (default constructors and public setters) for the default model binder to be able to construct them. Currently, I am using reflection to get around the private setters, so I'm essentially using them as though they were open objects anyway (with much reflection pain). But once they get into my architecture, they are immutable. I'm not sure how much benefit that immutability ends up being, though, since a command's entire life span is 1) being constructed in the model binder, and 2) having its data used by the command handler. Private setters keep the command handler code from changing anything, but ostensibly the command handler's only purpose is to use data from the command object to call a domain method. So, I'm still inclined to give open command messages a try next time.
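
For what it's worth, here's a rough sketch of what the updated approach could look like (the "_type" key and all the code here are illustrative, not my actual implementation):

    using System;
    using System.Web.Mvc;

    public class CommandController : Controller
    {
        [HttpPost]
        public ActionResult Execute()
        {
            // "_type" is an assumed field name for the type hint.
            // Real code should whitelist the allowed command types.
            var typeName = Request.Form["_type"];
            var commandType = typeName == null ? null : Type.GetType(typeName);
            if (commandType == null)
                return new HttpStatusCodeResult(400, "Unknown command type");

            // The default model binder creates the instance (hence the need
            // for a default constructor) and fills the public setters.
            var bindingContext = new ModelBindingContext
            {
                ModelMetadata = ModelMetadataProviders.Current
                    .GetMetadataForType(null, commandType),
                ValueProvider = ValueProvider
            };
            var command = new DefaultModelBinder()
                .BindModel(ControllerContext, bindingContext);

            // ...hand 'command' to the existing dispatcher as before...
            return Json(new { Success = command != null });
        }
    }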

Provide a JavaScript Command Library to Clients

Currently, I take the .NET commands, convert them to JavaScript, add a type-hint attribute, and provide them in a library to client apps. I don't like having to do this, but I really don't see a way around it, since JavaScript is a loosely-typed language. I need some way of identifying the command that the UI is trying to send. The attributes alone are not enough, since two commands could have the same attributes but trigger different business actions (e.g. DisableUser vs DeleteUser). I feel like this area could still use some further exploration, because that's not really code I want to maintain.
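
The generator itself doesn't have to be fancy; something along these lines would do (the namespace convention and the emitted shape are assumptions for illustration):

    using System;
    using System.Linq;
    using System.Reflection;
    using System.Text;

    public static class CommandLibraryGenerator
    {
        public static string Generate(Assembly commandAssembly)
        {
            var js = new StringBuilder("var Commands = {};\n");

            var commandTypes = commandAssembly.GetTypes()
                .Where(t => t.IsClass && t.Namespace != null
                         && t.Namespace.EndsWith(".Commands")); // assumed convention

            foreach (var type in commandTypes)
            {
                // Emits e.g.:
                // Commands.DisableUser = function () {
                //     return { _type: "...", UserId: null }; };
                var fields = new[] { "_type: \"" + type.AssemblyQualifiedName + "\"" }
                    .Concat(type.GetProperties().Select(p => p.Name + ": null"));

                js.AppendFormat(
                    "Commands.{0} = function () {{ return {{ {1} }}; }};\n",
                    type.Name, string.Join(", ", fields));
            }
            return js.ToString();
        }
    }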

Event Integration

How to route events (integration) is another area that needs more exploration. What I did previously works rather well, but I feel that it could be streamlined a bit.

Again, I did not use a message bus, since I didn't foresee a need for external integrators. I put all the integration handlers in the same assembly (but different namespaces / folders), so I could use reflection to find them all and be able to route events to them.

Next time, I think I will have each integrator manually registered with the event publisher. The number of integrators is typically low, so the code maintenance would be negligible, and much clearer than a reflection-based solution. Then in the event publisher, when an event comes in, I can use a simple "is" check (e.g. if (integrator is IHandles<SomeEvent>)) to see if the integrator handles an event before trying to invoke the handler.
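
In code, I imagine the publisher looking something like this (hypothetical sketch):

    using System.Collections.Generic;

    public interface IHandles<TEvent>
    {
        void Handle(TEvent e);
    }

    public class EventPublisher
    {
        readonly List<object> integrators = new List<object>();

        // Called once at startup, one line per integrator.
        public void Register(object integrator)
        {
            integrators.Add(integrator);
        }

        public void Publish<TEvent>(TEvent e)
        {
            foreach (var integrator in integrators)
            {
                // The "is" check: only invoke integrators that declare
                // they handle this event type.
                var handler = integrator as IHandles<TEvent>;
                if (handler != null)
                    handler.Handle(e);
            }
        }
    }

Registration then reads like a table of contents for the integrations: one Register() call per integrator at startup.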

If I do need to provide an event stream to external integrators one day, I will have to break down and use a message bus. (I'd be leaning towards MassTransit / MSMQ in that case.) I might keep my current way of doing things, and then add a separate internal integrator which just repeats the message out to a message bus for outsiders. That way I could also control which messages are allowed out.

Possibly Sans Event Store

My last application was a brownfield scenario with an existing RDBMS, so I couldn't start with an event store; I can't get there until I migrate all of the business functionality into the new architecture. Since I'm not storing events, I can't load from them. But I am using events to drive integration, and I did want the load-from-event-store capability in the future, so I set up the aggregates to take an array of events in the constructor as though there were an event store. Then, when the repository loads the aggregate, it reads the data from the database (which only has current state, not the events) and makes one massive CreatedForLoad event to pass to the aggregate, which initializes all of the aggregate's properties when the event is applied. (As opposed to replaying the many smaller, actual events that occurred to get to the current state.)
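
The aggregate side of that might look like this (names are illustrative, and 'dynamic' dispatch is just one way to route events to Apply overloads):

    using System.Collections.Generic;

    // A "snapshot" event built by the repository from the current-state row.
    public class ItemCreatedForLoad
    {
        public int Id;
        public string Name;
        public int? ParentId;
        // ...one member per column the current-state table holds...
    }

    public class Item
    {
        public int Id { get; private set; }
        public string Name { get; private set; }
        public int? ParentId { get; private set; }

        // Same constructor path whether the history is one big
        // CreatedForLoad event or, someday, a real event stream.
        public Item(IEnumerable<object> history)
        {
            foreach (var e in history)
                Apply((dynamic)e);
        }

        void Apply(ItemCreatedForLoad e)
        {
            Id = e.Id;
            Name = e.Name;
            ParentId = e.ParentId;
        }
    }

The repository's job is then just to build one ItemCreatedForLoad from the database row and call new Item(new object[] { e }).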

I have mixed feelings on an event store going forward. I already want to use messaging and events, and the idea of an event store is absolutely grand. But what's currently out there to provide and support an event store is sorely lacking.

The main event store offering out there is Jonathan Oliver's EventStore. It is a great piece of software (and happens to be free / open source), but it lacks all but the most basic documentation (last I checked). I really didn't find it at all intuitive to set up. I did manage to get it working, but I felt that my knowledge of operating it was far too inadequate for the level of trust I wanted in a piece so integral to the application.

Supposedly there will be an event store offering from Greg Young next month, but I can't really say it will help the situation without seeing it. I guess we'll see.

The other problem with event sourcing is visualization of data. How do I look at information from the event store to triage/verify production issues? How do I achieve the advertised "going back in time" to previous states? There's no Management Studio for event stores (to my knowledge). At this point, you have to develop your own toolkit for this purpose. Therefore, these features have a non-trivial cost.

So the jury's still out on this one. I want to try it with a new app, but I have reservations.

UPDATE: Most of the stricken text above was based on impressions formed when I implemented CQRS+Messaging+ES for the first time in spike code, when it was too new to me to really grok. After implementing such an architecture, I can see that it fits in right where it should, and the config makes more sense. I will definitely use an event store next time.

Read Layer

I really don't have anything innovative on this front. I have an MVC action that executes a query to get a DataSet, converts the DataSet into a Dictionary, serializes that with the built-in JavaScriptSerializer, and returns it as JSON to clients. (That's the fewest hops from DataSet to JSON that I found.)
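
In sketch form, assuming a hypothetical Users read model (data access elided):

    using System;
    using System.Collections.Generic;
    using System.Data;
    using System.Linq;
    using System.Web.Mvc;
    using System.Web.Script.Serialization;

    public class ReadController : Controller
    {
        public ContentResult Users()
        {
            DataSet ds = RunQuery("SELECT Id, Name, Email FROM Users");
            DataTable table = ds.Tables[0];

            // One dictionary per row, keyed by column name.
            var rows = table.Rows.Cast<DataRow>()
                .Select(r => table.Columns.Cast<DataColumn>()
                    .ToDictionary(c => c.ColumnName,
                                  c => r.IsNull(c) ? null : r[c]))
                .ToList();

            var json = new JavaScriptSerializer().Serialize(rows);
            return Content(json, "application/json");
        }

        DataSet RunQuery(string sql)
        {
            // Data access elided.
            throw new NotImplementedException();
        }
    }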

I don't anticipate this changing in a new application. Using some naming conventions, it's very quick to add new reads. The most time consuming part is developing a query. In a new application, I will probably have tables for each read model, so even that should be quick too.

That's all I can think of for now.

21 August 2012

CQRS Experiences So Far

So, I'm always looking for new tech, methodologies, anything that can help me do things better. The latest approach I've gone for is CQRS: Command-Query Responsibility Segregation.

The first hurdle is figuring out exactly what CQRS is. As defined, it's pretty simple: commands are on one object, queries are on another. However, much of the information out there is intermixed with a lot of other architectural concepts and strategies that fit well with CQRS, like messaging and event sourcing. It gets pretty tough to see the big picture with so many concept details in the mix.

Eventually, I settled (perhaps wrongly) on a CQRS/ES/Messaging architecture for an enterprise web app. I say wrongly because the app is already very data-centric, and the users are accustomed to it being that way. However, the data-centric nature of the app had some major drawbacks, such as inscrutable business logic code, business logic being pushed to the UI (due to complex inter-relations and validation between fields on the record), and so forth.

The next hurdle I ran into was how to communicate commands and events. For this web app, a typical message bus (MSMQ or RabbitMQ) wasn't needed, at least not for now. Not only that, but this app is already a configuration nightmare that I didn't need to complicate further. I experimented with receiving commands through WCF (which I abandoned due to the difficulty of calling it from JavaScript), then .NET 4.5 Web API (which I abandoned because it's in beta and I ran into technical problems with it).

For commands, I settled on using vanilla MVC 3 controllers with a custom model binder to convert POST content to .NET command objects. (Custom because the MVC action argument is an interface, and the commands have neither a default constructor nor public setters, so the only way to instantiate them is through reflection.) On the other side of the coin, I wrote a converter to turn all the .NET commands into JavaScript objects, so they were available to the client in a JavaScript library that also had conversion hints (like the type's .NET full name). The library also includes some basic shortcut methods to execute commands, so the client doesn't manually call $.ajax(...); instead, they call Command.Execute(myCommand) or Command.Validate(myCommand). I tried to use T4 to generate the library from the .NET command objects, but found it inadequate (due to locking the dll, among other things). Instead, I have a separate controller which generates and returns the JavaScript library when called. All this sounds complicated, but it actually wasn't that time-consuming to develop. The time-consuming part was researching and evaluating WCF, Web API, and T4 before abandoning them.
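
For a flavor of the binder, here's a condensed, hypothetical take on it; the "_type" key, the value conversion, and everything else are simplified stand-ins for the real thing:

    using System;
    using System.Reflection;
    using System.Runtime.Serialization;
    using System.Web.Mvc;

    public class CommandModelBinder : IModelBinder
    {
        public object BindModel(ControllerContext controllerContext,
                                ModelBindingContext bindingContext)
        {
            var form = controllerContext.HttpContext.Request.Form;

            // The conversion hint: the command's .NET full name.
            var commandType = Type.GetType(form["_type"]);

            // No default constructor on commands, so skip constructors.
            var command = FormatterServices.GetUninitializedObject(commandType);

            foreach (PropertyInfo property in commandType.GetProperties())
            {
                var raw = form[property.Name];
                if (raw == null) continue;

                // Real code needs richer conversion (Guids, nested types...).
                var value = Convert.ChangeType(raw, property.PropertyType);

                // Write through the private setter.
                var setter = property.GetSetMethod(nonPublic: true);
                if (setter != null)
                    setter.Invoke(command, new[] { value });
            }
            return command;
        }
    }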

For events, I also didn't need a message bus for now. Instead, I settled on an in-memory bus of sorts. Instead of the event handler applications subscribing to the event bus, I have an event publisher object that searches for event handlers using reflection (much the same way that my command dispatcher searches for command handlers), and adds their actions to the thread pool when an event is received.
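
A minimal sketch of that publisher (hypothetical code; IHandles<TEvent> is the same one-method interface sketched in the newer post above):

    using System;
    using System.Linq;
    using System.Reflection;
    using System.Threading;

    public interface IHandles<TEvent>
    {
        void Handle(TEvent e);
    }

    public class InMemoryEventPublisher
    {
        readonly object[] handlers;

        public InMemoryEventPublisher(Assembly handlerAssembly)
        {
            // One-time reflection scan for anything implementing IHandles<>.
            handlers = handlerAssembly.GetTypes()
                .Where(t => t.IsClass && !t.IsAbstract && t.GetInterfaces()
                    .Any(i => i.IsGenericType &&
                              i.GetGenericTypeDefinition() == typeof(IHandles<>)))
                .Select(t => Activator.CreateInstance(t))
                .ToArray();
        }

        public void Publish<TEvent>(TEvent e)
        {
            foreach (var handler in handlers.OfType<IHandles<TEvent>>())
            {
                var h = handler;  // copy for the closure (C# 4 foreach capture)
                ThreadPool.QueueUserWorkItem(_ => h.Handle(e));
            }
        }
    }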

I can't do full event sourcing just yet with this application, since it has an existing database that is currently in use, and I'm not able to take 3 years to do a complete re-work. Instead, I have to integrate the new stuff while retaining the existing functionality of what hasn't yet been replaced. What I did do was, in my domain repository, convert the database information (in DataRow format) into a big Created event so that my domain objects don't have any special "load" code paths. The constructor just takes events as arguments and replays them. Hopefully that will make moving to full event sourcing seamless on the domain side.

The read model is also implemented as an MVC controller, with GET actions for each read model. I tried to do a WCF Data Service (OData), but our database was just too non-standard to generate an Entity Data Model without a lot of manual XML manipulation. And if I'm going to have to manually generate the data classes anyway, I didn't see the point of being constrained to whatever WCF Data Services require / provide.

The UI has its own ecosystem, since it uses HTML5 / JavaScript and MVVM. But basically, its job is to take the user's choices and make a command out of them. The UI uses various read models to guide the user's choices (e.g. to populate a drop-down list).

So here is an end-to-end look at what happens in this system.
  • The UI calls various read models and ultimately generates a command and sends it to the command API (via a jQuery $.ajax POST with the JavaScript command object as the 'data' parameter).
  • The command API constructs the .NET command object from the POST data (using a custom model binder), then sends the command to the dispatcher.
  • The dispatcher looks up the handler for the given command and then calls it with the command.
    • Note, the command also has a validate method which is called before the command handler is exercised. If the validate method returns an error, execution halts and the validation error is passed back to the client. Otherwise command execution proceeds.
    • Note, domain exceptions are captured here and returned to the client as command errors with the domain exception message.
  • The command handler loads the necessary domain aggregate and calls the appropriate method(s) on it.
  • The aggregate performs business logic and (perhaps) generates events.
  • The command handler polls the aggregate for events and then passes those events to the event dispatcher. (No event store to pass them to yet.)
  • The event dispatcher looks up the handlers for those events and puts them on the thread pool to run.
  • At this point, command execution returns to the client as successful since there were no domain exceptions along the way.
  • Example event handlers execution process:
    • Database
      • The database event handler runs. It handles ALL events. It looks for a SQL embedded resource file that matches the event name. If no resource file is found, the event is ignored.
      • If a matching resource is found, its contents are loaded as the query text.
      • The event's properties are extracted as name/value pairs and used as parameters to the query.
      • The query is executed.
      • Note: In this way, the only thing necessary to add a database integration for a new event is to add a new .sql embedded resource file to the solution with the event's name. The .sql file contains the query to run when the event is handled. (A sketch of this handler follows the list.)
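
Here's roughly what that catch-all database handler could look like (the resource naming convention and data access details are assumptions for the sketch):

    using System;
    using System.Data.SqlClient;
    using System.IO;
    using System.Reflection;

    public class DatabaseEventHandler
    {
        readonly string connectionString;

        public DatabaseEventHandler(string connectionString)
        {
            this.connectionString = connectionString;
        }

        public void Handle(object e)
        {
            var eventType = e.GetType();

            // Assumed convention: MyApp.Sql.<EventName>.sql
            var resourceName = "MyApp.Sql." + eventType.Name + ".sql";
            var stream = typeof(DatabaseEventHandler).Assembly
                .GetManifestResourceStream(resourceName);
            if (stream == null) return;  // no .sql file: event is ignored

            string sql;
            using (var reader = new StreamReader(stream))
                sql = reader.ReadToEnd();

            using (var connection = new SqlConnection(connectionString))
            using (var command = new SqlCommand(sql, connection))
            {
                // Each event property becomes a same-named query parameter.
                foreach (PropertyInfo property in eventType.GetProperties())
                    command.Parameters.AddWithValue("@" + property.Name,
                        property.GetValue(e, null) ?? DBNull.Value);

                connection.Open();
                command.ExecuteNonQuery();
            }
        }
    }
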
So, as you may notice, my commands are NOT asynchronous, fire-and-forget, or even queued. They return to the UI either Success or Error with Message. My architectural choice was to immediately get the command result, with any domain errors, back to the user. This design presently lacks concurrency handling, but the existing system also lacks it and doesn't suffer much for it. Once a command executes successfully, the event handlers do operate asynchronously. At first I was concerned that the database writes wouldn't happen fast enough to display to the user on the next page hit, but so far this hasn't been a problem.

So here are a couple of properties I have noted about this design. My first observation is that there are a lot of steps to add a feature (not even counting tests):
    • Generate the command and event
    • Add what's necessary to the read model to support the UI
      • (maybe) create queries
      • (maybe) add to read model
    • Add what's necessary to the UI to generate the command
      • Views
      • Models
      • UI Logic
    • Add the aggregate method for the domain logic
      • Add any needed value or entity objects
      • Extend repositories as needed
    • Add the aggregate Apply method for the event
    • Add the command handler method
    • Add the event handler methods / resources for all the integration pieces
"A lot of steps" seems like a pure disadvantage, but the disconnectedness of each program area means that multiple people truly can work on the same feature independently. For a lone developer doing features in a vertical slice, this design seems like a lot of ceremony and context switching. But for a team, I think it chops up the work in a horizontal fashion rather cleanly. The separation points seem to be around commands, events, and the read model. 

Since I am the lone developer on this for the moment, I've found that this design breaks most tasks up to the point where they are somewhat boring, and I end up copying and pasting a lot (especially with regard to commands and events). That could just be the subject matter, though. The interesting parts for me have been in the development of the architecture. The domain holds some challenge at times, but mainly the challenge there is in defining what is needed for the workflow. Sagas can be pretty interesting.

Performance is pretty good in my estimation. With a debug build, I am able to get about 1000 ops/s. Were I to batch operations on the same aggregate instead of loading it from the database for each command, that would probably improve. But at the moment I don't see a need to do that.