06 February 2014

No Goldilocks Serializers

So in any architecture that uses messaging for communication, one of the key pieces is serialization. Along those lines, I tried several different serializers and benchmarked the ones that would actually work in my scenario. I borrowed a lot of the serializer test code from various examples on Stack Overflow.

Here are ones that I tried:

protobuf-net
Avro
MsgPack
fastJSON
XSockets (ServiceStack.Text)
JSON.NET

I didn't set out to test ServiceStack itself, but the XSockets serializer I was using turned out to just be ServiceStack. I had avoided testing ServiceStack because of its AGPL license, which is incompatible with my desired freedoms. The XSockets version of ServiceStack is licensed differently as part of the XSockets project (their license page makes no mention of AGPL).

Here are my requirements for the messages I want to serialize:
  1. No attributes (not even Serializable in my tests)
  2. All public members are read-only

These requirements killed Avro, MsgPack, and fastJSON right out of the gate. The .NET libraries for Avro and MsgPack wouldn't even instantiate a serializer. fastJSON crashed when trying to serialize. I didn't investigate why.

To get it to work with the XSockets JsonSerializer (which comes with XSockets Core on NuGet and, again, is just ServiceStack.Text), I had to change my readonly fields to properties with private setters. Apparently it ignores fields, even public ones. Later, I figured out how to make it pick up fields, but it didn't really affect the test.
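For reference, ServiceStack.Text has a configuration switch for picking up public fields; I'm assuming the XSockets build exposes the same JsConfig setting:

```csharp
using ServiceStack.Text;

// Assumption: the XSockets build of ServiceStack.Text honors the same
// JsConfig switch as stock ServiceStack.Text. This makes the serializer
// include public fields instead of ignoring them.
JsConfig.IncludePublicFields = true;
```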

For both ServiceStack and JSON.Net, I serialized to/from an interface to force it to embed the type. This is important for my use case.
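As a sketch of what that looks like with JSON.Net (using the message classes listed at the end of this post), TypeNameHandling is the feature that embeds the concrete type name in the payload:

```csharp
using System;
using Newtonsoft.Json;

// Sketch: serializing through the interface with TypeNameHandling makes
// JSON.Net embed the concrete type name ("$type") in the payload, so the
// receiver can deserialize without knowing the concrete type up front.
var settings = new JsonSerializerSettings { TypeNameHandling = TypeNameHandling.All };

IMessage original = new TestMessage(Guid.NewGuid(), Guid.NewGuid(), Guid.NewGuid(),
    Guid.NewGuid(), new LineItem[0], DateTime.UtcNow);

string json = JsonConvert.SerializeObject(original, settings);
var roundTripped = JsonConvert.DeserializeObject<IMessage>(json, settings);
```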

Protobuf-net doesn't support type embedding per se, so I had to send two messages. The first (header) message contained the type of the subsequent message. For fairness, this test includes the time it took protobuf-net to serialize and deserialize both messages. I do know about the prefix feature in which I can send an integer to represent the message type, but try consistently hashing an arbitrary type to the same integer value and tell me how that works out for you...

Protobuf-net has the added wrinkle that I have to dynamically add the classes to its serializer at runtime (remember, no attributes). This is a huge negative. It's not too bad for simple messages, but if you have any polymorphic properties, this code can get complicated. E.g. find all inheritors of this type; find all implementors of this interface; oh, and if those objects also have reference-type properties, recurse. It would be so much easier if I could just embed the type, Marc... not that you will ever see this. I write this for myself, you know.
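For illustration, the runtime registration with protobuf-net's RuntimeTypeModel looks roughly like this (using the classes at the end of the post; the field numbers are arbitrary but must stay stable across versions):

```csharp
using System;
using ProtoBuf.Meta;

// Illustrative only: registering message types with protobuf-net at
// runtime instead of using [ProtoContract]/[ProtoMember] attributes.
var model = RuntimeTypeModel.Create();

var msg = model.Add(typeof(TestMessage), applyDefaultBehaviour: false);
msg.Add(1, "SenderId");
msg.Add(2, "ReferenceId");
msg.Add(3, "OrderId");
msg.Add(4, "CustomerId");
msg.Add(5, "LineItems");
msg.Add(6, "OrderedOn");

var item = model.Add(typeof(LineItem), applyDefaultBehaviour: false);
item.Add(1, "ProductId");
item.Add(2, "Quantity");

// Polymorphic properties have to be declared explicitly too, e.g.:
// model[typeof(BaseType)].AddSubType(100, typeof(DerivedType));
```

This is exactly the part that gets painful with deep object graphs: every member of every reachable type, and every subtype relationship, must be registered by hand.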

Ultimately, my conclusion for messaging is that there's no absolute win for any one serializer. Nothing was "just right". Protobuf-net was the fastest, but the hardest to deal with. ServiceStack.Text is the fastest JSON serializer for my use case, but the AGPL license destroys too many of my freedoms. JSON.Net performed the poorest (but still light years ahead of the default .NET serializers); deserialization was its weakness. For now I will try out the XSockets version of ServiceStack.Text since I'm planning on using it with XSockets anyway, and I can actually stomach the XSockets license it comes under (BSD 2-clause).

Below are the results for 100,000 iterations.

protobuf-net (header + message)
Length: 285
Serialize: 213
Deserialize: 897

XSockets Serializer
Length: 508
Serialize: 933
Deserialize: 1606

JSON.Net BSON
Length: 404
Serialize: 1127
Deserialize: 3417

JSON.Net JSON String
Length: 540
Serialize: 928
Deserialize: 3644

Here are the classes used:

    public class Header
    {
        public Type T { get; private set; }
        public Header(Type t)
        {
            T = t;
        }
    }

    public class TestMessage : IMessage
    {
        public Guid SenderId { get; private set; }
        public Guid ReferenceId { get; private set; }
        public Guid OrderId { get; private set; }
        public Guid CustomerId { get; private set; }
        public LineItem[] LineItems { get; private set; }
        public DateTime OrderedOn { get; private set; }
        public TestMessage(Guid senderId, Guid referenceId, Guid orderId, Guid customerId, LineItem[] lineItems, DateTime orderedOn)
        {
            SenderId = senderId;
            ReferenceId = referenceId;
            OrderId = orderId;
            CustomerId = customerId;
            LineItems = lineItems;
            OrderedOn = orderedOn;
        }
    }

    public class LineItem
    {
        public Guid ProductId { get; private set; }
        public int Quantity { get; private set; }
        public LineItem(Guid productId, int quantity)
        {
            ProductId = productId;
            Quantity = quantity;
        }
    }

    public interface IMessage { }

31 January 2014

Asynchronously Synchronous Commands

I have typically been against asynchronous commands for the simple fact that they leave the user hanging. However, when considering responsiveness under load, there is something to be said for making commands asynchronous.

My current line of thinking is to send the command asynchronously with the expectation that the result of that command (success or failure) will eventually be sent back to the client. That way, at least the client knows if there is a problem. This is what I'd call asynchronously synchronous.

This brings up a couple of interesting issues. One is what to do with the UI while that is going on. Do I show a spinner and make the user wait? The response will happen pretty quickly, so I'm leaning toward this initially. Maybe instead, I should just keep track of running commands and notify the user if one fails. How do they recover from the failure in this case? There are interesting opportunities for design here. Some of the right answer depends on the user's workflow.

So now the other issue comes up: getting a positive command result doesn't mean that the read models have been updated, since that also happens asynchronously. This introduces the idea of the client being able to subscribe to read model changes. Ultimately, the user only wants one of two things to happen when they submit a command: 1) ideally, the command succeeds and their view is updated (to verify), or 2) the command fails and they are given enough information to resolve the problem. Therefore the only 2 things the client program will be interested in listening for are command failures and read model updates.

That obviously doesn't cover some edge cases like network failure. After all, if the network fails while I am blocking on it, then I get notified about it; but if I just never receive a message that I was expecting, then I need to account for that.
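A hypothetical sketch of the client-side bookkeeping this implies (all of these names are my own placeholders): a command is tracked by a correlation id until either a failure notification or a read model update arrives for it.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical sketch only: track in-flight commands by correlation id
// until the server pushes back a failure or a read model update.
public class PendingCommandTracker
{
    private readonly ConcurrentDictionary<Guid, string> _pending =
        new ConcurrentDictionary<Guid, string>();

    public int PendingCount { get { return _pending.Count; } }

    // Call when sending a command; ship the returned id with the command.
    public Guid Track(string description)
    {
        var correlationId = Guid.NewGuid();
        _pending[correlationId] = description;
        return correlationId;
    }

    // Server pushed back a command failure: surface it to the user.
    public void OnCommandFailed(Guid correlationId, string reason)
    {
        string description;
        if (_pending.TryRemove(correlationId, out description))
            NotifyUser(description + " failed: " + reason);
    }

    // A read model update confirmed the command's effect: stop tracking.
    public void OnReadModelUpdated(Guid correlationId)
    {
        string ignored;
        _pending.TryRemove(correlationId, out ignored);
    }

    private void NotifyUser(string message) { /* UI-specific */ }
}
```

Anything still pending after some timeout would fall into the "never received a message I was expecting" bucket above.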

29 January 2014

CQRS and Actors

In looking through the DDD/CQRS google group, I came across a discussion about using the Actor model as the computational model for DDD/CQRS.

This immediately struck me as interesting, and even more so after listening to an episode of the Being the Worst podcast. Part of the overhead of CQRS/Messaging/ES, and a regular pain for me, is the whole handler piece. Most handler methods seem pretty boilerplate: load something (e.g. an aggregate or process manager), collect resources for it, do something with it, save it. All of this is pretty dull, and it must be written for every message. I think the actor model could have some good ideas to offer here.

The handler has been the catch-all for any infrastructure concerns related to delivering a message, but has typically been observed doing the above-mentioned 4 steps. Load and Save are exactly the same for every event sourced object, so these could be pushed higher in the infrastructure. Ignoring resources for now, the execute step could be characterized as executing a method on the aggregate with the message contents as parameters. The aggregate can be changed to just receive the message directly. This could be implemented similarly to the typical actor semantics of tell(message). There now just needs to be a way to provide resources to the aggregate at the infrastructure level, so the domain stays isolated per DDD.

So perhaps we should formalize the arrangement of there being a resource factory for a particular aggregate type. This factory could be directly used as needed by the aggregate via an interface and dependency injected by the infrastructure (not that you need a DI or IoC container for this). The resource factory is not exactly a repository, since there are no writes to it, only reads, and so there are still no persistence entanglements in the domain.
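To make that concrete, here is a hypothetical shape for the arrangement. The Tell/When names and the IResourceFactory interface are my own placeholders, not CommonDomain's:

```csharp
using System;

// Hypothetical shape only: the aggregate receives messages directly
// (actor-style Tell), and any resources it needs come through an
// injected read-only factory, keeping persistence out of the domain.
public interface IResourceFactory
{
    T Get<T>() where T : class;   // reads only -- not a repository
}

public abstract class Aggregate
{
    protected IResourceFactory Resources { get; private set; }

    // Actor-style delivery: infrastructure calls Tell after loading the
    // aggregate from its event stream, and saves the events afterward.
    public void Tell(object message, IResourceFactory resources)
    {
        Resources = resources;
        // Dynamic double-dispatch stands in for the cached delegates the
        // infrastructure would really use to find the matching method.
        ((dynamic)this).When((dynamic)message);
    }
}
```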

So now we have increased the burden of the infrastructure code. Namely, matching resources with actors. But we have eliminated most of the need for handlers. (You could also say that we increased the burden with load and save code, but this code was duplicated so much in handlers, that I see it as a net savings.) There may still be a case I can't think of for having handlers for one-off purposes, but now they don't have to be created perfunctorily in every case. We have also formalized another infrastructural role for servicing the domain, the Resource Factory. This role already existed informally in handlers.

I will be experimenting with this approach.

06 January 2014

Mass Effect: How Not to End a Series

To say the Mass Effect series was good is a drastic understatement in my opinion. The series as a whole is in fact one of the best I've ever played. However, when the folks at BioWare decided they were tired of making Mass Effect games, they took it out on us fans.

To set the stage for my utter disappointment, I must tell you that I played all the Mass Effect games, and played ME3 right after release. That was before the extended endings were released, although they had little to no effect in "fixing" anything.

First, I go through the hybrid ending (green), figuring it would be an optimal ending. After seeing it, it twinged as slightly evil because you were forcing a fundamental change on the entire universe that no one asked for. Then I do the "good" (blue) ending thinking it would be... well... good. However, the "good" ending realized the goal of a power-hungry villain and leads your character to not only sacrifice himself, but also his moral center. Also, the cut scenes were the EXACT SAME with minor animation differences and a different color explosion. So I thought, "oh the colors must just be reversed or something".

So then I did the red ending, thinking it would be the good ending. It also had the EXACT SAME cut scenes with minor animation differences and a different color explosion. The red ending was also bad and a temporary solution, but it seemed the only ending to fit with themes from the previous titles in the series. If you have enough war assets, it's also the only ending with even a hint of the main character surviving. There's also a fail condition that you could call an ending, which simply kills everyone and resets technology back to zero.

And none of the endings made any sense in context with the rest of the series. They were explained away in the extended endings, but still didn't really seem to fit the way the previous games ended. The fact remained that in ME3, BioWare had decided to take a different direction with the series (i.e. push it off a cliff).

So basically, the series ends on the main character only having selfish choices which also force him to die for completely arbitrary reasons (because that's the way the star kid designed it, and the main character suddenly lacks the capability to come up with independent solutions). And since BioWare seemingly intended to converge all the endings around you being a jerk and dying, they only bothered to make 1 ending cut scene, with minor animation tweaks and color changes between them. The extended ending didn't change that, it just added more filler.

All this leads me to believe the Mass Effect team was tired of making Mass Effect games and sabotaged the ending. This way, they theoretically have fewer fans begging them to make a sequel (the main character died and ended up being a tool anyway). However, this hit me as a betrayal by the characters and universe I had come to love. The "mass effect" for me has been to no longer be a fan of BioWare. Their games have some of the greatest stories, which makes the series-ending betrayal all the more bitter.

13 November 2013

CQRS Revisited

So, I have a project coming up that could really benefit from Messaging, CQRS, and Event Sourcing. In my first attempt at some of these things, I was going into a brownfield scenario that caused me to have to make a lot of unfavorable trade-offs. What follows is an attempt to work out the pieces for this new project in a greenfield scenario.

Things I'm settled on:

Command service

An MVC action receives the posted JSON and deserializes it into the appropriate .NET command. I carefully chose MVC action / JSON for its wide applicability. Almost any platform can send an HTTP POST.

I am considering hosting this as a Web API project in a Windows service to mitigate IIS configuration / maintenance. All the domain aggregate logic will live here. In the future, this could be partitionable by installing this on other servers and using a hashing function on the client, or by some other partitioning scheme.

So, once the command arrives to the command service, it needs to be passed to the appropriate handler. I have a convention-based message deliverer that I have written for this purpose and to make handler maintenance less of a chore. I call it LocalBus. Essentially, a handler only has to implement an interface (IMessageHandler). When instantiated, LocalBus looks for any class that implements this interface. Then it looks for any void Handles(? message) methods on the class, creates cached delegates, and maps them to message types. Then all you have to do is call LocalBus.Send(message) for single-handler messages or LocalBus.Publish(message) for multi-handler messages. It uses a thread/queue per handler to prevent concurrency problems and preserve ordering.
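A stripped-down sketch of that convention (without the per-handler threads/queues or delegate caching the real LocalBus has; the shape of the scan and routing is the point):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified sketch of convention-based dispatch: handlers implement a
// marker interface and expose void Handles(TMessage) methods, which are
// discovered by reflection and routed to by message type.
public interface IMessageHandler { }

public class LocalBus
{
    private readonly Dictionary<Type, List<Action<object>>> _routes =
        new Dictionary<Type, List<Action<object>>>();

    public LocalBus(IEnumerable<IMessageHandler> handlers)
    {
        foreach (var handler in handlers)
        foreach (var method in handler.GetType().GetMethods()
            .Where(m => m.Name == "Handles" &&
                        m.ReturnType == typeof(void) &&
                        m.GetParameters().Length == 1))
        {
            var messageType = method.GetParameters()[0].ParameterType;
            var target = handler;   // capture for the closure
            var handles = method;
            if (!_routes.ContainsKey(messageType))
                _routes[messageType] = new List<Action<object>>();
            _routes[messageType].Add(msg => handles.Invoke(target, new[] { msg }));
        }
    }

    public void Send(object message)    // exactly one handler
    {
        List<Action<object>> actions;
        if (!_routes.TryGetValue(message.GetType(), out actions) || actions.Count != 1)
            throw new InvalidOperationException(
                "Expected exactly 1 handler for " + message.GetType().Name);
        actions[0](message);
    }

    public void Publish(object message) // zero or more handlers
    {
        List<Action<object>> actions;
        if (_routes.TryGetValue(message.GetType(), out actions))
            foreach (var action in actions) action(message);
    }
}
```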

For the command side, I will only be using LocalBus.Send obviously. This will throw if there is not exactly 1 handler for a message.

The command handler will load the aggregate and call the appropriate method with the appropriate parameters for the command. At this point the domain can throw an error, which will get caught by the command service and be returned to the client. No events are saved in this case. If there is no error while running the aggregate method, then the handler will save the events, and the command service returns nothing.

As for the domain infrastructure, I am starting with CommonDomain. I'm using the interfaces almost as-is, but there are several things I am doing differently with base classes. One optimization that I started using with my last project was to have the aggregate's state as a separate class and have all the Apply methods on that state object. So I went ahead and built that into the AggregateBase. This also makes snapshot generation automatic (just return the state object). I've made even more changes to SagaBase, but more on that later.
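A rough sketch of the state-object idea (names are illustrative; CommonDomain's actual interfaces differ): Apply methods live on the state class, and the snapshot is simply that state object.

```csharp
using System;
using System.Collections.Generic;

// Sketch: all Apply methods live on the state object, so a snapshot is
// just the state itself -- no separate snapshot mapping code needed.
public abstract class AggregateState
{
    public void Mutate(object @event)
    {
        // Dynamic double-dispatch routes each event to its Apply method.
        ((dynamic)this).Apply((dynamic)@event);
    }
}

public abstract class AggregateBase<TState> where TState : AggregateState, new()
{
    private readonly List<object> _uncommitted = new List<object>();
    protected TState State { get; private set; }

    protected AggregateBase() { State = new TState(); }

    protected void Raise(object @event)
    {
        State.Mutate(@event);        // apply to state immediately
        _uncommitted.Add(@event);    // remember for persistence
    }

    public object GetSnapshot() { return State; }   // snapshots for free
    public IEnumerable<object> GetUncommittedEvents() { return _uncommitted; }
}
```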

Event Sourcing

This project can derive a lot of benefit from event sourcing. I am looking at GetEventStore.com for the event store. It has a built-in subscription queue for event listeners (denormalizers, process managers, and external integrators). Its performance also looks to be quite good. The client is interested in a fail-over configuration, which it supports.

I plan on creating a template Windows Service for listeners that I can just reuse for different purposes. It will be responsible for remembering its queue position by writing it to a file. So, it can be shut down and restarted to pick up where it left off. This also allows the queue position to be reset to the beginning (handy for denormalizers) by changing the file.
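The checkpoint piece of that listener template could be as simple as this sketch (file format and names are assumptions of mine):

```csharp
using System;
using System.IO;

// Minimal checkpoint sketch: persist the last processed stream position
// to a file so the listener can resume where it left off, or be reset
// to the beginning by editing or deleting the file.
public class CheckpointFile
{
    private readonly string _path;

    public CheckpointFile(string path) { _path = path; }

    public int Read()
    {
        // Missing file means "start from the beginning of the stream".
        return File.Exists(_path) ? int.Parse(File.ReadAllText(_path)) : 0;
    }

    public void Write(int position)
    {
        File.WriteAllText(_path, position.ToString());
    }
}
```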

Denormalizers

I'm thinking of having 2 denormalizers initially, which will each be in separate Windows services. One is for operational data, that is, data which is used by views to produce commands. The other is for reporting data. That way, I can make different deployment choices for these functions. For example, I can give the operational denormalizer higher priority, since the timeliness of its data updates is more important. It gives me choices, anyway.

I will also use the LocalBus here to take the messages received from the event store and locally publish them to all interested handlers inside the denormalizer's app domain. They will then perform whatever steps are needed to update their read models. I will probably have to set up the writes to notify when the data is actually written to disk in order to update the stream position.

Process Managers

In the case where there is logic that needs to be executed in response to events that happen, I will have another listener service (which subscribes to the event store). In that service, LocalBus will deliver to PM handlers, which will load and run its business process. The common example is a shipping workflow. The PM looks for OrderReceived and PaymentApplied before it sends the ShipOrder command for a given OrderId. For the most part, I'm modeling these process managers as aggregates in case there needs to be more logic than a simple state machine. I have some things I'm still working out here, which I will go over later.

External Integrators

For this project, I anticipate there being some external integrators who want to listen to the event stream and construct their own data models. There will have to be a new stream projection created and appropriate security setup for them. Then I can just give them a URL and the listener template that I have already been using for denormalizers and process managers.

UI

So far we're looking at WPF for internal and MVC or WebForms for external (depending on developer familiarity). I would probably do HTML5 if possible. The inputs and outputs of the UI are pretty simple. It takes in read model data and user interaction and converts that into commands to send to the command service. (Not that this translation is easy.)

Read Models / Read Layer

I'm looking at storing the operational models in CouchBase. I really like CouchBase for the way it caches things in memory, making reads and writes fast. Internal programs are likely to directly read from CouchBase for speed. However, I will eventually be setting up a Read Layer for other types of clients. This read layer will also likely be an MVC action. In order to generalize (not have to maintain) this read layer, I am considering having the action take the database and view as part of the URL, so getting the correct data is simply a matter of constructing the right URL. Security would still need to be maintained on the read layer.
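An entirely hypothetical shape for that generalized read-layer action (ICouchReader is a placeholder abstraction of mine over the CouchBase client, not a real type; assumes ASP.NET MVC routing like GET /read/{database}/{view}):

```csharp
using System.Web.Mvc;

// Hypothetical: one action whose database and view come from the URL,
// so adding a new read model requires no new read-layer code.
public class ReadController : Controller
{
    private readonly ICouchReader _reader;

    public ReadController(ICouchReader reader) { _reader = reader; }

    public ActionResult Get(string database, string view)
    {
        // Per-database/view authorization checks would go here.
        object result = _reader.GetView(database, view);
        return Json(result, JsonRequestBehavior.AllowGet);
    }
}

public interface ICouchReader   // placeholder abstraction
{
    object GetView(string database, string view);
}
```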

Eventually, I anticipate needing a SQL database for ad-hoc and BI purposes.

Now for the stuff I'm unsure about:

Process Manager Issues

My initial thought on saving state of a PM is to just save the events that the process manager has seen. However, assuming I saved this in the event store, I would have to save to a separate event store instance, since they are copies of events that were already published by an aggregate stream. Or else, I would need to setup a partition to separate aggregate events from the PM events and have all listeners only listen for the aggregate partition. OR I could save state in a completely different way (different database). None of these options seem very appealing to me.

Then there is the issue of timeout logic (e.g. if payment for the order is not received within 24 hrs, cancel the order). My initial thought is that I will have the PM handler listen for a timeout message and call the appropriate method on the PM. This part is no problem, since LocalBus can deliver an arbitrary message inside the AppDomain. One (solvable) problem I haven't yet worked out is how to position the timer in the logic. And there is the issue of storing the timer's state, with the same implications as PM storage. This seems like a good case for storing current state, since there are only 2 possible states of the message (delivered or not), and time stamps can be recorded for both so nothing is lost. So then do I introduce another kind of database (downvote from admins), store to a file (downvote from developers), etc.?

I could hook into an external timer. But this is another configuration point (admin downvote). And I would have to host a comm stack of some kind on the PM service in order to receive a callback from the timer service, and secure it from other types of access. And then there's learning the ins and outs of the particular timer framework. Seems like overkill when the timer part would otherwise be pretty simple.

It appears that I am headed toward including some sort of embedded database with the PM service for PM and Timer state storage.

That's all I can think of for now.

23 April 2013

X3:TC X-TREME Trader

I am finding the X-treme Trading achievement to be the worst one so far. It's basically holding me up from starting on the Dead is Dead playthroughs. What follows are my spoilers for getting through that achievement.

Firstly, I tried stations + CAGs and some CLS routes. These are very profitable, and this got me up to Tycoon at a slow but reasonable clip. Then I hit Tycoon, and everything slowed down drastically. As in 5-6 hrs of game time (about a half hour total in SETA) to get 1%. This is far too slow. The big problem with CAGs is that increasing profitability means expanding your stations, which is a pain to me, and it causes you to drop out of SETA a lot. CLSs also require a bit of fiddling to get just right (for me). So UTs ended up coming to the rescue. Setup is a bit quicker and easier. Every once in a while, they turn stupid and stop working, but not too often.

The other key to this is to build, build, build... for the Yaki. Repair your rep with them to the point where you can take missions. Then go to Senator's Badlands or Weaver's Tempest and look for build missions, green plus icon. (In my game, there were no stations in Ocracoke's Storm by the time I went there.) Start doing a SETA cycle -- see previous post about using a timer and SETA to avoid rank loss -- with the local map open and keep an eye out for more build missions. If you have bad rep with any of the races, make sure you can buy the factory they are requesting. :) Do this continually as you train up UTs. (more on UTs below). Over time, you will end up building lots of stations in Yaki space... and you are the only one who can trade with them!

For training UTs (one at a time), I typically start the ship as a Local Trader in Ore Belt. This has been a pretty consistent training ground for me. It seems to get the trader up to UT-capable in 5 or so cycles. After the trader hits level 8 or higher, I'll send them to one of two places. If Yaki space is not very built up with stations, I send them to Power Circle. They will gradually level up and eventually won't return to Power Circle much. However, after Yaki space is populated with stations, I send the Level 8s to Empire's Edge. From there they are in range of the Yaki sectors and they train up to max quickly. With about 30 UTs, it's taking about 2 SETA cycles to get 1%. Up from 5 or 6 cycles with just the CAGs (for food and secondaries) and CLSs.

As for which ships to use for UTs, I have mainly been using Split Caiman SF XLs (10k cargo, 89 m/s). Mistrals are bigger, but are also a lot slower. In case I need to retask them to do something else, I prefer the caimans. I buy them 10 at a time, and send them through the outfitting gauntlet.
  1. Purchase Large versions from Zyarth's Dominion
    I have the hub connected to the neighboring system, so it's convenient to buy from there.
    I purchase the L version with shields already equipped if possible.
    Caution must be exercised because there are frequent Xenon attacks in Zyarth's
  2. Send to Terracorp HQ in Home of Light for Jump drive (and all other upgrades)
  3. One at a time, set all ships to Autojump: Yes, Minimum jumps: 0, Refuel amount: 50 jumps
  4. Send them to the nearest SPP or one of my factories to get fueled
  5. Send them to an equipment dock, and get all upgrades (Split if you want ALL the upgrades, including turbo, carrier command, spacefly collector, but not needed)
  6. Send them to OTAS HQ in Legend's Home for Docking Computer and Triplex
  7. Park them in a protected system until they are needed
Last time, I bought 80 of them and did this all in a row. Since 80 is too many to put in a station, I parked them in space in my home system. Then I would send them 5 at a time to Ore Belt so I always had some in place to start the Local Trader command when the trainee graduated to UT.

This takes a while, and it would be nice to find faster ways to train UTs. But it works, and the UTs + Yaki station building has drastically upped my trade rank earnings. I don't claim this is the best way to do it, but it's working for me.

09 April 2013

Searching for the perfect small business server

One of the big challenges for small businesses, especially service-based organizations, is a server infrastructure that is both resilient to failures and inexpensive. Small businesses can't typically afford to shell out the cash for a SAN and high-availability servers. Yet they still need their servers to operate with a high level of reliability.

This post will attempt to describe one solution that I have been designing, and why each choice was made.

Base Computer: Mac Mini (quad-core)
System Drive: External storage in Hardware RAID-1
Backup Drives: Internal Hard Drive, 2x USB Drives (swapped each day, one taken home)
Virtual Machine Software: ???
Virtual Server: Windows Server

So what does this complicated setup gain me?

Mac server + System Drive on external storage
In my testing months ago, it was possible to install OS X onto an external drive and boot from it. Then, I could actually power down the computer, take the external drive and plug it into another (different!) Mac and boot that from the external drive. The original system came up on the different hardware like nothing happened.

This is a great hardware failure recovery story. Say the power supply burns out in your server... Just grab any other (Intel-based) Mac, plug in the external storage, hold Option while booting, select the external storage as the boot disk, and the server is back up! No sophisticated expertise required.

This is just not possible in Windows or Linux.

The RAID inclusion is to address the fact that hard drives fail pretty often compared to other parts of the system. With a mirror, you can lose a drive without taking the server down. Otherwise, this common failure would result in a "restore from backup" situation. Hardware RAID-1, 2-bay enclosures are not all that expensive (~$200) compared to the cost of unexpected downtime during critical business hours.

Backup drives
Backing up using the built-in Time Machine software. The simple reason for using the internal hard drive as a backup is because it's already there anyway. The 2 USB drive setup allows you to swap out the backup each day so that you can take a backup offsite after hours. Technically, both backups are actually onsite most of the time: one plugged in, and one in your car or on your desk so you can remember to swap it. So if it really bothers you that your backups are onsite most of the time, then you can even go to 3 USB backup drives. You can never have too many, really.

Virtual Server
Let's face it; OS X Server has had some really mixed reviews. On top of that, OS X server might not support the apps that you typically run (e.g. ASP.NET). So why choose? Use Virtual Machine software to run the server you need on top of OS X.

To make this setup work, all data files should actually be shared from the host OS (Mac OS X) so they are backed up by the host OS, and are not internal to the VM. Then the VM simply uses the shared folder from the host OS for server functions (file sharing, web serving, database backups). So the data is automatically backed up by time machine, and not a backup within a VM backup situation.

What about the VM itself? Since the data is hosted external to the VM, the VM ends up just being valuable for its server configuration. Since the VM file will change very often while running, it should be excluded from backup. (Copying a multi-gigabyte file every hour will eat up your Time Machine backup space quickly.) When there is a configuration change, the VM should be copied (probably offline) to a folder that is backed up, so the server configuration is preserved.

The complete restore process ends up being: Restore Time Machine backup to new Mac. Copy VM from backup location to correct location so it can be run. Done.

Another advantage to having the server run in a VM is that you have remote administration capabilities (through the host OS) that ordinarily cost a lot of money to implement with real servers. The main things I'm thinking of there are remote power on/off, booting in a recovery mode, inserting CDs (by connecting ISO images as drives), etc. These things are extremely convenient to be able to do remotely.

VM Software
So the main reason this is still a work in progress is that I have not yet picked out VM software to make this work. My primary low/no-cost candidates are VMware Fusion and VirtualBox, both of which can be run headless and scripted to start on computer startup. But there are still wrinkles to iron out. For instance, I'm not sure if folders that come from the host OS can be reshared as Windows file shares. I'll probably have to adjust the design based on experimentation.

Conclusion
This is a work in progress. I still have things to figure out. However, I believe this kind of setup would create a really compelling story for small businesses that require low and remote maintenance with a relatively inexpensive server for its level of resilience (including data backups).