Kasey's Blog: September 2012

22 September 2012

Simplifying Message Handlers

One thing I don't like about the message handler examples I've seen is all the interfaces that you have to implement. For instance:

public class CustomerHandlers :
    IHandles<ConvertLeadToCustomer>,
    IHandles<CustomerCreditLine>,
    IHandles<CorrectCustomerAddress>
    .... // lots of these
{
    public void Handle(ConvertLeadToCustomer message)
    {
        ...
    }

    ... // lots of these also, but they actually do stuff
}

These interfaces help you match up messages with the handler method and give you something to cast the handler to in order to call the appropriate method. Ultimately it ends up like this:

((IHandles<T>)handler).Handle((T)message);

The first alternative I discovered was to use dynamic. I didn't have to implement all the interfaces, maybe just one interface on the parent class, and let the methods document for themselves the messages they handle. Assuming I know the right method exists on the handler (due to reflection), I can let the DLR figure out how to actually call it:

((dynamic)handler).Handle((dynamic)message);

Note that 3 calls to the DLR are actually made. One for the message, one for the handler, and one for the method call. This works, but you run into problems if you are lazy like me and have some event handlers that handle all events. For that I use a shortcut syntax.

public class DatabaseDenormalizer : IHandles<IEvent>
{
    public void Handle(IEvent message)
    {
        // get the message's actual type name
        // call a stored procedure with the same name (if it exists)
        // using the event's properties as parameters
    }
}

In that case, dynamic wouldn't work if you had both a generic handler method and a specific one. The generic one would never get called, because the DLR always goes for the most specific call. Also, dynamic has a bit of overhead as compared to the direct method call (but not anywhere near the slowness of a MethodInfo.Invoke() call).

Edit: Correction, MethodInfo.Invoke is only "slow" in simple tests. When the call actually does some work (and is in Release mode), Invoke can be just as fast as a direct method call.

So being the crazy person that I am, I kept looking for an alternative where I could minimally decorate my message handlers, but still have decent performance. I want them to look like this:

public class CustomerHandler : IMessageHandler
{
    public void Handle(ConvertLeadToCustomer message)
    {
        ...
    }

    public void Handle(RequestCustomerCreditLine message)
    {
        ...
    }
}

After googling around for information many times before, I finally hit the magic combination of words today, which brought me to this post from 4 years ago by Jon Skeet. The last code snippet (with some tweaking) pretty much solved my conundrum. It's admittedly pretty complex code, but I'm willing to accept that for improved performance and easier setup on my objects. Plus, the complexity is in code that I will likely never touch again.

19 September 2012

Asynchronous Programming

I'm always forgetting the link to this resource on multi-threaded programming, so I thought I would post it here. It's really a great resource if you're getting started with multi-threaded programming in .NET or you just want to brush up on the nuances.

http://www.albahari.com/threading/

18 September 2012

My Next Software Architecture

I've been thinking more about how to architect my next web-based software project. I'll be honest with you, I'm not a pro at this yet, but I'm trying to figure it out and get some experience. So I'm going to bounce some ideas off of you, Internet, as well as work out some of my thought processes. Here are some choices that I have in mind. Note that these things have been around a while, and I've played with them a little, but they're fairly new to me.

Profile and Strategy

Non-distributed

Pretty much all of the web apps I write manage a business's internal workings and are not distributed (in the large-scale sense), or are at most distributed to a few satellite locations. (This has historically been handled by file replication, database replication, and private networks between locations.) Therefore, I'm not going to add extra trappings that require a lot of configuration or overhead. For example, I won't be using a durable message queue. The core of the system will be running in-memory, and the components will mostly communicate in-memory.

CQRS

Earlier in my career, I thought it was a good idea to directly use business entities for UI views. That ends up leading to your domain objects being bloated with some UI-only concerns and vice versa. The common alternative is to create different view models as projections of your live business entities. But in order to do that, you have to load your business objects and map them to view objects. So this ends up with a lot of mapping code maintenance, and the mappings can get complicated.

The CQRS strategy keeps two sets of data updated: the domain (business) object data and the view model data. The benefit here is that each can evolve at its own pace without greatly affecting the other. It's also a bit faster because it is no longer necessary to load the domain object first -- you just load the data straight from database to client. The downside is that you have to update 2 (or more) sets of data. But overall, it eases the complexity of trying to use one set of objects for 2 distinct purposes.
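
As a rough sketch of the two-sets-of-data idea: the command path changes the domain object and refreshes the read model, and the query path only ever touches the read model. The store names, the customer shape, and the function names are all made up for illustration, and plain in-memory objects stand in for real storage.

```javascript
// Write side: business objects keyed by id (in-memory stand-in for real storage).
var customers = {
  7: {
    id: 7,
    name: 'Acme Ltd',
    address: { street: '1 Main St', city: 'Springfield' },
    creditLimit: 5000
  }
};

// Read side: flat rows shaped for one screen, also keyed by id.
var customerListRows = {
  7: { id: 7, label: 'Acme Ltd (Springfield)' }
};

// Command path: change the domain object, then refresh the affected read model.
function correctCustomerAddress(id, address) {
  var customer = customers[id];
  if (!customer) {
    throw new Error('Unknown customer ' + id);
  }
  customer.address = address;
  customerListRows[id] = {
    id: id,
    label: customer.name + ' (' + address.city + ')'
  };
}

// Query path: return the prepared row without loading or mapping the domain object.
function getCustomerListRow(id) {
  return customerListRows[id];
}

correctCustomerAddress(7, { street: '9 Oak Ave', city: 'Shelbyville' });
```

The cost mentioned above shows up in correctCustomerAddress: every command has to know which read models it affects.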

DDD

My problem domains tend to be, on average, moderately complex because I'm representing internal processes of a business. DDD is meant to address complexity, but it's more about behaviors and communication patterns than code patterns. Probably the one code pattern to take away is to structure the domain objects in the same way and with the same names that customers use to describe their processes. That typically means insulating the domain objects from view and persistence concerns and letting them focus on the business. If you are not also doing CQRS, you end up with the mapping code issue.

Messaging

Using messaging brings additional overhead to the project (as compared to direct method calls on domain objects). But it decouples the clients from the domain and generally just brings options to the table. In my case, I'm leaning more towards a completely HTML/Javascript-based UI, so direct method calls into .NET are not an option anyway. I could write MVC actions which are coupled to domain methods, but in that case it's about as much overhead to implement as messaging (just a different kind of overhead), and the coupling still causes ripple effects and interface maintenance.

Specific Tactics/Tech

Event Sourcing

This tactic lets you represent your domain objects in persistent storage as a series of events, which are basically just classes with the appropriate contextual information. Examples: CustomerCreatedEvent, CustomerAddressCorrectedEvent, CustomerCreditLineRequestedEvent, etc. You can imagine all of these events having a customer id, and the address event also having properties for the address information.

This has a number of advantages, including traceability, replayability, and an easy, performant persistence story. The main disadvantage I see to this is that some of the messaging concerns (events, specifically) end up leaking into your domain logic. However, anything message-related done by your domain objects is usually very simple (plain assignment statements).
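
To make the replay idea concrete, here is a minimal sketch that rebuilds a customer by applying its stored events in order. The event type names come from the examples above; the field names (customerId, name, address, amount), the appliers table, and the replay function are assumptions for illustration only.

```javascript
// One small function per event type; each is just plain assignments.
var appliers = {
  CustomerCreatedEvent: function (state, e) {
    state.id = e.customerId;
    state.name = e.name;
  },
  CustomerAddressCorrectedEvent: function (state, e) {
    state.address = e.address;
  },
  CustomerCreditLineRequestedEvent: function (state, e) {
    state.requestedCreditLine = e.amount;
  }
};

// Rebuild the current state by applying every stored event in order.
function replay(events) {
  var state = {};
  for (var i = 0; i < events.length; i++) {
    var apply = appliers[events[i].type];
    if (apply) {
      apply(state, events[i]); // event types without an applier are skipped
    }
  }
  return state;
}

// The persisted form is just a flat, ordered list of events.
var eventLog = [
  { type: 'CustomerCreatedEvent', customerId: 42, name: 'Acme Ltd' },
  { type: 'CustomerAddressCorrectedEvent', customerId: 42, address: '1 Main St' },
  { type: 'CustomerCreditLineRequestedEvent', customerId: 42, amount: 5000 }
];

var customer = replay(eventLog);
```

Replaying only the first two entries of eventLog would give the customer as it looked before the credit line request, which is where the traceability comes from.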

But event sourcing also opens the possibility of using non-relational databases. For me, it is usually the case that the domain deals with relationships, and view models are relatively flat. When the domain is event sourced, the persistence format is flat. This opens up the doors to alternative databases (such as NoSQL) which tend to be faster and easier to work with. Which leads me to my next point...

NoSQL Database

A traditional SQL database provides a lot of capability for reporting. However, it's a real pain to work with for application data. Most of us don't think about it because it's become second nature. Running a basic SQL statement requires you to 1) have a magic string somewhere with the query or stored procedure to run, 2) wrangle objects like SqlDataAdapter, SqlConnection, etc., 3) map parameters into the query, and 4) try to run the query and interpret the results. (For select queries, there's the additional pain of mapping/casting the DataSet back to an object.) This is painful enough that most of us create abstractions around this process. The first evolution is a DAL that maps method calls to SQL statements. Later evolutions end up being a repository and/or ORM. An ORM requires that you stay abreast of the ORM-specific extensions and code around its design. A repository (or even a simple DAL) requires manual coding and maintenance. In the end, no matter what you do, dealing with SQL is a lot of work. Contrast that to the potential ease of using NoSQL databases, which could take your object as-is and store it straight to the database (e.g. db.Store(message);). That's with no custom-built abstractions, no magic string SQL statements, and no extra ORM framework to learn. That's one compelling persistence story. Even if you need a SQL database for reporting, this can just be an additional integration point as though it were another read model to update (from the CQRS story). :)

Additionally, some NoSQL databases have REST APIs, which means I wouldn't even have to implement a read-layer for the HTML5 UI. The only part that bothers me about the REST API is security. And I haven't yet researched my options there.

WebSockets
The websockets feature is one of the most exciting web technologies to come out in a while. It allows the server and client (e.g. browser) to push messages to each other in a low-impact way. Previously, I wasn't using any external bus (e.g. MSMQ), because the overhead and administration needed to use it wasn't worth it. But websockets are rather simple to get running and are still a developer concern (as opposed to an administrative concern like MSMQ). I want to use websocket connections to serve as my program's contact with the outside world. It's not a durable bus, but since my program is in-memory and not distributed, that really doesn't matter.

One other interesting point is that web sockets provide better asynchronous capabilities. If I send an AJAX request from JavaScript, to an MVC Action, both the AJAX request (on a separate browser thread) and the MVC Action are kept open and waiting until the program finishes the action. Using websockets, I have the capability to take the command, hand it off to a queue, then have a callback send a message back to that client notifying them of completion. I can also have an event websocket to allow external listeners.
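
Here is a minimal sketch of that take-the-command, queue-it, notify-later flow. Everything in it is hypothetical: CommandQueue, its accept and processNext methods, and the message shapes are invented for illustration, and the client is any object with a send(text) method (a stand-in for a real websocket connection).

```javascript
// In-memory command queue: accept() returns right away, and the client
// hears about the outcome only when the command is actually processed.
function CommandQueue(handler) {
  this.handler = handler; // function (command) -> result
  this.pending = [];
}

// Queue the command and acknowledge receipt immediately.
CommandQueue.prototype.accept = function (client, command) {
  this.pending.push({ client: client, command: command });
  client.send(JSON.stringify({ status: 'accepted', id: command.id }));
};

// Process the oldest queued command and notify the client that sent it.
// Returns false when there was nothing to process.
CommandQueue.prototype.processNext = function () {
  var item = this.pending.shift();
  if (!item) {
    return false;
  }
  var reply;
  try {
    var result = this.handler(item.command);
    reply = { status: 'completed', id: item.command.id, result: result };
  } catch (err) {
    reply = { status: 'failed', id: item.command.id, error: String(err) };
  }
  item.client.send(JSON.stringify(reply));
  return true;
};

// Usage with a fake client that records what it was sent.
var sent = [];
var fakeClient = {
  send: function (text) { sent.push(JSON.parse(text)); }
};
var queue = new CommandQueue(function (command) {
  return 'handled ' + command.name;
});

queue.accept(fakeClient, { id: 1, name: 'ConvertLeadToCustomer' });
queue.processNext();
```

In a real program something else (a worker loop or a timer) would call processNext; the point is that nothing stays blocked between the "accepted" and "completed" messages.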

The downside to WebSockets is that the feature is not widely supported yet. To host WebSockets natively in IIS, you currently have to have Windows 8 or Windows Server 2012. As for clients, WebSockets is not supported on most browser versions aside from the current crop.

HTML5/Javascript UI w/ MVVM
HTML5/Javascript is pretty much the direction that the web has taken. I ruled out both WebForms and MVC (the design, not the project type) for the UI due to both performance and knowledge dependencies. Both approaches (traditionally) post back to the server, let the server make some UI decisions, and then send the decision (or command) on to the business layer (or domain). It's basically an extra hop (to use a network term) as compared to a purely in-browser UI. And as we all know, external communication is often the most expensive part of a given operation.

But performance alone is not enough, and in fact using a solely HTML5/Javascript UI is only possible due to some nice Javascript frameworks. My personal choice in the matter is jQuery and Kendo UI (which has controls, MVVM, data sources, etc.). With MVVM, you can completely separate the view from the model, which makes working with the code a lot easier. I end up with the following for each view: view (.html), style (.css), view helper (.js, for any extra control-related functions the view might need), and view model (.js). Then for more complex scenarios, I add more view models and/or script files which listen for view model changes and react accordingly. It's all pretty fast.

This style of development does take some getting-used-to compared to server-side UI development. But one of the reasons I like Kendo UI is because there are a lot of functions that are pre-integrated as compared to taking separate libraries for UI controls, MVVM, validation, etc. and trying to integrate them together.

16 September 2012

Getting .NET 4.5 WebSockets Working

I just spent most of the afternoon trying to get WebSockets working with Visual Studio 2012 and an MVC project. Most of the examples seem outdated (from old RC builds) or too low level, and the few updated ones weren't complete. So, here's my little guide on how to get started with a simple echo/chat program. It just takes 2 classes and a test page.

NOTE: WebSockets on IIS only works on Windows 8 or Windows Server 2012. Windows 7 does not have the necessary websocket DLL that is needed by IIS. :( That wrinkle aside, WebSockets will work with IIS 8 Regular or Express editions. This demo used IIS Express.

First, you must create a new Web API project. In this example, the project is named (with great flair and creativity) "Project1". Then, you need to add the Microsoft.WebSockets package from NuGet. (Right-click the project, click Manage NuGet Packages..., type microsoft.websockets in the search field in the upper right, select the Microsoft.WebSockets package, and click Install.)

Next, create your WebSocket service class, and put it somewhere in the project (mine is in /Controllers):

using Microsoft.Web.WebSockets;
using System;

namespace Project1.Controllers
{
    public class ChatClient : WebSocketHandler
    {
        public readonly Guid ConnectionId = Guid.NewGuid();

        // Shared by all connections so that every client sees every message.
        private static WebSocketCollection chatClients =
            new WebSocketCollection();

        public override void OnOpen()
        {
            chatClients.Add(this);
            chatClients.Broadcast(
                "Client joined: " + ConnectionId.ToString()
            );
        }

        public override void OnClose()
        {
            chatClients.Broadcast(
                "Client left: " + ConnectionId.ToString()
            );
            chatClients.Remove(this);
        }

        public override void OnMessage(string message)
        {
            chatClients.Broadcast(
                ConnectionId.ToString() + " said: " + message
            );
        }
    }
}

Note I'm using Guid as a connection id, and the messages end up looking pretty ugly: 0195093f-70a5-4bfe-b707-8ac96ba94c31 said: test. But you can change that for your own needs.

The next step is to set up an ApiController. This is necessary to upgrade the HTTP request to a WebSocket request.

using Microsoft.Web.WebSockets;
using System.Net;
using System.Net.Http;
using System.Web;
using System.Web.Http;

namespace Project1.Controllers
{
    public class WebSocketController : ApiController
    {
        public HttpResponseMessage Get()
        {
            // Hand the connection over to a new ChatClient handler.
            HttpContext.Current.AcceptWebSocketRequest(
                new ChatClient()
            );
            return new HttpResponseMessage(
                HttpStatusCode.SwitchingProtocols
            );
        }
    }
}

As noted in another example I found, the first using statement is VERY IMPORTANT. It adds the AcceptWebSocketRequest overload that is needed for this code. The other overloads are lower level than I wanted.

That's it! But wait you say, how can I test it? Ok, here ya go. This is a simple html page I created to test the application. It doesn't use any external files (not even jQuery). You can replace the contents of Views/Home/Index.cshtml in the project with this:

@{
    Layout = null;
}

<!DOCTYPE html>

<html>
<head>
    <meta name="viewport" content="width=device-width" />
    <title>Index</title>
    <script type="text/javascript">
        var connectButton,
            disconnectButton,
            messageInput,
            sendButton,
            responseDiv,
            uriSpan,
            uri,
            webSocket;

        var connect = function () {
            connectButton.disabled = true;
            disconnectButton.disabled = false;
            sendButton.disabled = false;
            webSocket = new WebSocket(uri);
            webSocket.onmessage = function (e) {
                responseDiv.innerHTML +=
                    '<div>' + e.data + '</div>';
            };
            // onopen fires once the connection is established
            webSocket.onopen = function (e) {
                responseDiv.innerHTML +=
                    '<div>Connected.</div>';
            };
            webSocket.onclose = function (e) {
                responseDiv.innerHTML +=
                    '<div>Disconnected.</div>';
            };
            webSocket.onerror = function (e) {
                responseDiv.innerHTML += '<div>Error</div>';
            };
        };

        var disconnect = function () {
            connectButton.disabled = false;
            disconnectButton.disabled = true;
            sendButton.disabled = true;
            webSocket.close();
        };

        var sendMessage = function () {
            var message = messageInput.value;
            webSocket.send(message);
            messageInput.value = '';
        };

        var setup = function () {
            connectButton = document.getElementById('connect');
            disconnectButton =
                document.getElementById('disconnect');
            messageInput = document.getElementById('message');
            responseDiv = document.getElementById('responseLog');
            sendButton = document.getElementById('sendMessage');
            uriSpan = document.getElementById('uri');
            uri = 'ws://localhost:52618/api/websocket';
            uriSpan.innerHTML = uri;
        };
    </script>
</head>
<body onload="setup()" style="font-family: sans-serif;">
    <div>
        <div>
            <span id="uri"></span>
            <button id="connect" onclick="connect()">
                Connect
            </button>
            <button id="disconnect"
                    disabled="disabled"
                    onclick="disconnect()">Disconnect</button>
        </div>
        <label for="message">Message</label>
        <input id="message"/>
        <button id="sendMessage"
                onclick="sendMessage()"
                disabled="disabled">Send</button>
        <hr />
        <label for="responseLog">Response</label>
        <div id="responseLog"
             style="border: 1px solid grey;
                    width: 600px; height: 400px;
                    overflow: auto;
                    font-family: monospace;">
        </div>
    </div>
</body>
</html>

NOTE: Change the uri value to match the port your project uses. Otherwise it should work as is.
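
If you would rather not edit the port by hand, one option is to build the URI from the page's own address. This is only a sketch: webSocketUri is a made-up helper name, and it assumes the test page is served by the same site and port as the /api/websocket route.

```javascript
// Build the websocket URI from an object shaped like window.location,
// so the host and port follow wherever the page itself was served from.
function webSocketUri(loc, path) {
  // Secure pages need wss:, plain http pages use ws:.
  var scheme = loc.protocol === 'https:' ? 'wss:' : 'ws:';
  // loc.host already includes the port when there is one.
  return scheme + '//' + loc.host + path;
}

// In the test page's setup function this would replace the hard-coded value:
//   uri = webSocketUri(window.location, '/api/websocket');
```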

And here was my test run in Chrome 21 and IE 10.

15 September 2012

Choosing a Database

Choosing databases used to be pretty straightforward. The answer was always a SQL-based relational database. Nowadays the landscape is different.

I decided to use a non-relational database (referred to as NoSQL) for my next project. Why? Because I don't need relational features. The plan is to use denormalized read models to service UI views and use event sourcing for my domain objects. Neither of these requires relations at the database level. Further, with a document database, I could even nest some relationship data that might be required for the view.

So which document database to choose? Firstly, my project will use HTML/JavaScript as my main UI, so a REST API into the database will save a lot of code. A document DB with a REST API pretty much narrows it down to CouchDB or RavenDB. What follows is my decision process for choosing between them for a .NET web-based application.

Note: Please correct me if you see any incorrect information!

Licensing

-CouchDB is free and uses the Apache 2.0 license, which is not all that restrictive and can be used in commercial software.

-Raven is dual-licensed with the AGPL (which is the nastiest viral license out there) or a commercial license (which you pay for).

Since my next project has no guaranteed revenue, but I want to retain the right to sell it commercially, paying for a database wasn't ideal when a free alternative exists. I almost settled on Couch right here.

Note that Raven's AGPL license has an added exception that allows you to use a different OSI-approved open source license for your project that uses Raven and not have to convert to AGPL. But it doesn't free the users of your project from the AGPL's constraints.

Clients

I need a .NET client, primarily so the server can update read models.

-Raven has a built-in .NET client with support for LINQ queries, performing updates, and defining indexes in a .NET language. (And a lot more things that I put in the Developer Capabilities section.)

-CouchDB has only the REST API. There are many user-contributed libraries to wrap the REST API, but none have the depth and capability of Raven's client. Non-trivial database instructions (like updates) must be written in Javascript, which is a bit awkward from a .NET language.

-Both databases have Web-based administration tools.

Developer Capabilities

-RavenDB has lots of advanced options: loading related documents (almost like a relational database), full-text queries, partial updates, set-based deletes and updates, transactions, spatial queries, ... the list goes on.

-CouchDB has more capability as an application server, with design documents, validation functions, and show functions. But I don't need this, and I don't like this mix of concerns. Also, CouchDB is missing an important capability for my project: partial updates.

Deployment Stories

-RavenDB can be run as easily as referencing the DLL in my project and newing up the store (embedded mode), or can run as a Windows service, or an IIS application. RavenDB server is tied to the Windows platform (although the client can run on most anything through Mono). This would be a downside for some, but this is the platform I'm targeting.

-CouchDB is supported across many platforms, but it has basically one deployment story: run as a service (or daemon), with or without console interaction.

Distributed Stories

-RavenDB has replication and sharding support. Sharding is decided on the client, though.

-CouchDB has replication, load balancing, and clustering. The latter two are more for data-center scenarios, whereas Raven's sharding is capable of dealing with data locality (by geographic region, for instance) in addition to just hashing by key for data-center scenarios. Load balancing and clustering, however, are decided by the server.

Conclusions

Based on my deciding factors, RavenDB is clearly the winner. Targeting the .NET platform, RavenDB brings much more capability to bear than CouchDB. It is also much easier to integrate with .NET code. Sharding support brings better scalability than replication for geographically distributed applications. The only downside is there is a cost for commercial use. However, that cost is well justified considering what it brings to the table, and the amount of integration time it saves with .NET. It's also still available for free to open source projects.

10 September 2012

IE8 Javascript Performance

Today I was faced with this little peach of a problem in IE8. Horrid performance, as demonstrated by the following IE8 Javascript Profiler shot.

There weren't any noticeable performance problems in IE9 or Chrome, so I figured this was just IE8's slow-boat Javascript performance. I started looking around for problems with jQuery.removeClass() in IE8, but didn't find that much.

Upon further examination, the problem was actually a couple of unintentionally recursive calls. I am using Kendo UI MVVM, using one view model for the entire page, but I also have some fields whose members are calculated by script when the model changes. (I have my reasons!) So, whenever those properties would get updated, the change event on the view model would fire, and the properties would get calculated again, which would then trigger a change event, etc. Since the code was doing deltas on the calculated object, eventually no changes would be made, and the recursion would end. Further testing revealed that the code was getting recursively called about 75 times and 7 seconds too many.

My first thought for fixing this was to place conditionals on the change event:

viewModel.bind("change", function(e) {
    if (e.field != "someProperty" && ...) {
        // recalculate
    }
});

But I realized that future-me is unhappy with this solution. This would be a pain to (remember to) update when I added features to this page. So instead, I split the view models. There's one for user-updated fields, and another for manually calculated fields. The calculated fields trigger a recalculate only when the user-updated view model is changed. This nicely got me back down to acceptable performance.
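
The shape of the split looks roughly like this. To keep the sketch self-contained it uses a tiny hand-rolled Observable rather than Kendo's view model, so the onChange/get/set methods here are stand-ins and not Kendo's API, and the quantity/unitPrice/total fields are invented.

```javascript
// Tiny stand-in for an observable view model (not Kendo's implementation).
function Observable(values) {
  this.values = values;
  this.listeners = [];
}
Observable.prototype.onChange = function (listener) {
  this.listeners.push(listener);
};
Observable.prototype.get = function (field) {
  return this.values[field];
};
Observable.prototype.set = function (field, value) {
  this.values[field] = value;
  for (var i = 0; i < this.listeners.length; i++) {
    this.listeners[i]({ field: field });
  }
};

// One model for fields the user edits, another for fields the script calculates.
var inputs = new Observable({ quantity: 2, unitPrice: 5 });
var calculated = new Observable({ total: 0 });
var recalcCount = 0;

// Recalculation listens to the user-updated model only. Writing the result
// into `calculated` raises no change event on `inputs`, so nothing loops.
inputs.onChange(function () {
  recalcCount++;
  calculated.set('total', inputs.get('quantity') * inputs.get('unitPrice'));
});

inputs.set('quantity', 3);
```

With a single shared model, the set call inside the listener would fire the same listener again; here one user edit means exactly one recalculation.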

IE8 apparently still hates removeClass, but it's a lot better than before. Note that the same (fixed) code in IE9 runs all removeClass calls combined in 1ms, and the biggest hog only takes 28ms, and the next-largest is 9ms.

Morals of the story:

-Users should stop using IE (a moral of every web dev story).
-I should sanity-check my JavaScript with a profiler.