29 October 2014

Video encoding

I was reading about VP8 and H.264 because my application will have to handle video, and I wanted to be familiar with what's going on. As part of that, I considered a method to "encode" a video in a way that is highly parallel, maybe low on space, and with little-to-no loss. Being just a simple man from the South, I'm sure someone a lot smarter than me has already thought of this, but here are my ideas.

Moving pictures

In simplest terms (at least in my limited knowledge of the video industry), raw video is an ordered sequence of pictures (frames). So each frame is a 2-dimensional grid of pixels. From video gaming, at least 60 frames per second is considered roughly optimal to make the human eye unable to perceive frame changes (it just looks like motion). 30 fps and up actually works for this, but an occasional frame change is still observable. At 1080p (1920x1080 resolution) there are 2,073,600 pixels in each frame if you look at it as an individual picture. A pixel's color is represented by a combination of color values. From the web, I know that one way to represent RGB colors is by using 6 hexadecimal digits; 2 for red, 2 for green, and 2 for blue. 6 hex digits is exactly 3 bytes. So that adds up to about 5.9MB per frame to represent all the pixels. At 60 frames per second, that's roughly 355MB per second of space. A 2 hr movie would be around 2.4TB in raw form, not counting audio, delimiters, and format info.
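
To sanity-check that arithmetic, here is a quick back-of-the-envelope script (using binary megabytes, and ignoring audio and container overhead):

```python
# Back-of-the-envelope storage math for raw 1080p video at 3 bytes per pixel.
WIDTH, HEIGHT = 1920, 1080
BYTES_PER_PIXEL = 3           # 6 hex digits = 24-bit RGB
FPS = 60
MOVIE_SECONDS = 2 * 60 * 60   # a 2 hr movie

pixels_per_frame = WIDTH * HEIGHT                      # 2,073,600
bytes_per_frame = pixels_per_frame * BYTES_PER_PIXEL
bytes_per_second = bytes_per_frame * FPS
movie_bytes = bytes_per_second * MOVIE_SECONDS

MiB, TiB = 1024 ** 2, 1024 ** 4
print(f"per frame:  {bytes_per_frame / MiB:.1f} MB")   # ~5.9
print(f"per second: {bytes_per_second / MiB:.0f} MB")  # ~356
print(f"2 hr movie: {movie_bytes / TiB:.1f} TB")       # ~2.4
```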

Maths

If you look at an RGB color as a vector, then each frame is a vector field. Though not quite, because RGB coordinates would always be integers. It'd be more like a vector ring, but treating the values as real numbers would probably be sufficient. So anyway, if we extend that further, we could theoretically find a formula to exactly fit the changes of one pixel over time. (A small part of me wishes I had not dropped Vector Analysis in college.)
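
As a toy illustration of what that per-pixel fitting could look like, here is a minimal Python sketch. The pixel's red channel is synthetic here, standing in for values read from decoded frames, and a plain least-squares polynomial stands in for whatever family of formulas an actual encoder would use:

```python
import numpy as np

# Treat one pixel's red channel as a function of the frame index and fit it
# with a least-squares polynomial.
t = np.arange(300)                        # 300 frames (~5 seconds at 60 fps)
rng = np.random.default_rng(0)
red = 128 + 60 * np.sin(t / 40) + rng.normal(0, 2, t.size)   # fake pixel data

coeffs = np.polyfit(t, red, deg=5)        # the degree is a tunable knob
approx = np.polyval(coeffs, t)

print("stored:", coeffs.size, "coefficients instead of", t.size, "samples")
print("max abs error:", np.max(np.abs(approx - red)))
```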

With the way most movies are actually edited, a single function would probably be pretty hairy, take a long time to generate, and take a lot of calculation to get the value for each pixel. But within a single shot, the function would probably look pretty smooth and have a fairly easy formula. Some exceptions are likely (like a night scene with gunshots, causing wide swings of color and intensity). So on one end of the spectrum you have one monolithic formula per pixel, and on the other end a piecewise function, with a formula for each shot.
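
A sketch of the piecewise version, assuming the shot boundaries are already known (say, from a scene-cut detector) and again using plain polynomials as the stand-in formula family:

```python
import numpy as np

def fit_per_shot(t, values, cut_frames, deg=3):
    """Fit one low-degree polynomial per shot; returns (start, end, coeffs) pieces."""
    pieces = []
    bounds = [0, *cut_frames, len(t)]
    for start, end in zip(bounds[:-1], bounds[1:]):
        local_t = t[start:end] - start        # local time keeps the fit well-behaved
        coeffs = np.polyfit(local_t, values[start:end], deg)
        pieces.append((start, end, coeffs))
    return pieces

def evaluate(pieces, frame):
    """Decode one pixel value at a given frame index from the stored pieces."""
    for start, end, coeffs in pieces:
        if start <= frame < end:
            return np.polyval(coeffs, frame - start)
    raise ValueError("frame outside fitted range")

t = np.arange(600)
# Two fake "shots": a slow brightening, then a hard cut to a dark, flat scene.
values = np.where(t < 350, 50 + 0.3 * t, 20.0)
pieces = fit_per_shot(t, values, cut_frames=[350])
print(evaluate(pieces, 100), evaluate(pieces, 400))
```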

Formula-fitting each pixel is an inherently parallel process, with each pixel considered separately. Optimization can occur afterwards. Video cards could be used for this (for both encoding and decoding), since their computational power is all about parallelization.
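
Here's a small sketch of that parallelism using worker processes on a tiny synthetic frame stack; a real encoder would presumably push the same per-pixel fits onto the GPU instead:

```python
import numpy as np
from multiprocessing import Pool

FRAMES, H, W = 120, 16, 16
rng = np.random.default_rng(0)
video = rng.integers(0, 256, size=(FRAMES, H, W), dtype=np.uint8)  # one channel
t = np.arange(FRAMES)

def fit_pixel(series):
    """Each pixel's time series is fitted independently of every other pixel."""
    return np.polyfit(t, series.astype(float), 4)

if __name__ == "__main__":
    series_per_pixel = video.reshape(FRAMES, -1).T    # shape (H * W, FRAMES)
    with Pool() as pool:
        coeffs = pool.map(fit_pixel, series_per_pixel)
    print(len(coeffs), "pixels fitted,", coeffs[0].size, "coefficients each")
```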

Considering there are over 2 million pixels at 1080p, would this method actually save any space? A movie like The Matrix with over 2300 shots and using an accurate piecewise function could be 10x larger than the original (my rough guesstimate based on a 50 ASCII character formula per shot per pixel). However, there are some "nerd knobs" that can be tweaked to make it theoretically small enough to be consumable. For one, you can always increase the margin of error on the function, which will generally make the functions smaller and simpler while sacrificing accuracy. There are also optimizations you could make, like sharing a particular formula across multiple pixels. Consider a night shot where many of the pixels will be the same shade of grey. Or consider that many pixels for a given scene will share a similar intensity, but will just be color shifted from one another. Or vice versa, same color but different intensities (black and white film, for instance). By identifying those formulas which are simple transformations of others, there is an opportunity to conserve space. Although the sheer number of addressable pixels creates a lot of overhead for such optimizations. This volume of data is as much an exercise in organizational efficiency as anything.
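
As a sketch of the error-margin knob (again with plain polynomials standing in for the real formula family), you could simply accept the lowest degree whose worst-case error stays under a tolerance:

```python
import numpy as np

def smallest_acceptable_fit(t, values, tolerance, max_deg=12):
    """Return the lowest-degree fit within tolerance (falls back to max_deg)."""
    x = 2 * (t - t.min()) / (t.max() - t.min()) - 1   # normalize time for conditioning
    best = None
    for deg in range(max_deg + 1):
        coeffs = np.polyfit(x, values, deg)
        err = np.max(np.abs(np.polyval(coeffs, x) - values))
        best = (deg, coeffs, err)
        if err <= tolerance:
            break
    return best

t = np.arange(300)
signal = 100 + 40 * np.sin(t / 30)        # one pixel's channel over a shot
for tol in (20, 5, 1):                    # looser tolerance -> fewer coefficients
    deg, _, err = smallest_acceptable_fit(t, signal, tol)
    print(f"tolerance {tol:>2}: degree {deg}, max error {err:.2f}")
```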

Anyway, it is an interesting thought experiment. Looking at a video as a series of vector fields is something I hadn't considered before.

10 October 2014

Picking a client platform for a new system

What follows is my journey to picking a platform for a particular client in a new software system. It may not be the best answer for your case, but the considerations may still be relevant.

When tasked to develop a new business system, one of the (many) choices to be made is the platform internal clients will run on. It is an important decision that will have an impact for years to come.


The Desktop


Our first inclination was to create a Windows desktop app. All workstations are Windows machines, our team skillset is the .NET platform, and there were a number of local resources that needed to be accessed (specialized printing, camera access, signature pad, and accounting software integration). So then I began to look at .NET desktop platforms: WinForms, WPF, and WinRT. WinForms was dismissed right away because of its lack of extensibility and modern feature support (hardware acceleration, data binding, and flexible controls, to name a few).


WPF


The harder decision was between WPF and WinRT. As near as I can tell from my research, WPF and WinRT apps are not all that different to develop (XAML/.NET, although WinRT has other options too). I even started a prototype app using MahApps.Metro and began digging into learning WPF. However, there is a cloud of doubt surrounding the future of WPF. There were some new WPF bits in .NET 4.5, but overall there has been very little activity or advancement in WPF for many years. Things that were tedious generations ago (that is, computer generations) are still tedious, with no official sign that they will improve. Considering its age and stagnation, it's hard to pick this as the platform of the future for the new clients.


WinRT


I also did some research on WinRT. As far as I've found, you can make a .NET desktop app by starting with a Windows Store project template and manually modifying the project to enable desktop usage. You can also manually add references to WinRT libraries from other project types. My main issue with WinRT is that much of its design revolves around "metro" and the Windows Store, which is generally not being embraced by the industry. To me this makes WinRT's future speculative at best. Not to mention that WinRT will only run on Windows 8 right now, which the IT department is skipping. The impending Windows 10 release (probably next year) and a cooperative IT department willing to upgrade would ordinarily make this choice not so bad. But the larger question of whether the Windows Store underpinning will make it as a desktop app platform (desktop doesn't seem to be WinRT's primary consideration) gives me great pause. Microsoft has managed to leave an unstable vacuum in desktop development, which makes me concerned about significantly investing in that space.


Now what?


So where does that leave me? Well, my background is the web, and although I was looking forward to broadening my skill set to a desktop technology, it doesn't appear (from a platform perspective) that there is a good one to pick. I was also not very encouraged by my foray into WPF. It felt like almost a backwards step from HTML5 as a front-end technology. Don't get me wrong, HTML5 has a lot of room for improvement. I always like to say (with some embellishment) that JavaScript is the worst possible tool for the job, but it is the only tool that can do its job. (This is sort of a twist on the Python creator's comments about Perl.) But I will say that XAML markup is quite verbose, especially since a lot of the changes I wanted to make required custom implementations of the entire control (even if mostly pasted from the default template). And the code required for data binding is crazy verbose compared to something like knockoutjs's ko.observable(). A lot of styling that is pretty straightforward in CSS feels weird in XAML... e.g. hover/active color changes. CSS3 animations are also amazingly simple. And considering that HTML5 is actually experiencing a LOT of improvement of late, with more to come, it seems like a good client platform choice for moving data.


Web Issues


However, there are a couple of problems with this choice. Firstly, let's talk about browser compatibility. This is the main thing that holds back HTML5/CSS3 as a platform, but that mainly concerns web pages out in the wild. Consider that with a desktop app, IT would be required to install my app on machines. Now instead of that, IT will be required to install a different app on machines -- a modern/HTML5 browser. And there's more than one to choose from. What about cross-platform compatibility? You can install recent versions of Chrome or Firefox on a broad range of OSs and versions, probably on any workstation your enterprise runs.


Signatures


Then there is local resource access, the primary reason we wanted a desktop app. With some changes to our workflow, we are able to simplify the process for customers and also get rid of the need to interface with a signature pad -- its function will be integrated into a tablet which will be used for other parts of the process. Using a browser that supports getUserMedia, our app can also access the web cam. So the main difficulties left are the specialized printing and integration with accounting, which do require local resource access that browsers cannot provide.


Direct Printing


Normal printing is not really a problem from the browser, but we must print to specialized devices that may use their own printing languages (like Zebra printers). Before the HTML5 discussions, we had already decided to make a Windows service for printing that would talk directly to the printer, so prints can be automatically triggered based on system events. Since the service will already be talking directly to the printer, it can handle manually triggered printing from the web app as well. So that problem was already solved.
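
The service itself will be .NET, but at its core it isn't much more than opening a socket and sending the printer's own language straight to it. A hypothetical sketch in Python, assuming a networked Zebra printer that accepts raw ZPL on the usual port 9100 (the address and label content here are made up):

```python
import socket

PRINTER_ADDRESS = ("192.168.1.50", 9100)   # hypothetical printer IP, raw print port

def print_label(zpl: str) -> None:
    """Send a raw ZPL job straight to the printer."""
    with socket.create_connection(PRINTER_ADDRESS, timeout=5) as conn:
        conn.sendall(zpl.encode("ascii"))

# A trivial label: one line of text at position (50, 50).
print_label("^XA^FO50,50^ADN,36,20^FDHello from the print service^FS^XZ")
```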


Accounting integration


I don't have a solution designed for the accounting software, because we don't even know what we will be using yet. (They are extremely unhappy with their current accounting software and want to change away from it regardless of my effort.) In any case, all the client will be able to do is make the request (e.g. to create invoices), and the server will take the process from there; running it directly if the accounting software supports it, or delegating it to a custom service on the accountant's machine in the worst case.


Chrome app?


I also looked at developing this as a Chrome app, which provides limited access to local resources. However, I really didn't like the idea of making proprietary modifications that make my HTML5 app not run elsewhere. It also seemed like for every local access I got, some normal web access was restricted due to sandboxing. That, and really the only benefit I would get is printing, which I had already resolved. The accounting software would probably still not be accessible from the Chrome app.

So anyway, that was my decision process. Going forward, we are looking at using Angular as our UI technology, and Bootstrap as a front-end (wanted to use Foundation, but Bootstrap was easier to integrate). On the server side, it's .NET Web API.