Sunday 7 February 2010

Advanced Distributed Systems Design with Udi Dahan

I spent last week in Melbourne for a 5 day training course on distributed systems, run by Udi Dahan. I’ve been a fan of Udi’s work for a while now, so when I discovered that he was on his way to Australia I jumped at the chance to be immersed in his approach to designing scalable and reliable distributed systems. Let me tell you – I was not disappointed. I found Udi’s course to be very challenging – not in terms of difficulty, but rather that it was full of ideas that challenged my preconceptions. I was also very impressed with Udi’s presentation and communication skills. He really knows the material and has an amazing ability to give helpful, informative answers to really dumb questions :)

The course was focused on a few key ideas:

  • Service oriented architecture
  • Message based systems
  • Command / query responsibility segregation
  • Domain models

If you’d like a more detailed breakdown of the course content, Udi has provided an overview on his website.

I’ve tried to understand SOA before and struggled with the abstractness of the concept. I guess part of my difficulty has always come from the fact that I still approach the world with a “developer” mindset – I’ve never really bought into the concept of a non-coding architect and sometimes I struggle to understand the meaning and implications of an idea if I can’t see the actual effect on the code we write.

Thanks to Udi, I now have a basic understanding of what SOA means. I am comfortable with the idea of identifying business capabilities and describing them as services that consist of business components, which in turn consist of autonomous components that are individually deployable. I’m also comfortable with the idea of these autonomous components communicating with each other via a message bus using a combination of asynchronous request-reply and publish-subscribe. I am not yet sure how well my understanding will map to the existing literature on SOA, as I got the impression from some of the other students that Udi’s version of SOA was somewhat different to the SOA they had read about. My impression of SOA is that it is a slippery subject, where only the most abstract and vague definitions are widely accepted. That said, I can see a great deal of value in the analytical process and technical approach that Udi taught us.

Having skimmed Enterprise Integration Patterns, I was already familiar with the idea of message based systems and the advantages of temporal decoupling before I began the course. But Udi’s coverage of the topic helped fill in large gaps and also gave me a great deal of appreciation for the power and simplicity of his open source .NET based service bus, NServiceBus. My co-worker Malcolm nearly did a backflip when he saw how easy it is to begin using NServiceBus alongside WCF.

Command / query responsibility segregation (CQRS) is about recognising the difference between queries and commands as part of system architecture, and designing completely different paths for handling the two types of operations. Going into the course, I thought I was comfortable with the idea as I’d watched a number of videos on the topic from Greg Young, and had read a number of blog posts. It turned out that Udi had his own take on CQRS that I did not pick up on when reading his posts on the topic. Speaking of Udi’s posts on the topic of CQRS, I highly recommend this one if you’d like to learn more.

So how was Udi’s version of CQRS different to what I expected? One example immediately comes to mind. A necessary part of CQRS is some sort of persistent view model to facilitate the queries, and a transactional data model to facilitate the commands. So far, so good. Udi then introduced the idea that there may be some relationships that are only recorded in the persistent view model, and not in the transactional model – if later on you need that data in your transactional store, then backfill it with an ETL from your view model! Boy did that raise some incredulous looks, especially from yours truly! “What happened to my single source of truth?” I cried.

Udi also had some interesting ideas about domain models. In particular, we went into quite a bit of detail on the idea of aggregates, and how to ensure consistency. Has it ever occurred to you that perhaps there is something wrong with loading a Customer, and then lazily loading their Orders at some later point? What happens if one of the Orders are modified by another process, after the Customer is loaded but before the Orders are lazily loaded? Now the model loaded into memory is inconsistent! Even if your optimistic concurrency will protect you in most of these scenarios, its entirely possible that you’re updating some other object based on logic that has operated on your inconsistent snapshot. This problem seemed obvious once it was pointed out, but it had never occurred to me before.

There were so many interesting discussions and points of contention over the 5 days, this post has barely begun to skim the surface. It really was an eye opening experience for me, and some serious food for thought. If you ever get the opportunity to attend Udi’s course, don’t hesitate. Its worth it!

Now that I’ve finally managed to get something written down on the course, I’ll let myself read some of the summaries written by other attendees.