Monday, January 11, 2010

SCM Trade-Offs

Source Control Management (SCM) is, on the surface, a very simple topic.  You put all your source code in one place, then you check in and check out from there.  Advanced systems support merging changes if two people edit the same file.  And there you go, SCM in a nut shell.

And if you are just one person, or a small team, that's probably where the story ends for you.  But with larger teams, or more complicated application environments SCM inevitably morphs into Application Lifecycle Management (ALM).  ALM covers topics from requirements, to architecture, to testing, and release management.

I have found that typically it is ALM requirements that end up introducing branching to your SCM.  Branching is another simple concept.  You make a copy of the source code which can later be "easily" merged back with the original.  Unfortunately branching is another area that quickly gets much more complicated than all that (which I've written about before, in a surprisingly humorous post if I may say so myself...).  As the number of branches increase the amount of work required to track them and merge between them increases.

So you will always find that there is a delicate trade-off that must be made when you introduce a branch.  Branches give you isolation, letting people work independently from each other.  There are tons of reasons why this can be helpful, some of which include:
  1. You can work without fear of breaking other people's work
  2. You can try "experimental" stuff which may never actually be finished
  3. You can work on parallel features and release one without releasing the others
  4. You can do new work while still enhancing an old release (ex: service pack releases)
But branches also introduce the need to integrate.  If everyone works on the same branch, every check in is an integration.  And these happen so frequently it's hard to even think of them as "integration."  But if you introduce long lived branches, integration can now be delayed indefinitely.  There are plenty of horror stories about companies which tried to delay integration to the very end of application development and then spent almost as long "integrating" as they spent "developing."

This is where Continuous Integration comes from.  The idea is that you want to integrate very often (at least once a day according to Fowler), and that you want integration to not just mean merging code, but also running tests to verify that the integration is successful.  The point is that you want to catch integration issues early.  The reasoning is the earlier you catch them the easier they will be to fix.  The longer you wait to fix them, the more things will have diverged, or be built on code that needs to change.  So integrate often.

And this is where we arrive at a problem I've been struggling with for quite some time.  On the one hand, I want isolation for all the reasons I've already mentioned.  And on the other hand I want to integrate continuously to avoid all the pitfalls mentioned.  But you can't have both.

Naturally I've tried to look at how other people tend to deal with this issue.  There seem to be two big picture approaches:
  1. "Centralized" in which everything starts integrated, and isolation must be purposefully added
  2. "Distributed" in which everything starts isolated, and integration is tightly controlled
All open source projects I've looked at are run very "distributed."  Anyone can download and update the code in their own sandbox.  But to actually get that code into the project you must submit patches, which the project leaders review and apply if they pass their standards.  This works well for Open Source projects because you can't let anybody waltz in and start changing your code.  You need control over what code is being added and modified, and you want it to all be reviewed and tested.

In contrast most company projects seem to be run "centralized."  Mostly I've found accounts of a main branch with release branches for past releases and maybe a few feature branches.  You can work code reviews into a process like this.  Some systems support preventing check ins without code reviews.  Other people just make it part of the "process."  But in general, people are trusted to be making the right changes in the right way, so the control required in OSS projects isn't as strictly needed here.  You trust your own employees.

I suppose I should mention quickly that my words "centralized" and "distributed" DO line up with the differences between Distributed and Centralized Source Control Systems (like Mercurial vs. Subversion), but you can still use a Centralized Source Control System and work in a "distributed" fashion by using patches.

There are lots of reasons why you might work one way or the other.  But I still have a hard time figuring out what's right for me.  This could be because my requirements are unusual.  I'm not sure why they would be, but I haven't found other people worrying about them.  This makes me think either my requirements are very special, or they're crazy and I'm missing something.

Briefly, here's what I'm dealing with.  I have a number of different people working on the same application, but they are all working on un-related or only loosely related features.  We have frequent and regular releases of the application being done.  We try to target features for the appropriate release, but its incredibly hard to know if a feature will really be "done" on time.  Features get pushed back when:
  1. They under go dramatic changes when it reaches completion and users try to put it through its paces
  2. The development takes longer than expected
  3. The developer gets pulled off and placed on another feature
  4. The feature gets put on hold for "business reasons"
If everyone is working on all this stuff in the "main" branch and integrating continuously, what do we do when one feature is not ready to release but the rest are?  The changes are all tangled up in the same branch now and there is no easy way to un-tangle them.

The only answer seems to be doing any "non-trivial" work in Feature Branches.

But feature branches pose whole challenges of their own!  If you're writing Mercurial, then feature branches are no problem.  If you're developing a website with a database then feature branches are a bit harder, because you have to create a new database and fill it with test data.  But if you're developing an enterprise application with a complicated database (which can't just be loaded with generated test data) that talks to other databases (with linked servers and service broker) and has a large number of supporting services many of which depend on OTHER 3rd party licensed services; then what?  In that case, creating the full application environment for each feature branch is actually impossible.  And creating a partial environment is certainly not easy.

But it is precisely because of all of this complexity that we want the isolation!  But the complexity makes the isolation difficult (if it's even possible, which I'm not sure of yet) and time consuming.  I love paradox!

Perhaps you're saying to yourself, "Maybe you wont be able to actually run the app and do everything it can do in your feature branch, but surely the code is written so all these things that depend on other things are nicely abstracted so that you WILL be able to run the unit tests!"  Um, no.  Sorry.  It pains me to admit it, but I'm afraid that isn't the case just yet.

So where does that leave me?  Well, I want to implement continuous integration to reduce the integration issues and make releasing easier, but I also want the flexibility to develop features independently so they don't prevent other features from being released, but my environment is so ridiculous that I can't figure out how to set it up so feature branches are fast and easy to create (and more importantly useful).

So, can you help me?  Do you have any of these problems?  Have you managed to mitigate any of these problems?  Am I missing something?