This is a question I’ve been asking myself for a while. It’s not a fully thought-out argument (that’s why it’s still a question), but it’s a train of thought that I think warrants some investigation. I’d love to get some opinions from people with good or bad experiences of using DVCS with Agile as to how this plays out practically.
So, here’s my train of thought…
Easy branching and merging is the killer feature of Git and Mercurial.
They improve on centralised systems (Subversion, CVS) in many other ways, but branching and merging is the reason that’s always used to sell the switch. The question I want to raise is whether branching and merging are good tools for an agile development team, or a nuisance.
Branching is the Git Killer Feature Because Its Creators Needed It
Git was created by Linus Torvalds and the Linux kernel team. As a globally-distributed group of hundreds of volunteers working on any number of different enhancements at varying paces, the kernel team has a lot of work in progress, many teams working on orthogonal sub-projects within shared codebases, and a hierarchy of trust through which pull requests must be carefully ushered in order to be integrated. This is not the reality of most agile development teams.
Branches are inherently about creating isolation.
Why do you create a branch? Only for one of two reasons: to isolate those using existing branches from changes on the new branch (e.g. a feature branch), or to isolate the new branch from changes in its parent branch (e.g. a stable branch).
Feature Branches, in particular, avoid integration.
Integration is the process of taking your changes to the code, mixing it with other people’s changes to the code, and checking everything compiles, runs and passes tests so that you have a piece of potentially-releasable software. If you have a feature branch then you are intentionally not integrating; you are doing the opposite: isolating.
Side note: you can’t integrate by pulling without pushing.
I once raised with a DVCS evangelist the fact that having feature branches isolates you from other people’s changes. He told me that this wasn’t the case because his team (which was part of a larger project) was pulling from the main branch every day and hence integrating everyone else’s changes. He didn’t seem to realise that, seeing as every team on the project was using feature branches, there was little being merged into the main branch for them to pull, at least on a daily basis.
“Working software is the primary measure of progress.”
This principle is attached to the Agile Manifesto for a reason. In the good ol’ days, teams used to build separate components of their software in isolation, having agreed on how they would interface, then try to integrate them as a last step. Components would be “completed” according to the project plan, but there was no “working software”. You couldn’t say the component “worked” because it couldn’t actually do anything without its dependencies. Of course, once all this code developed in isolation was slapped together and tested, the usual outcome was, “Hey, look at that, it all worked seamlessly.” … said no one, ever.
Continuous Integration prevents the pain of irregular integration.
Let’s be clear about what Continuous Integration is. It doesn’t mean running a CI server. It means checking valuable changes into the VCS mainline as often as possible. When Martin Fowler discusses continuous integration, he posits a general rule of developers checking into the mainline at least once a day, with a preference for more often.
It’s not unusual to see people who are still getting a handle on continuous integration fall into the trap of not checking in daily. I’ve noticed two common causes: either the task wasn’t broken down enough to be committable, or some change led down a rabbit hole and the pair didn’t realise until they were too deep. It can take as little as three or four days before developers will find that integrating their growing change set with the constant flow of small changes from the rest of the team turns into a moving-target nightmare. Such episodes will often be resolved through a request for the rest of the team to “please not commit anything for just a little while”. Developers who’ve been through this will usually realise straight away where they went wrong, share what they learnt with the team, and often become some of the biggest advocates of checking in frequently.
Continuous Integration is also a major form of communication.
In fact, Martin makes the claim that CI is primarily about communication. As developers, we are usually changing a small subset of a codebase at a time, but in order to make that change we’ll typically browse a much larger subset in order to orientate ourselves, check the details of things we depend on and ensure we’re following established patterns. While there’s often a relatively low chance that someone else on the team is changing the same files I’m changing, there’s a much higher chance (cf. the birthday paradox) that someone on the team is changing something that someone else is relying on, and Git will never pick that up. This is one of the really ugly and hidden dangers of delaying integration.
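To put rough numbers on the birthday-paradox comparison, here is a minimal sketch in Python. It assumes a toy model that is not from any real project: each developer independently touches or relies on exactly one module, chosen uniformly at random. The function name and the figure of 200 modules are made up for illustration. Real codebases are far less uniform, so treat the output as a feel for the shape of the curve, not a measurement.

```python
def collision_probability(developers: int, modules: int) -> float:
    """Chance that at least two developers land on the same module.

    Toy assumption: every developer independently picks exactly one
    module, uniformly at random, out of `modules` candidates.
    """
    if developers > modules:
        return 1.0  # pigeonhole: an overlap is guaranteed
    no_overlap = 1.0
    for k in range(developers):
        # The k-th developer must avoid the k modules already taken.
        no_overlap *= (modules - k) / modules
    return 1.0 - no_overlap


if __name__ == "__main__":
    # Hypothetical codebase of 200 modules and a few team sizes.
    for team_size in (2, 5, 10, 20):
        chance = collision_probability(team_size, 200)
        print(f"{team_size:>2} developers: {chance:.1%} chance of an overlap")
```

The point of the sketch is only that the chance of some pair overlapping grows much faster than the chance of one particular developer clashing with you, which is why overlaps feel rarer than they are.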
No matter how good Git is at merging branches and conflicts in text files, it can never merge conflicts of understanding.
Continuous Integration mitigates the risk of undetected conflicts of understanding because it reduces the window in which such race conditions can occur to mere hours. Feature branches, on the other hand (along with traditional long-lived, uncommitted change sets), extend this window to last for as long as the branch is not integrated into the mainline. Remember, branches exist so we can have isolation, and that isolation shuts down the communication of changes in the source code.
Slicing by feature rather than component probably reduces integration risks.
The old-school method of slicing work by component meant that, for a piece of functionality to come together, the work of many different people had to be integrated. A core feature of the way agile teams work is that we develop stories – complete slices of valuable user function. When slicing this way, it’s not unusual to have the same people work across layers, sticking with the story rather than with a component. This often means the same people construct both sides of the integration, so they have the same knowledge, which reduces (though certainly doesn’t eliminate) the integration risks. Add in pair programming and frequent pair swapping and the risks may be reduced even further due to people actively sharing their knowledge with others from the team – others who may have related or even conflicting knowledge. (Reducing conflicts of understanding!)
So, you probably think I hate Git, but that’s not true.
I’m using Git. I think it’s a great tool with a lot more functionality than any of the various VCSs I’ve used before.
My concern is the effect the Git Koolaid has on agile teams.
Lots of people are migrating to DVCS because that’s where the action is. With a switch to Git or Mercurial comes a natural inquisitiveness about the best way to use branches – because that’s what these tools are all about – and everyone on the web is talking about their superior Git workflow. Agile teams that are serious about practising Continuous Integration of value need to seriously consider these questions:
- Should we use branches for day-to-day development at all?
- Do the benefits of practices like Feature Branching outweigh the costs of delayed integration and limiting communication?
Dave Farley (co-author of ‘Continuous Delivery’) describes Continuous Integration as the process of automatically creating a potential release candidate after every commit. Are feature branches going to help you do that?
Can DVCS, agile development and continuous integration fit together?
As I wrote at the start, I’ve been thinking about this for a while but it’s still just a train of thought, not some proof that it can’t work. Many others have written about this very same thing. Martin Fowler has written a superb, detailed description of how feature branches cause big conflicts and dealt with the proposed alternative of ‘Promiscuous Integration’. Derek Hammer, also at ThoughtWorks, has written about his team’s (failed) attempt to merge GitFlow and Continuous Delivery (pun intended!). Jade Rubick at New Relic has written about many disadvantages of long-running branches, Paul Gross at Braintree Payments has written a similar list of disadvantages with feature branching, and Jez Humble just says Feature Branches are evil.
I’m yet to come across a blog that says “Feature Branching and Continuous Integration work really well together, and here’s how you do it…” If you’ve solved this conflict, I’d love to know how you’ve done it, and why. Please share in the comments.
Image credits:
‘Complicated‘ by Rohit Mattoo
‘civilwarstrategy1‘ by Avinash Kunnath
‘Green Light‘ by Stephen Geyer
‘Broken Crescendo‘ by Francesco de Francesco
Gitg screenshot contributed by Mechanical snail on StackOverflow
I agree with everything you have said. Because branching has become so easy, and with pull requests thrown in for code review purposes, it seems like a good idea to use branches for all feature requests and defects that your team works on.
As you said, branching slows down the integration and feedback loops, and even if you are constantly rebasing your branches, your own set of changes is not being propagated back to the rest of the team.
A major part of any good developer’s day-to-day activities should be refactoring. Sure, you still need to deliver the feature, but as you make your way through the code base, you should be cleaning up the technical debt you come across (at least in the area of the code base you’re dealing with for your area of work). I find that I’m more reluctant to do my usual day-to-day refactorings when working on a branch, as I worry that merging back down could be difficult after others have had their hands in the same area. Also, a pull request for a feature that contains mountains of code, even code that cleans up the code base no end, will be more difficult for others to review. Of course I could move to a different branch and make those refactoring changes, but I’m constantly making refactoring changes as I trawl through the code base and I am definitely not going to switch branches every time that happens.
I believe that having a clean code base where all the developers are on the same page in regards to constantly refactoring and writing tests to the point of 100% (or near to) code coverage makes for a master branch that is in a constant state of readiness for release, which in turn enables developers to refactor without fear.
I watched a video recently about LinkedIn (I’m pretty sure it was this company) where the presenter was discussing moving from a branched model to working purely on master to increase their ability to release. If you have a code base and a team that revolve around clean, fully tested code, I don’t see much need for branching except for maintenance branches. The toughest part of this approach is education. Education about (1) refactoring: teaching your team how best to refactor and getting them into the habit of doing it ALL the time. Renaming a variable to be more meaningful has to be done right away and pushed right away; and (2) testing: how best to write tests to get the most effectiveness and the widest range out of as few tests as possible. Sometimes writing too many micro unit tests can make refactoring more difficult, as each small test is coupled to an individual class. Whereas more ‘integration’ tests that exercise a higher-level entry point can still test those micro bits of logic, allowing you to retain those tests while refactoring and keeping the code base nice and clean.
Just my thoughts…
I think you have it wrong here, and it’s for a very simple reason: Git’s not particularly good at branching. SVN is equally good at branching. Git’s really good at *merging*. That’s actually why Linus built it — because he wants merges to be really easy. Git also doesn’t preclude quick-and-often merges to master, so what you’re really talking about is feature branches, and I actually kind of agree with you on this point, but let me explain.
Software developers use version control as software change management systems, not as versioning systems. By that I mean, with tools like SVN, you can’t do a “work in progress” checkin. This is a checkin that doesn’t even compile. Agile workflow aside, the “team” seriously shouldn’t accept something that’s a work in progress. Luckily, with a DVCS, you can checkin locally to your heart’s content, even on master! If you feel like you need a bit of a backup, you can have developer branches, so a developer / pair can work on this branch and checkin as they wish without affecting the mainline. However, by the end of the day, you can freely merge to master. The point isn’t that you branch more often to isolate yourself from code changes, it’s that you branch more often so you can checkin more often. You should be merging constantly!
Thanks for the comment, Sunny. It appears we both agree that the aim is to merge into origin/master as often as possible. FWIW, I would avoid the phrase “by the end of the day”; the aim is to integrate *as often as possible* with the typical constraint being /at least/ daily.
I don’t have any issue with the use of frequent local checkins. I don’t actually think of that as “branching” but as using git as a local VCS to maintain savepoints as work progresses. I do think, though, that the ideal is to work at breaking down your work in such a way that every useful bit of work can be pushed immediately. Richard Lawrence, below, claims to checkin about once every 20 minutes, so I imagine he is someone who’s become quite good at this.
If a development team are so nervous about hardware failures that they feel the need to push development branches to the central origin as a backup, I would suggest looking to upgrade the hardware rather than the branching model. 🙂
On the topic of hyper-frequent local check-ins, there is the issue of whether or not to squash when pushing. I haven’t formed an opinion on this either. I feel there’s a trade-off: you get really detailed comments in master about why each change was made, but also a huge number of commits which don’t each represent an “expected to build” version of the evolution. Ideologically, I favour Dave Farley’s view that every commit should create a potential release candidate, but then you could argue that “commit” should translate to “push” in the DVCS case. Have you had any thoughts on that?
I need to chew on your comment for a bit, there’s a lot of food for thought there. Firstly, just to clarify a few things, I did mean merging at least once per day, but phrased it looking at the worst case scenario.
Your comment about developer branches is startling and correct. In the end if your machine dies, you’ve only lost a couple of hours of work. I’ve always found the idea of developer branches quite romantic, but now I’m struggling to think of a good use case.
But ho! You talk about squashing commits, and I’ve caught myself thinking in Git! See, in Git, branches aren’t really branches. They’re just tags that get moved along. Once you merge a branch, all the information about that branch is now lost. In order to “regain” that information, you take all of those commits and squash them together to make one pristine commit that other people can look at.
Mercurial, on the other hand, will forever associate a commit with the branch it’s on. I’ve been sort-of-translating to “Git-speak” without thinking about it, but it’s crucial. What SVN and Git people think of as a “commit” a Mercurial person thinks of as a whole branch! That’s why the “Mercurial mindset” is so different! You can’t “craft” your commits or otherwise mess with the history. You just checkin constantly at random intervals, and maybe even have the system checkin on your behalf. Then, when you’re ready, you merge your branch! The “master” branch only sees a single “commit” with all of the changes you made on the “feature branch” (or rather, a commit whenever you do a merge).
Feature branches make sense here, because what people think of as a “diff unit” is actually a branch with the “feature” on it. A developer branch also makes sense in this world, because a developer really ought to be working on only one thing.
But you mention something else interesting here: “every commit should be creating a potential release candidate”. I’ve heard a similar but different line of thinking: “Every commit needs to be individually reviewable.” It’s as if you were dealing with a pristine patch. I’ve had a hard time trying to reconcile this with my thinking that your IDE could potentially be checking in every time you write a line of code. I’ve really recoiled at Git’s branching model but never quite put my finger on what it does wrong. The reason is that Git’s branching model is bullshit! It doesn’t do real branches! And that’s why my explanation doesn’t make sense here, because I’m *thinking* “Mercurial” and *saying* Git as if they had feature parity…
I’m having a serious catharsis!
@Sunny: thanks to Mercurial’s bookmarks and rebase extension, you can’t really say that Mercurial does branching differently from git. You can, if you wish, do exactly git-style branching with Mercurial bookmarks.
I find that in many ways a git branch takes the role of an svn commit. That is, I merge to master just as often as I would have committed to svn when using svn. The difference is I break down that change into a number of small commits. This makes it easier for me to be agile (because I can do things that might work and then roll back if they don’t), and improves visibility for the rest of my team.
A key part of agile is limiting work in progress. Having the in-progress work on branches in the master repository rather than on individual developer machines makes this easier.
I can see how sharing “in-progress work” with your colleagues might seem like you’re enhancing the communication, but I would ask this: If the work is of sufficient quality to share with your colleagues, why would you not push it to master straight away? And on the flip-side, if it’s not of sufficient quality to push to master, why would they be interested in seeing it?
Much as in Agile planning, the key ingredient necessary to make feature branches work in an Agile workflow is to shrink the chunk of work. If you’re isolating code for most of a sprint, or for multiple sprints, then what you’re doing isn’t really Agile. You’re still doing a ‘big bang’ merge at the end of it. If however your feature branch is isolated for a small fraction of a sprint (a few hours or a day), then unless your team is very, very large, the reference branch isn’t likely to have drifted much. Similarly, your code will be exposed to the rest of the team quickly, allowing them to adjust as needed.
Back in the late 90s, when I was using Perforce and working in waterfall environments, my feature branches would be isolated for days to weeks, and would generally contain a large, top to bottom feature with many commits. I spent many hours doing large scale merges that touched tens or even occasionally hundreds of files.
Nowadays my feature branches are typically isolated for a few hours to a few days. The feature branch consists of one to a few commits, almost never more than 10. And the functionality chunks are smaller – the feature is logically complete, but is often best viewed as a piece of larger functionality that will be completed in a few separate PRs.
I find this approach works very well with CI. Structuring these features as PRs allows me to push to Github, run CI, and get CodeClimate feedback on these small chunks of functionality. It also supports review and feedback by the larger team. PRs are generally evaluated quickly, within an hour or two of being pushed up to Github. And if I have specific concerns, I can @ mention an individual with specific expertise. Works well for me.
I like @Peter Goldstein’s mixed style, but to give you a rebuttal from the branches-are-a-necessary-evil end of things I’ll point you to this comment: http://developers.slashdot.org/comments.pl?sid=1003121&cid=25461825
Thanks for the link, Yawar. It’s a great list of places where branches are super-useful, and I’m certainly not going to argue against branches being useful. We of course tag every production release and won’t hesitate to branch and patch when the need arises (quite infrequently, thankfully).
There is a very interesting item in that comment: “Showing [the] product to the client … every few hours is just plain impractical.” Sounds like a situation where the team has tried to introduce agile without actually explaining to the client what it means and what will be required of them, i.e. “sit together”.
Thanks for sharing your experience, Peter. When people like yourself say they can successfully use feature branches and be agile by merging into master after “a few hours or a day”, I always wonder: what is the point of the branch?
In your case, it appears that there is some amount of non-trivial lag (due to branch-based CI, CodeClimate and the use of PRs) between when you’ve finished working on a feature and when it can be merged. Obviously, you want to move onto other work, and you can’t do that on the same branch, so this looks like a useful application of feature branches to me, albeit due to your scenario. Have I understood the advantage you’re getting in that environment?
In our process, we don’t have any delay between when we finish work on a feature (including running the build locally) and when we merge/push it, so there’s no need for us to have a branch that we can park while we start work on something else.
Hi, I think “no need to park while we start work on something else”, as well as “because a developer really ought to be working on only one thing”, holds only for a perfect team.
Many things, such as automated tests that don’t cover all the code, team members who are not at the same level, and surprises like wrong text from the marketing team, mean you need to switch away from unfinished work.
In those situations, a branch is a good tool.
I’m not a native English speaker, so sorry for any misunderstanding.
I’d say that using feature branches actually encourages agile practices. It forces you to keep your stories small so that they can be integrated on a regular basis. If the stories get too big then you aren’t really doing agile properly because you can’t integrate in-progress work with other teams without slowing down the release process.
That’s an interesting thought. I hadn’t considered that a team that is hell-bent on being agile might break their features down to eliminate this problem. However, if you’ve broken the features down to be that small (pushing at least once a day?), then what is the advantage that the feature branches provide over just pushing to master?
Integrating components written in parallel is not a new problem. Branches are not a new concept. It takes a lot of reaching to frame this as an issue with git.
You’re right: integrating components written in parallel is not a new problem. It’s an old problem, but one which continuous integration has rendered far less painful for many teams over the last decade.
And you’re also right that Git is not the problem. (I’ll admit here to my title being hyperbolic.) The tool itself is very rarely the problem. My point is simply that the rise of DVCS has created a renewed interest in using branches, and particularly feature branches, as central features of development processes, and I worry that agile teams may take on these popularised practices without due consideration of the damage they can do to continuous integration of value. It is really the hype that is the problem, and my aim here was to try to raise awareness among agile teams of the need to consider these things thoughtfully.
I don’t think easy branching and merging is the killer feature of DVCSs for agile teams. (Though I agree that’s often the leading feature when pitching these systems.)
To me the killer feature is being able to commit and integrate on a different cadence. Not that I want to integrate less often—I know that’s going to hurt later. Rather, I want to be able to commit locally more often so I can take very small, safe steps that are meaningful to me (or me and my pair). Then, I want to be able to integrate with my team when I’ve accumulated enough small steps to be meaningful from the perspective of the team and the larger system.
With Subversion, I might commit every 20 minutes or so. With Git, I’ll often commit every 2-5 minutes and push every 20. This gives me a better local safety net without adding noise for my team.
Thanks for your thoughts there, Richard. I know a few people who enjoy being able to “commit” to keep track of savepoints without sharing the WIP with others. I wrote a little bit about this above in reply to Sunny and explained how I don’t actually think of that as “branching” but as using git as a local VCS. In fact, there’s no requirement for branches in order to work that way. Your local clone of master is conceptually a branch of history until you either pull or push.
I asked Sunny about whether or not he likes to squash these “in progress” commits when merging and what he sees as the advantages and disadvantages. I’d love to hear your thoughts, too.
Cheers,
Graham.
Your article reads like you have an issue to solve. We use feature branches – and yes, if they live too long, or if large refactorings are merged, they can become a p.i.t.a., but that happens only so often – our workflow feels smooth and efficient.
That said: we have a thorough QA process that calls for feature branches. But at the same time we try hard to make them short lived.
We also merge bug fixes, refactorings and tiny changes directly to master – and merge master on a daily basis into our feature branches, if they live that long.
I imagine, that merging always directly into master would:
1. lead to larger commits
2. lead to other issues
So, while solving one issue (which I do not see as significant in our team), I am sure other issues would be created. In our case I would expect bugs on live (as right now, we can deploy master to production whenever we want, as we know that it is thoroughly tested by QA).
Just make sure that your feature branches are short-lived and you should be fine.
Thanks for sharing what your team is doing Andreas.
I believe every agile team should work hard to make every engineering task both short-lived (as you are obviously doing) and ready to go to production. A “commit” that is not ready to go to production is not a commitment at all, but just a savepoint, and rarely one worth sharing with other people. From that perspective, a local commit to master and a local commit to a feature branch make no difference.
I’ll take at face value that feature branches support your arduous QA process and that seems like a valid reason to have a non-trivial branching strategy. I’m confused about exactly what your QA team are testing, though. Are they testing the code on the feature branches, before it has been integrated with master and other feature branches? (This would suggest the possibility of the merged code on master not getting tested post-integration, which would be a bit of a worry.) Or is the QA team responsible for integration, such that they only test one feature branch at a time, with that branch having pulled from master immediately before testing and pushed to master immediately after? Or is it something else? I’m curious what you’re doing and would appreciate it if you could share with us how it works.
Cheers,
Graham.
I think your argument stands up better for web apps than for other types of applications (shrink-wrapped software, mobile apps pushed to app stores, etc.).
There is a great presentation by Paul Hammond where he talks about how web applications are a different beast, and no current version control system really handles them correctly. Here is the slide deck, I highly recommend it, even if you don’t agree with it all: http://www.paulhammond.org/2010/06/trunk/alwaysshiptrunk.pdf
I think feature toggles and branch by abstraction are better patterns for web applications.
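For anyone who hasn’t met the feature-toggle pattern, here is a minimal sketch of the idea in Python. Everything in it is hypothetical and only for illustration: the flag name, the in-memory flag store and the made-up checkout calculation. A real team would more likely read flags from configuration or a flag service.

```python
from typing import Sequence

# Hypothetical in-memory flag store; a real system would likely load
# this from configuration or a dedicated flag service.
FEATURE_FLAGS = {"new_checkout": False}


def is_enabled(flag: str) -> bool:
    """Unknown flags default to off, so unfinished code stays dark."""
    return FEATURE_FLAGS.get(flag, False)


def checkout_total(prices: Sequence[float]) -> float:
    """Return the order total, choosing the code path by toggle."""
    if is_enabled("new_checkout"):
        # In-progress behaviour (an invented 10% discount): already on
        # the mainline, but not switched on for users yet.
        return round(sum(prices) * 0.9, 2)
    # Existing behaviour that users still see.
    return round(sum(prices), 2)
```

The half-built path lives on trunk and is integrated with everyone else’s changes on every push, and flipping `FEATURE_FLAGS["new_checkout"]` to `True` is what releases it, rather than merging a branch.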
Hi Jeff. Thanks heaps for that link. It was awesome. At Tyro, we are definitely in the category of pushing services to production (though not so much focus on webapps), and Paul has a lot of great suggestions for such an environment.
However, I don’t think the discriminator here is how and where you deploy your software. I am, after all, talking mostly about /development/, not deployment. I think the measure is: “How agile do you want your team to be?”
Agile is all about embracing change and making constant, small improvements to your software. The pace of release is usually an “iteration” or a “sprint”, and the pace of deployment may or may not be related to releases, but the pace of /change/ is measured by the integration of valuable changes into the mainline, which should be happening many times a day (assuming >1 developer).
There’s nothing stopping the development of shrink-wrapped software or mobile apps from happening this way. Those developing such software may shy away from this method of doing things; perhaps they have a deadline they can’t miss and want to be able to choose which features are in or out at the last moment, so every feature gets developed on a branch. At this point, whoever’s making that decision is choosing (consciously or otherwise) to not operate in an agile way, and that’s fine – agile is not a fit for every team or every context. I have no opinion on the value of feature branching in those environments.
For those who are trying to be as agile as possible, though – driving their software forward with constant valuable change to the integrated, ready-to-ship deliverable (a.k.a. ‘trunk’) – I believe feature branches are a siren luring them away from the value of sharing valuable code changes ASAP and towards the painful practice of delayed integration.