Agile – Speech Wrecko

“Sorry we don’t have enough resources, we only have four pairs.” As an engineering leader, no other statement has made me cringe more. After all, four pairs is a healthy-sized team of eight developers.

Throughout my career I have run across CTOs, VPs, directors, development managers, teams, and individual developers who swear by pair programming with near-religious devotion. Personally, I’ve maintained a healthy dose of skepticism when it comes to pairing as an overarching development philosophy.

As an engineering leader, my job is to build products that delight customers in the most efficient way possible. In my experience, pairing consistently costs more, so using it as the exclusive development technique seems irresponsible. Admittedly, anecdotal evidence is insufficient, so I decided to dig through the research and see if I could find more empirical evidence to support my claim.

Background

Pair programming is an agile software development methodology where two programmers work on the same task using one computer and keyboard. One programmer is called the driver and operates the keyboard and does the primary coding work. The other developer, often called the navigator, is responsible for observing the driver and providing guidance in order to speed up problem solving, improve design, and minimize defects.

The potential negative impact of pair programming is immediately clear to most people. By applying two resources to a task you are effectively doubling the cost. So unless there’s an equal or greater improvement in other project variables, pair programming would be nearly impossible to justify. Looking at the problem through a project management lens, we have three variables: cost (including resources), time, and quality/scope. If we double our cost, we’d expect to see an equivalent decrease in time to deliver or an increase in quality or scope (or some combination of each).

In mathematical terms let’s assume the value of any given project X is equal to a weighted linear combination of cost, time and quality/scope.

When pairing, our cost automatically doubles, since we’ve applied two resources to a task that in theory can be completed by one.

In order for our project value to remain equal or be better we need our other variables to proportionally change in the right direction. For example if our project now takes 50% less time we could argue we net out even. Or if our scope or quality double, we would similarly be in a good position.
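One way to make this break-even argument concrete is sketched below. The symbols (V for project value, C, T, and Q for cost, time, and quality/scope, and the non-negative weights w_c, w_t, w_q) and the sign convention are assumptions chosen for illustration, not notation taken from the studies cited later.

```latex
% Assumed value model: quality/scope adds value, cost and time subtract from it.
V_{\text{solo}} = w_q Q - w_c C - w_t T

% Pairing doubles cost and changes time and quality/scope to T' and Q'.
V_{\text{pair}} = w_q Q' - 2 w_c C - w_t T'

% Pairing breaks even or better when V_pair >= V_solo, which rearranges to:
w_t (T - T') + w_q (Q' - Q) \ge w_c C
```

Read this way, the extra cost of the second developer has to be paid for by time saved, by quality or scope gained, or by some mix of the two.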

However, in my experience pair programming has not lived up to these expectations. Instead, I’ve seen tasks or user stories take the same amount of time and produce similar results at nearly double the cost. But you shouldn’t take my word for it. Let’s review the literature and see what the experts have to say.

Research

There are a fair number of research papers that attempt to prove or disprove the efficacy of pair programming. That said, in my survey of the literature I found most of the research to be poorly designed for comparison with real-world corporate product development organizations. Specific issues include:

Developer Skills: Most of the studies rely on university students that shouldn’t be compared to seasoned professional developers.

Non Production Environments: The majority of the software used for evaluation is very far removed from real product development environments.

Organization Realities: Finally, there is little or no accounting for the organizational churn that happens in a real for-profit company.

In spite of these issues it’s worth exploring these various research studies and the insights they provide on the impacts of pair programming.

Many of the research papers evaluate the impact of pair programming on effort, which at least one paper defines as twice the duration a pair needs to complete a given task [1]. Effort increases ranging from 15% all the way to 100% have been observed [2]. One of the better-conducted studies saw an effort increase of 84% [1]. Since a pair’s effort is just twice its duration, while a single developer’s effort equals their duration, we can do some math to figure out how much faster pairs complete a task than a single developer does.

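An 84% increase in effort means a pair spends 1.84 times the effort of a single developer. Because pair effort is twice pair duration, the pair’s duration is 1.84 / 2 = 0.92 times the solo duration. Plugging that into our earlier project management equation, with a little rounding, the time required when pairing is roughly 9/10 of the time required for a single developer.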

This is nowhere near the factor of 1/2 or less we said we needed to make pair programming cost efficient. If the research doesn’t support a sufficient decrease in time to completion, perhaps there’s research indicating that a given project’s scope or quality will increase enough to offset the difference.
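The conversion from effort increase to relative duration can be sketched for the whole range of effort increases quoted above. This is a minimal illustration that assumes pair effort is twice pair duration and solo effort equals solo duration; the function name `pair_duration_ratio` is made up for this example.

```python
def pair_duration_ratio(effort_increase: float) -> float:
    """Return pair duration as a fraction of solo duration.

    Assumes pair effort = 2 * pair duration and solo effort = solo duration,
    so (2 * pair_duration) / solo_duration = 1 + effort_increase.
    """
    return (1.0 + effort_increase) / 2.0


# Effort increases quoted in the text: 15% and 100% from [2], 84% from [1].
for increase in (0.15, 0.84, 1.00):
    ratio = pair_duration_ratio(increase)
    print(f"effort +{increase:.0%}: pair duration is {ratio:.3f} x solo duration")
```

Under these assumptions a pair only reaches half the solo duration when the effort increase is zero, which is the break-even point described earlier.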

Unfortunately, the results are once again inconclusive at best, and in many cases they point to an actual decrease in scope and a minimal or near-zero increase in quality. For example, [2] reported a 29% decrease in productivity for pair programming teams, measured as completed use cases.

Regarding quality, even one of the more optimistic papers reported only a 10% to 20% increase in quality (measured as test cases passed) [3]. According to [2], there was only an 8% improvement in quality when measuring actual defects. While these improvements are nontrivial, when combined with the time and scope metrics they remain insufficient to offset the associated costs.

Cherry Picking

“But aren’t you just cherry picking the worst examples to justify your case?” you might ask. Not really, because even in the most optimistic research studies the initial results were usually much worse and only improved over time. For example, in [3] the increase in effort dropped from 60% to 15% over time. Most of the research attributes these improvements to “pair jelling”. In other words, as the pairs get to know each other they become more efficient.

The problem with these studies is that they assume that once a pair jells the gain will hold. However, in any real for-profit organization there is potential for high variability in projects and staff, which means pair jelling is unlikely to be a one-off cost. It is more likely a continuing cost to the business over time.

Several studies also point out that the value of pair programming decreases with simpler tasks [4]. Therefore one must consider the ratio of simple to complex tasks in any given development cycle in order to understand the long-term impact of pair programming. When I evaluated my own teams, I found multiple iterations where 75% of work items were smaller changes that could easily be tackled by a single developer in the same timeframe.

Finally, one paper [5] attempted to justify pair programming by evaluating Net Present Value (NPV). In this paper an argument is made that even if it costs more to pair program, faster time to market warrants the cost. I take issue with this calculation since it does not factor in the opportunity cost of having those extra resources not work on a different higher priority project.

For example if we take the reported 84% increase in effort and assume we finish our project in 9/10 the time of a single developer, we must ask ourselves what happens when a key customer asks for a critical bug fix? I can tell that customer to wait until I finish my current project or I can split my pair and work on both at the same time at the small cost of a 1/10 increase in duration. By splitting my pair I’ve delighted my key customer as quickly as possible at a trivial cost. Clearly you need to factor in the opportunity cost of not delighting that customer when evaluating the value of pair programming.
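The scheduling tradeoff in this example can be sketched with a few lines of arithmetic. The function name, the default ratio of 0.9 (the rounded 9/10 from earlier), and the assumptions that the bug report arrives just as the project starts and that a pair would fix the bug at the same relative speed are all illustrative, not taken from the cited studies.

```python
from typing import Dict, Tuple


def delivery_times(
    project_solo: float,
    bugfix_solo: float,
    pair_duration_ratio: float = 0.9,
) -> Dict[str, Tuple[float, float]]:
    """Return (project_done, bugfix_done) times for two staffing strategies.

    Durations are solo estimates in any consistent unit (e.g. days).
    Assumes the bug report arrives at time zero, when the project starts.
    """
    # Strategy 1: keep the pair together; the bug fix waits for the project.
    pair_project_done = pair_duration_ratio * project_solo
    pair_bugfix_done = pair_project_done + pair_duration_ratio * bugfix_solo

    # Strategy 2: split the pair; each developer works alone, in parallel.
    split_project_done = project_solo
    split_bugfix_done = bugfix_solo

    return {
        "keep_pair": (pair_project_done, pair_bugfix_done),
        "split_pair": (split_project_done, split_bugfix_done),
    }


times = delivery_times(project_solo=20.0, bugfix_solo=2.0)
for strategy, (project_done, bugfix_done) in times.items():
    print(f"{strategy}: project done at {project_done:.1f}, bug fix at {bugfix_done:.1f}")
```

With a 20-day project and a 2-day fix, splitting the pair delays the project by 2 days but gets the fix to the customer in 2 days rather than roughly 20.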

To Pair or Not to Pair

So should you pair or not pair? There are a lot of reasons a team might use pair programming. In some cases the cost / benefit tradeoff may be worthwhile. Pairing can be very effective at educating new team members, improving the skills of junior team members, cross training, and reducing the cost of complex tasks. If you take anything away from this post let it be:

Challenge the Efficacy of Pair Programming: If your team or engineering manager wants to exclusively use pair programming, don’t blindly accept it. Collect the data to validate whether it is really cost effective.

Pair when it Makes Sense: Use pairing selectively when it makes sense, including educating new team members, improving the skills of junior team members, cross training, and reducing the cost of complex tasks.

Factor in Opportunity Costs: Make sure you consider the opportunity costs of projects not being worked on when pairing.

In short, don’t allow yourself to be swayed by a dogmatic insistence that pair programming is better. As a leader your job is to challenge your team to delight customers in the most cost effective way possible. Pairing should only be used if it definitively contributes to that cause.

References

[1] Arisholm, Erik, et al. “Evaluating pair programming with respect to system complexity and programmer expertise.” IEEE Transactions on Software Engineering 33.2 (2007). – Summary available at https://pdfs.semanticscholar.org/9787/c9663cad3a1c21550f2e5e365e70fd01d3aa.pdf

[2] Vanhanen, Jari, and Casper Lassenius. “Effects of pair programming at the development team level: an experiment.” Empirical Software Engineering, 2005. 2005 International Symposium on. IEEE, 2005. https://pdfs.semanticscholar.org/40dd/fa666bf367cfffaae421dbd3c6170a3e3dc3.pdf

[3] Cockburn, Alistair, and Laurie Williams. “The costs and benefits of pair programming.” Extreme programming examined (2000): 223-247. http://www.cs.pomona.edu/~markk/cs121.f07/supp/williams_prpgm.pdf

[4] Lui, Kim, and Keith Chan. “When does a pair outperform two individuals?.” Extreme programming and agile processes in software engineering (2003): 1011-1011. ftp://nozdr.ru/biblio/kolxo3/Cs/CsLn/E/Extreme%20Programming%20and%20Agile%20Processes%20in%20Software%20Engineering,%204%20conf.,%20XP%202003(LNCS2675,%20Springer,%202003)(ISBN%203540402152)(479s)_CsLn_.pdf#page=240

[5] Padberg, Frank, and Matthias M. Muller. “Analyzing the cost and benefit of pair programming.” Software Metrics Symposium, 2003. Proceedings. Ninth International. IEEE, 2003. http://wwwipd.ira.uka.de/Tichy/uploads/publikationen/32/metrics03.pdf

Anyone who has worked in an agile organization has found that certain projects don’t quite fit the agile mold. Nowhere is this more apparent than with research-oriented projects. After all, if there is complete uncertainty in the scope and outcome of a project, as would be the case in a research project, how do you create user stories and estimate story points? And if you can’t create stories and estimate the associated costs, how can you hold your team accountable, communicate status to the rest of the organization, and make cost / benefit tradeoffs? Simple! You can’t.

I’ve personally dealt with this issue after hiring several researchers to work on an agile software product team. Initially, I struggled to interleave our research projects with our other production work so I started looking for a solution. The answer to my problem came after reviewing the agile literature and the scientific method and concluding that research projects really just represent an extreme of what the agile process is ultimately trying to solve. Below I will walk you through how I arrived at this solution and details on how you can apply similar tactics in your own research organization.

AGILE PROCESS

Early in my career at Microsoft someone handed me a copy of Steve McConnell’s book Code Complete.

At the time my greatest takeaway from that book was the concept of the “Cone of Uncertainty”. The “Cone of Uncertainty” states that the uncertainty of a given project decreases as time progresses and more details are fleshed out.

Historically the “Cone of Uncertainty” was dealt with by creating detailed upfront plans and using waterfall project management approaches. The trouble with those methodologies is that they’re extremely resistant to scope change, largely because scope change reintroduces uncertainty.

The Agile Manifesto attempts to eliminate the “Cone of Uncertainty” problem by following the principle of “Responding to change over following a plan”. Most agile methodologies use some form of iterative development to reduce uncertainty, the idea being that if you’re working on smaller, well-defined chunks of a larger project, uncertainty is removed and the project can gradually adapt to changing requirements. As Mike Cohn wrote in an article titled “The Certainty of Uncertainty”:

“The best way to deal with uncertainty is to iterate. To reduce uncertainty about what the product should be, work in short iterations and show (or, ideally give) working software to users every few weeks. Uncertainty about how to develop the product is similarly reduced by iterating. For example, missing tasks can be added to plans, inadequate designs can be corrected sooner rather than later, bad estimates can be amended, and so on.”

If I take the above information together I can conclude two things. First, the agile method attempts to reduce or eliminate uncertainty by making every project a function of smaller work items iterated over time. Or framed in mathematical notation:

Where: T = maximum number of iterations, M = backlog, N = user stories belonging to M
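With those symbols, the relationship can be sketched as a double sum. The names P (the project) and s_{t,n} (the work done on user story n during iteration t, zero when the story is not scheduled in that iteration) are assumptions for illustration, and N is read here as the number of user stories in the backlog M.

```latex
P = \sum_{t=1}^{T} \sum_{n=1}^{N} s_{t,n}
```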

Secondly, if a research project is really just a project with maximum uncertainty then the same framework should apply. Only there would be an unbounded number of work items over an unbounded amount of time. Or framed in mathematical notation:
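A sketch of that unbounded case, under assumed notation: P_research stands for the research project and s_{t,n} for the work done on work item n during iteration t, with neither the number of iterations nor the number of work items capped in advance.

```latex
P_{\text{research}} = \sum_{t=1}^{\infty} \sum_{n=1}^{\infty} s_{t,n}
```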

According to this logic a research project should actually work within an agile framework. We just need to figure out how to construct M (i.e. backlog) and how to bound M and T (i.e. number of iterations).

SCIENTIFIC METHOD

So what are reasonable user stories for a research project and why are they potentially infinite? It occurred to me that research in general follows the scientific method and that the scientific method may be a good framework for story generation.

In essence the scientific method can be boiled down to three phases: a research phase, an iterative hypothesis testing phase, and a communicate or productize phase. The unbounded component of research is that many hypotheses end in failure leading to another hypothesis that must be tested and this can potentially go on ad nauseam. This provided me a compelling framework for how to break research into user stories.

The first story in any research project corresponds to the first phase of the scientific method. This story should be a time-bounded spike that frames the initial question, covers any background research, and has as its acceptance criterion the generation of the stories required for the next phase of the project, hypothesis testing.

The next set of stories are all part of the hypothesis testing phase. These stories include any development work required to test the hypothesis, any data collection required, running the tests, and analyzing the results. If the hypothesis proves false the team should circle back to the background research phase and continue on with the process.

The final phase in this framework is only relevant when a hypothesis is proven to be true. This phase contains multiple stories including any communication or publishing of results, IP protection, and a handoff to whomever might be building the final product (which might be the same team). The final handoff story should also be a spike and the acceptance criteria should include the user stories required for the production deployment.
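The three phases above can be sketched as a small story generator. This is a minimal illustration, not a prescribed tool: the `Story` type, the `research_backlog` function, and the specific story titles are all hypothetical, and the outcome of each hypothesis is passed in as a boolean, whereas in practice it is only known once the testing stories are done.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Story:
    phase: str  # "research", "hypothesis", or "handoff"
    title: str
    is_spike: bool = False


def research_backlog(hypothesis_results: Iterable[bool]) -> List[Story]:
    """Build the ordered story list for one research project.

    hypothesis_results holds one boolean per hypothesis tested, in order:
    True means the hypothesis held, False means it was rejected.
    """
    backlog: List[Story] = []
    for attempt, held in enumerate(hypothesis_results, start=1):
        # Phase 1: time-boxed spike that frames the question and produces
        # the stories for the hypothesis-testing phase.
        backlog.append(
            Story("research", f"Background research spike #{attempt}", is_spike=True)
        )
        # Phase 2: the work needed to test one hypothesis (titles are illustrative).
        for task in ("Build test harness", "Collect data", "Run tests", "Analyze results"):
            backlog.append(Story("hypothesis", f"{task} (hypothesis #{attempt})"))
        if held:
            # Phase 3: only reached when a hypothesis holds.
            backlog.append(Story("handoff", "Communicate or publish results"))
            backlog.append(Story("handoff", "Protect IP"))
            backlog.append(Story("handoff", "Production handoff spike", is_spike=True))
            break
        # Rejected hypothesis: loop back to another background research spike.
    return backlog


for story in research_backlog([False, True]):
    print(f"[{story.phase}] {story.title}")
```

Each rejected hypothesis adds another research spike plus a round of testing stories, which is where the unbounded growth of the backlog comes from.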

BOUNDING AN UNBOUNDED PROJECT

Now how do you go about making sure research stories don’t go on forever? How do you bound T and M? And how do you communicate the cost / value tradeoffs with management?

I have found that the previously described framework only works if you apply the following guidelines together. Specifically:

For any research project to be considered, we must have enough information for the project to pass the “sniff test” (i.e. is it possible in a reasonable amount of time, and does it make business sense?).
The initial estimate for a research project is based on the expected number of hypothesis iterations, and the cost must be in line with the expected project value (i.e. if the research is perceived to have large value, it may be worth iterating for a long time).
If the number of hypothesis iterations exceeds the original estimate, the cost/benefit analysis must be revisited, and the project should be canceled if the cost has exceeded the expected value.
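These guidelines can be sketched as a simple review check run at the end of each hypothesis iteration. The `ResearchProject` fields, the `review` function, the four status strings, and the assumption of a flat cost per iteration are all hypothetical simplifications for illustration.

```python
from dataclasses import dataclass


@dataclass
class ResearchProject:
    expected_value: float       # perceived business value, same unit as cost
    cost_per_iteration: float   # assumed flat cost of one hypothesis iteration
    estimated_iterations: int   # initial estimate of hypothesis iterations
    passes_sniff_test: bool     # feasible in reasonable time and makes business sense


def review(project: ResearchProject, iterations_done: int) -> str:
    """Return "reject", "continue", "revisit", or "cancel" for a project."""
    # Guideline 1: the project must pass the sniff test to be considered.
    if not project.passes_sniff_test:
        return "reject"

    # Guideline 2: the initial estimate must be in line with the expected value.
    estimated_cost = project.cost_per_iteration * project.estimated_iterations
    if estimated_cost > project.expected_value:
        return "reject"

    # Guideline 3: once the estimate is exceeded, revisit the cost/benefit
    # analysis, and cancel if the money spent has exceeded the expected value.
    if iterations_done > project.estimated_iterations:
        spent = project.cost_per_iteration * iterations_done
        if spent > project.expected_value:
            return "cancel"
        return "revisit"

    return "continue"


project = ResearchProject(
    expected_value=100.0,
    cost_per_iteration=10.0,
    estimated_iterations=5,
    passes_sniff_test=True,
)
for done in (3, 6, 11):
    print(done, review(project, done))
```

A "revisit" result is the point at which the cost / value conversation with management happens, before the project drifts into "cancel" territory.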

CONCLUSION

What I have presented here is a process by which you can take an unbounded research project and place a structure around it that will work in companies using an agile development methodology. Besides allowing research projects to function in an agile organization, this framework also provides a method for bounding research problems and communicating the cost / benefit tradeoffs to management and other relevant parties. For those who have faced similar issues integrating research-oriented projects into an agile culture, I hope this methodology provides some ideas on how you can better integrate research into your processes.

Pair Programming or Bare(ly) Programming