Sometimes teams, and now well-accepted frameworks, misuse and misunderstand agile estimation concepts at a deep level, to the extent that all of the potential value of the practices is stripped away. This series covers the theory, the whys, and the whats behind some of these concepts and practices, so you and your teams can take back the power and value of agile estimation techniques!
Remember, eight Story Points means two days for our team!
– Many an individual Scrum Team
With a team of 5 coders and testers, use a Velocity of 40 points per Sprint.
– A suggestion a currently well-accepted scaled agile framework might make
Both of these statements fall victim to multiple misunderstandings regarding Story Points. I’ll call the two scenarios described here “normalizing Story Points,” as that’s a reasonable umbrella term to cover them both. They are slightly different, though, so I’ll also cover some of the differences below.
There are fundamental issues with normalizing Story Points that remove the advantage of, and any possible need to use, Story Points. In other words, normalizing Story Points also neutralizes Story Points: it makes them useless.
There are two key ideas that underpin the practice of using abstractions (like Story Points) to estimate. Sometimes I see teams and managers miss these through no fault of their own, either through a poor trainer/curriculum, or in the rush to learn the new mechanics of using abstractions and other concepts.
- You’re after innate sizing expressed in relative terms: building a space shuttle is innately “larger” – by whatever measure – than cleaning your garage, and that will remain true regardless of the number of space shuttles we build (a measure of experience and domain knowledge), how technology or tools advance (a measure of risk and familiarity), how many people work on building a shuttle (a crude measure/influencer of duration or throughput), etc. There are many debates about what a “Story Point” really is, but IMHO they’ve resolved to “some combination of risk, complexity, duration, unknowns, effort, etc.” – which is precise enough for me, because all of those things tend to correlate positively with one another.
- You’re detaching the unit you estimate with from the throughput rate. We know the throughput rate will vary – consider inevitable situations like people being added to and removed from teams, sick days, company events, holidays, vacations, weather emergencies, production firefighting, and a million other things we’ll never be able to predict. So we want to track and use throughput separately, based on averages of historical reality (the measure here is Velocity, and the method is empiricism). That is, we don’t need to estimate directly in the unit we need to answer questions about (time), and there is significant advantage in not doing so – both empiricism and the luxury of not having to re-estimate based on team changes (there’s a small sketch of this just after the list). Compare this to estimating directly in time, where you must build assumptions about team makeup and throughput (“How long for who? How long for this team of 4 people I know over here? OK, that’s one week then.”) directly into the size estimate itself, and therefore re-estimate when conditions change (“We only have 2 people on the team starting today? It certainly won’t take one week now!”).
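Here’s a minimal sketch of what “solve for time later, with empiricism” can look like. Every number and name below is invented for illustration: Velocity is just an average of what recent Sprints actually delivered, and time only enters the picture at the very end, when you divide the remaining Story Points by that observed average.

```python
# Minimal sketch -- every number here is made up for illustration.

# Actual, historical throughput per Sprint: whatever really happened,
# including sick days, firefighting, and team changes.
completed_points_per_sprint = [31, 24, 38, 27, 35]

# Velocity is an empirical average, not a promise or a target.
velocity = sum(completed_points_per_sprint) / len(completed_points_per_sprint)

# The backlog is estimated only in relative size (Story Points).
remaining_backlog_points = 240

# Only now do we answer a question about time. When conditions change,
# the rolling average changes; nothing in the backlog gets re-estimated.
sprints_remaining = remaining_backlog_points / velocity
print(f"Velocity ~ {velocity:.1f} points/Sprint, "
      f"~ {sprints_remaining:.1f} Sprints of work remaining")
```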
Normalizing Story Points destroys both of those ideas.
When a team says “8 Story Points means 2 days for us,” they’re solving for time up front: x Points = y Time Units.
A suggestion like “with a team of 5 coders and testers, use a Velocity of 40 points per Sprint” is slightly different, and it actually compounds the problems of the previous scenario: here the suggestion is to solve for both time and throughput up front. This is sometimes justified by the alleged ability to plan at a higher level (program/portfolio) by comparing velocities across teams. At least in the former case, the team stumbled into it themselves – here, we actually have people advising teams to do it on purpose. Ouch.
So why does normalizing Story Points neutralize Story Points? If you say, “8 Story Points means 1 week for a team of 6 people,” and then make a bunch of estimates against items in a backlog with that in mind, you’re:
- No longer talking about some (healthy) combination of risk/complexity/duration/unknowns/effort/etc. innate to what you’re estimating, as you are probably making specific statements about duration or elapsed time only
- Directly estimating in time (“1 week”), so the abstraction (“8 Story Points”) is now just obfuscation – and I’d call that negative value, for the record
- Making it impossible to base throughput on empirical evidence and to then answer questions about time later, as you’ve already estimated in time
- Building in an assumption that there will always be a 6-person team (or teams) executing on this backlog – exactly the sort of throughput assumption you *don’t* have to make when you solve for time by tracking Velocity as a Sprint-over-Sprint average (that is, when you have empirical process control, letting actual, historical team makeup changes, production firefighting, etc. influence the average and “come out in the wash” over time)
- Assuming the people who work in your company are fungible – that any 6-person team is like any other 6-person team – and making that toxic, wildly inaccurate assumption explicit and encouraged for everyone in the organization
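To make the contrast concrete, here’s a small, hypothetical comparison (every number is invented): the normalized conversion answers the time question up front using a team-size assumption that may never match reality, while the empirical approach keeps answering from whatever the real team actually delivered.

```python
# Hypothetical comparison -- all numbers are invented for illustration.
remaining_backlog_points = 240

# Normalized: "40 points per Sprint for a 5-person team" is decided up front.
# This forecast never changes, no matter who is actually on the team.
normalized_velocity = 40
print(remaining_backlog_points / normalized_velocity)   # always 6 Sprints, by assumption

# Empirical: the same backlog, forecast from what the real team (with its
# real sick days, turnover, and firefighting) actually delivered.
actual_points_per_sprint = [26, 31, 22, 29]
observed_velocity = sum(actual_points_per_sprint) / len(actual_points_per_sprint)
print(remaining_backlog_points / observed_velocity)     # ~8.9 Sprints
```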
I wouldn’t recommend normalizing Story Points. And with a healthy understanding of the theory – why we use Story Points and Velocity, and what we’re trying to achieve with them – I think a person would naturally back away from the idea.
I would recommend pushing the concepts and practices we talk about at the team level (abstract estimates, solving for time later via Velocity, empirical process control) up through the program and portfolio levels, using coarser units as you go up, and allowing for empiricism at those higher levels. You actually need empiricism the most up there, where even more unknown variables and unpredictable forces are at play. The details would be at least another blog post…or five…so for now, let’s stop here.