
Thursday, February 13, 2014

5 Steps for Using SAFe’s Weighted Shortest Job First Ratio (WSJF) for Backlog Prioritization

Exec. Summary:

By combining a ratio of business value to effort with the use of planning poker, a simple and standardized ratio can be used to help guide organizations in the process of backlog prioritization.

After wearing many SDLC hats over the past twenty plus years in the tech consulting arena including developer, data architect, systems analyst, business analyst, project manager, and product manager, I have come to greatly appreciate many of the principles found in the agile framework and its supporting techniques.  One of the more intriguing agile developments is applying agile not just to development teams but applying it upwardly throughout an organization including executive management.

One such development in this field is Dean Leffingwell’s Scaled Agile Framework (SAFe), which uses interconnected Kanban and Scrum boards to manage a company’s portfolio initiatives, programs, and associated projects.  A key idea here is that an organization’s initiatives can be broken into quarterly sized chunks of work, or epics, at the portfolio level (management strategy); these epics can then be broken into monthly sized chunks of work, or features, at the program level (coordination); and features can be broken into sprint-sized chunks of work, usually two-week user stories, at the project team level (execution).  The actual durations at these three levels really depend on what your teams are comfortable with when doing work estimation.

[Figure: Portfolio, Program (Pgm), and Project levels]
Image Source: adapted from Enterprise Agile presentation.

One of the more difficult tasks is finding a consistent way for management to prioritize business and architectural epic and feature backlogs. SAFe proposes an interesting approach called Weighted Shortest Job First (WSJF).  In a nutshell, WSJF is the ratio of an item’s business value to the size of the effort required to implement it (duration).  Items with a higher ratio deliver the greatest business value soonest and therefore should be implemented first.  If you are already familiar with these concepts and want to get to the 5 steps, you can skip ahead to Step 1.

SAFe suggests that business value be driven by a concept called Cost of Delay (COD): essentially, the opportunity cost of delaying the implementation of some functionality.  COD will be specific to your situation.

SAFe breaks business value into three categories:

1) Value to the User and/or Business – This includes impacts of delays on revenue, expenses, penalties, market share.
2) Time Criticality to User and/or Business – The impact of implementation delay to deadlines, regulatory requirements, user trust, and other dependent projects.
3) Risk Reduction or Opportunity Enablement – The risk reduction or unlocked potential of business opportunities that may be lost or gained based on delaying the release of functionality.

For work effort, the bottom half of the ratio, ideally you want to express this in terms of time. 

So the resulting ratio of a functional requirement looks something like this:

WSJF = (User/Business Value + Time Criticality + Risk Reduction) / Time

Business value is measured in relative points, like Fibonacci story points, and time is measured in work duration, e.g., days or hours.

The higher the ratio, the more valuable the item being evaluated should be to the business.
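
As a minimal sketch of the ratio above, the Python function below sums the three business value components before dividing by duration. The function name, parameter names, and example scores are illustrative assumptions; they are not part of SAFe or of the spreadsheet template.

```python
def wsjf(user_business_value, time_criticality, risk_reduction, duration_days):
    """Return (value + time criticality + risk reduction) / duration."""
    if duration_days <= 0:
        raise ValueError("duration_days must be positive")
    # The whole numerator (the Cost of Delay) is summed before dividing.
    cost_of_delay = user_business_value + time_criticality + risk_reduction
    return cost_of_delay / duration_days


# Hypothetical scores for a single item with a 14-day estimate.
example = wsjf(user_business_value=8, time_criticality=5,
               risk_reduction=3, duration_days=14)
print(round(example, 2))  # 1.14
```

Summing first matters: without the grouping, only the risk reduction score would be divided by time.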

If some of these factors make sense and some sound difficult to quantify, I wholeheartedly agree.

On a recent engagement with a well-known movie and game rental company (let’s just call them Bedrocks), I guided a pilot project to test out some SAFe concepts and see how well they would fare for managing several projects and providing prioritization guidance to the implementation teams. We liked the concept of WSJF, but arriving at the necessary values seemed challenging.
Participating teams included a newly formed Enterprise Product team and one of the architectural platform teams. Note that Bedrocks was just one of a few lines of business managed by its parent company. The Enterprise Product team, which was part of the parent company, oversaw product management for all lines of business, and the architectural platform teams, also part of the parent company, served all lines of business as well.  This is actually a critical point, because a method was needed to determine which line of business got access to a limited set of enterprise technical resources first.

We visualized our epics on a Kanban board that used four quarters for planning (Q1, Q2, Q3, Q4), and those epics linked to a child Kanban board of features represented in SDLC phases (Ready for Analysis, Analysis, Design, Build, Test, Ready to Deploy, Deploy). Don’t get too caught up in this part; it will be specific to your own environment, and I’ll discuss visualization in another article. The key to using this process is learning how to break down your work into manageable chunks that are understandable by the users at the Epic level (executive & program/product management) and the users at the Feature level (program/product management and project management).
To attack the complexity of this process we followed the keep it simple rule.
Before even touching a Kanban board we created our own spreadsheet to manage program epics, features, and WSJF ratios. 

Here are the five steps we took to aid us in prioritizing product features from multiple projects using the Weighted Shortest Job First (WSJF) ratio.

Some Assumptions:

  • Executive Initiatives consist of one or more programs.
  • Programs consist of one or more projects.
  • All organizational programs and their associated projects that affect our pilot implementation teams are known.
  • Epic sizes should be no more than 3-4 mos.
  • Feature sizes should be no more than 4–5 weeks.
  • Product Owners, Product Managers, and Technical team leads and architects impacted by a program are part of the epic and feature breakdown, estimation, and prioritization process for that program. 
  • You may decide to include some relevant project managers as well if it will help you define the pieces of the project.

Step 1: Break your Programs into Epics

For each project within a program, everyone in the room will need to agree on the major phases of functionality required to implement that project.  For example, if you are integrating a cloud-based HR system into your existing environment to replace a local legacy system, some of those high-level phases might look something like this:
  • Environment Setup
  • Integrate vended application with existing legacy systems
  • Migrate existing employee & contractor data into vended application
Determine whether each epic fits within approximately 3-4 months of work, preferably 3 or less.  This is where the tech leads and architects come in handy.
If not, break the items down further into sub-epics. Don’t get hung up on the durations just yet; you’ll be able to come back and correct this later.
So maybe further discussion produces the following:
  • Environment Setup 
  • Integrate vended application with existing legacy systems
    • Employee New Hire 
    • Employee Update
    • Employee Terminate
    • Contractor New Hire
    • Contractor Update
    • Contractor Terminate
  • Migrate Employees
    • Employee Core Data (name, role, team,  phone, supervisor, etc.)
    • Employee Benefits
  • Migrate Contractors
When you populate the WSJF spreadsheet, if an item has both a parent epic and a sub-epic, enter them in the Epic and Sub-Epic columns respectively.

Here are the ten epics we have so far:
| Epic | Sub-Epic |
|---|---|
| HR Cloud Environment Setup | n/a |
| HR Cloud Integration | Employee New Hire |
| HR Cloud Integration | Employee Update |
| HR Cloud Integration | Employee Terminate |
| HR Cloud Integration | Contractor New Hire |
| HR Cloud Integration | Contractor Update |
| HR Cloud Integration | Contractor Terminate |
| HR Cloud Data Migration | Employee Core Data |
| HR Cloud Data Migration | Employee Benefits |
| HR Cloud Data Migration | Contractor Core Data |

You might have expected me to reference each legacy system that needed to be integrated at this point, but I didn’t, for good reason. In general, you want to express your epics and features in terms the user will understand when validating that the system works, and users generally understand vertical slices of functionality more easily than which systems or databases need to be plugged into. In the case of architectural epics, because the users who will verify that the system works properly are technical people, the epics and features may appear very technical. The first epic in this list, ‘HR Cloud Environment Setup’, is likely an architectural epic whose features might be ‘Setup Dev Environment’, ‘Setup Test Environment’, etc. There may even be architectural epics to set up new data centers and/or middleware as well.

Expressing things in vertical functional slices also helps the user prioritize more easily and lets us know when something is truly done.
We’ll see shortly how we know which systems are affected.
Step 2: Break your Epics Into Features
Take each epic and break it into logical chunks of work that can be accomplished in 3-4 weeks.
Let’s take a look at HR Cloud Integration – Employee Update
When we edit an existing employee’s name, team, phone, desk location, or supervisor, we may want that information to show up in the employee directory. Here are some sample features related to this epic.

HR Cloud Integration – Employee Update Features:
  • Cloud Core Data Update (name, team, role, supervisor)
  • Facilities Location Update (building, floor, office#)
  • Telecom Update (desk phone, co. mobile phone)
The HR system is the source system for data such as name and team, but in some cases other systems, like Facilities and Telecom, are the source systems for an employee’s location and phone number respectively, and changes to that data must be sent to the HR system and perhaps other consumer systems. Even so, we’ve still expressed these as vertical slices of functionality that may or may not be restricted to one system.  You may also have to break your features into sub-features if they don’t seem like something you can get done in about 4-5 weeks.

Starting to get the idea here?

Before you can continue, the technical leads and architects should go through each feature and identify which teams may be impacted by it (i.e., have some work to perform).  You can enter this in the WSJF spreadsheet template by listing each team in its own column heading and then placing an ‘X’ in the cell corresponding to a feature. Teams may be architectural teams, development teams, vendors, etc. I’ve hidden the other epics for simplicity.
Epic Sub Epic Feature Team A Team B Team C Vendor A
. . .
HR Cloud Integration Employee Update HR Cloud Core Data Update X X
HR Cloud Integration Employee Update Facilities Location Update X X X
HR Cloud Integration Employee Update Telecom Update X X
. . .

Identifying the teams will make it very clear what resources are needed, which systems are impacted, and where user stories need to be written to complete a vertical functional slice. You could potentially list systems instead of teams, but the way things were structured at Bedrocks, there were dev teams responsible for systems and some teams that provided general platform support, like Database Admin/BI and Enterprise Infrastructure (networking). In an ideal agile world there would be one co-located team.
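
To illustrate how the team columns answer the "what resources are needed" question, here is a rough Python sketch that inverts a feature-to-team mapping into a per-team work list. The placement of each ‘X’ below is a hypothetical assumption (only the number of impacted teams per feature is taken from the table), and `features_by_team` is an illustrative name.

```python
# Hypothetical placement of the X marks; which team owns which X is assumed.
impact = {
    "HR Cloud Core Data Update": {"Team A", "Vendor A"},
    "Facilities Location Update": {"Team A", "Team B", "Vendor A"},
    "Telecom Update": {"Team A", "Team C"},
}


def features_by_team(impact):
    """Invert feature -> teams into team -> list of features needing that team."""
    by_team = {}
    for feature, teams in impact.items():
        for team in sorted(teams):
            by_team.setdefault(team, []).append(feature)
    return by_team


for team, features in sorted(features_by_team(impact).items()):
    print(f"{team}: {features}")
```

Each team's list is also a reasonable starting point for where user stories will need to be written.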

Step 3: Have your product owners use planning poker to assign feature business value.
Once you feel you’ve documented all of your epics and features, it’s time for the fun part. If you discover some new epics and features later, no worries; you’ll just need to add them and prioritize them as well.
Using the spreadsheet provided, your product owners, program managers, and possibly managers need to go through each feature and assign a relative score based on the Fibonacci series in terms of value.
Everyone should understand the concept of cost of delay and discuss the different kinds of value that a feature may have.  We felt it was far easier to have just one score rather than the separate scores for each type of business value that SAFe suggests.  To arrive at that one score when multiple people are involved, I highly recommend planning poker.  We used the following values (1, 3, 5, 8, 13, 21), but you can use any range you want.
Planning poker provides a great platform for managers of different systems to discuss why they think one feature is more important than another.
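
A single planning poker round can be sketched in Python as follows. This is only an illustration under stated assumptions: the rule that a score is accepted when every card matches, and the convention that the lowest and highest voters explain themselves first, are common planning poker practice rather than something prescribed here, and the voter names are made up.

```python
DECK = (1, 3, 5, 8, 13, 21)  # the card values used in this article


def poker_round(votes):
    """Return the agreed score, or None if the group should discuss and re-vote."""
    for name, card in votes.items():
        if card not in DECK:
            raise ValueError(f"{name} played {card}, which is not in the deck")
    distinct = set(votes.values())
    return distinct.pop() if len(distinct) == 1 else None


def outliers(votes):
    """Return the lowest and highest voters, who usually explain their view first."""
    low = min(votes, key=votes.get)
    high = max(votes, key=votes.get)
    return low, high


# Hypothetical round: three product owners score one feature.
first = {"PO A": 5, "PO B": 13, "PO C": 5}
print(poker_round(first))   # None, so the group discusses
print(outliers(first))      # ('PO A', 'PO B')

second = {"PO A": 8, "PO B": 8, "PO C": 8}
print(poker_round(second))  # 8
```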

Step 4: Have your technical leads use planning poker to assign feature T-shirt sizes (high level estimates).
The next step is to have your technical leads and architects identify high-level T-shirt-sized estimates for each feature.  If you have more than one technical lead or architect involved in this process, I recommend using planning poker in the same way you did for business value. This is another chance for great discussion amongst all of the tech leads. It also lets different teams gain a greater appreciation for what another team does and/or what input it needs to do its job, especially when considering integration of systems.

Before beginning this step you’ll need to agree on your work-effort sizes, assign an expected duration to each size in days, and optionally a relative point value. 

Here’s how we did it:
| T-Shirt Size | Time (days) | Story Points |
|---|---|---|
| XS | 1 | 1 |
| S | 7 | 2 |
| M | 14 | 5 |
| L | 30 | 8 |
| XL | 60 | 13 |
| XXL | 90 | 21 |
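
If you would rather keep this lookup in code than in a spreadsheet, a minimal Python sketch might look like the following. The values are copied from the table above; the names `T_SHIRT_SIZES` and `estimate` are illustrative assumptions.

```python
# Size -> (duration in days, feature story points), copied from the table above.
T_SHIRT_SIZES = {
    "XS": (1, 1),
    "S": (7, 2),
    "M": (14, 5),
    "L": (30, 8),
    "XL": (60, 13),
    "XXL": (90, 21),
}


def estimate(size):
    """Look up (days, story_points) for a T-shirt size, ignoring case and spaces."""
    try:
        return T_SHIRT_SIZES[size.strip().upper()]
    except KeyError:
        raise ValueError(f"unknown T-shirt size: {size!r}") from None


days, points = estimate("xl")
print(days, points)  # 60 13
```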

T-shirt sizes provide a very general way of describing effort without getting into too much detail.
Step 5: Review your WSJFs and Feature Dependencies to determine the order of your Feature Backlog.
Now that you have identified the business value and a rough time estimate for each feature, you can calculate your Weighted Shortest Job First (WSJF) ratio. In the provided WSJF spreadsheet, the ratio is calculated automatically for each feature.
| Epic | Sub-Epic | Feature | Bus Value | T-Shirt Size | Time (days) | WSJF | Story Pts |
|---|---|---|---|---|---|---|---|
| . . . | | | | | | | |
| HR Cloud Integration | Employee Update | HR Cloud Core Data Update | 21 | XL | 60 | 0.35 | 13 |
| HR Cloud Integration | Employee Update | Facilities Location Update | 5 | M | 14 | 0.36 | 5 |
| HR Cloud Integration | Employee Update | Telecom Update | 13 | M | 14 | 0.93 | 5 |
| . . . | | | | | | | |
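
For readers who want to check the arithmetic outside the spreadsheet, here is a short Python sketch that recomputes the three ratios shown above (single business value score divided by days) and ranks the features. The spreadsheet's actual formula is not reproduced here; this is simply the same division, with illustrative variable names.

```python
# (feature, business value, time in days) for the three rows shown above.
features = [
    ("HR Cloud Core Data Update", 21, 60),
    ("Facilities Location Update", 5, 14),
    ("Telecom Update", 13, 14),
]


def wsjf(business_value, days):
    """Single-score WSJF as used in this article: business value / days."""
    return round(business_value / days, 2)


# Highest ratio first; ranking uses the unrounded ratio to avoid false ties.
ranked = sorted(features, key=lambda f: f[1] / f[2], reverse=True)
for name, value, days in ranked:
    print(f"{name}: {wsjf(value, days)}")
# Telecom Update: 0.93
# Facilities Location Update: 0.36
# HR Cloud Core Data Update: 0.35
```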

As each feature is entered in your backlog you can now include the WSJF and optionally the business value and t-shirt size associated with that feature card.

An interesting side effect of this approach is that some features will have really low WSJF values because they take a long time to implement. To increase a feature’s WSJF, it will be necessary to break it down into smaller pieces that can be implemented in a shorter period of time. In the example above, the HR Cloud Core Data Update was estimated at 60 days. Breaking it down into smaller vertical functional slices that can be completed in 30 days or less will make it easier to manage and likely increase its WSJF ratio.
You may also find a need to go back and re-evaluate the business value of some items that needed to be broken into smaller pieces.
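
As a worked illustration of that effect, the Python sketch below splits the 60-day feature into two 30-day slices. The slice names and the re-scored business values (13 and 8) are purely hypothetical; your product owners would re-score the slices themselves.

```python
def wsjf(business_value, days):
    """Single-score WSJF: business value / days."""
    return business_value / days


original = wsjf(21, 60)  # HR Cloud Core Data Update as estimated above

# Hypothetical re-scoring after splitting into two 30-day slices.
slices = {
    "Core Data Update, slice 1": (13, 30),
    "Core Data Update, slice 2": (8, 30),
}
for name, (value, days) in slices.items():
    print(f"{name}: {wsjf(value, days):.2f} (original: {original:.2f})")
# Core Data Update, slice 1: 0.43 (original: 0.35)
# Core Data Update, slice 2: 0.27 (original: 0.35)
```

Note that a slice only gains if its business value shrinks less than its duration does, which is why the re-evaluation of business value matters.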

The WSJF is not the only factor in prioritization and care must be taken to identify any dependencies amongst features as well. So even though Feature C may have a much larger ratio than Feature B, Feature C may not be able to be developed until Feature B is in place, or perhaps a particular team won’t be ready to implement something until later. In the WSJF spreadsheet provided, I included a column to identify which release each feature is expected to be part of, which should aid in identifying dependencies and help focus your prioritization within a release.
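
One way to combine the ratio with dependencies is sketched below in Python: at each step, pick the highest-WSJF feature whose prerequisites are already scheduled. This is an assumed, simple greedy approach (a priority-based topological sort), not something the spreadsheet does; `order_backlog`, the feature names, and the scores are all illustrative.

```python
import heapq


def order_backlog(wsjf_by_feature, depends_on):
    """Order features by WSJF without placing any feature before its prerequisites."""
    remaining = {f: set(depends_on.get(f, ())) for f in wsjf_by_feature}
    dependents = {f: [] for f in wsjf_by_feature}
    for feature, prereqs in remaining.items():
        for prereq in prereqs:
            if prereq not in dependents:
                raise ValueError(f"unknown dependency: {prereq}")
            dependents[prereq].append(feature)

    # Max-heap on WSJF via negated scores; only unblocked features are eligible.
    ready = [(-wsjf_by_feature[f], f) for f, p in remaining.items() if not p]
    heapq.heapify(ready)

    ordered = []
    while ready:
        _, feature = heapq.heappop(ready)
        ordered.append(feature)
        for dependent in dependents[feature]:
            remaining[dependent].discard(feature)
            if not remaining[dependent]:
                heapq.heappush(ready, (-wsjf_by_feature[dependent], dependent))

    if len(ordered) != len(wsjf_by_feature):
        raise ValueError("circular dependency detected")
    return ordered


# Hypothetical scores: Feature C has the best ratio but needs Feature B first.
scores = {"Feature A": 0.5, "Feature B": 0.3, "Feature C": 0.9}
deps = {"Feature C": ["Feature B"]}
print(order_backlog(scores, deps))
# ['Feature A', 'Feature B', 'Feature C']
```

This greedy ordering does not pull Feature B forward just because it unblocks a valuable Feature C; that remains a judgment call for the people in the room.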

Why feature story points?

Trying to track user story points by rolling them up to a program level becomes very complex and often meaningless, because different implementation teams may attach different meanings to their point systems. A 13 pt. user story for Team A will represent a different level of effort than a 13 pt. user story for Team B. By focusing instead on story points at the feature level, using values agreed upon amongst tech leads and architects, we get a normalized set of story points that can be used to track velocity and cycle time for features across different systems. SAFe discusses the need for normalization of points, and I think this provides an elegant way of resolving the issue. Let individual teams manage their own story points, and let tech leads and architects manage feature story points.
For this to work, however, the tech leads and architects need to hold retrospectives to evaluate the accuracy of their estimates. Often, teams that require integration never get a chance to review this kind of information. So even if you don’t use story points, I highly recommend allocating some time at the end of every quarter or release for tech leads to review how well they did on estimation, so they can improve their estimates for the next release.

About the WSJF Spreadsheet Template:

The WSJF spreadsheet provided leverages Excel’s grouping function so that you can expand and hide epics and features. It may be something you don’t often use in Excel and can be a little tricky to handle, but I think you will find it helpful once you get the hang of it. You may also find it useful to copy the Epic-Feature tab for each program or for each project.
It also includes an additional tab for tracking issues that come up during this process, so you can table them at the time of discussion and move on. Lastly, it includes a tab for calculating a team’s iteration cost, so you can do some quick epic and feature budget estimation.
Be sure to download the spreadsheet before using it; it will not be functional when viewed in Google Docs.

Final Comments:

By breaking your work down into epic sized (<= 3 mos.) and feature sized (<=30 days) functional vertical slices, and having product management work with the technical leads to estimate business value and work effort using planning poker, you can arrive at a consistent way to manage and prioritize work across the enterprise. 

A major tenet of Agile is continuous improvement. Expect that your own process for implementing weighted-shortest-job-first prioritization will require some changes to the steps and resources presented here.   I recommend you start with a small pilot project first to work out the kinks.

I would greatly welcome your feedback and would like to hear about your own experiences with backlog prioritization.


  1. Regarding "The WSJF is not the only factor in prioritization and care must be taken to identify any dependencies", I have to say that I've solved the problem by integrating the dependency into the Cost of Delay.

    I've called it Recursive Cost of Delay; it expresses the CoD of all dependencies, applied recursively.

    Formula is:

    RCoD = CoD (of current story) + RCoD dependency 1 + RCoD dependency 2 + ... RCoD dependency N


    Story 1: CoD = 4
    Story 2: CoD = 1, depends on Story 1
    Story 3: CoD = 2, depends on Story 2

    The RCoD of Story 3 will be 2 + 1 + 4 = 7 instead of 2.

    Story 3 is therefore more important, so it goes up in backlog priority.

    If used with WSJF you have to use the RCoD in place of CoD like:

    WSJF = RCoD / Size
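
The recursive formula in this comment can be sketched in Python as follows. This is one reading of the comment, applied literally: the `rcod` function name and the data layout are assumptions, and the cycle check is an added safeguard that the comment does not mention.

```python
def rcod(story, cod, depends_on, _path=None):
    """Recursive Cost of Delay: a story's own CoD plus the RCoD of each dependency."""
    path = _path or set()
    if story in path:
        raise ValueError(f"circular dependency involving {story}")
    # Applied literally, a dependency shared by two branches is counted once per branch.
    return cod[story] + sum(
        rcod(dep, cod, depends_on, path | {story})
        for dep in depends_on.get(story, ())
    )


# The three stories from the comment's example.
cod = {"Story 1": 4, "Story 2": 1, "Story 3": 2}
depends_on = {"Story 2": ["Story 1"], "Story 3": ["Story 2"]}

print(rcod("Story 3", cod, depends_on))  # 7
```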

    1. Thanks for the feedback on handling story dependencies, great way of handling it.