Developer Productivity/Experience
There are many ways to approach this, and different approaches will suit different organisations and teams
A team which is highly motivated, satisfied, works with tooling which enables them to ship quickly, and consistently delivers quality software is highly effective.
"Is my team highly effective?" is a question that I've asked myself many times and a lot of the time, it's hard to quantify. Usually, I would just be able to tell whether my team is performing well based on my gut feeling. However, this feeling is difficult to articulate, especially to senior management.
Metrics
Quantitative data is the first place I thought to look at as it's easy to understand. Good numbers go up and bad numbers go down. However, this didn't necessarily paint the most accurate/healthy picture of my team.
Common Metrics
This is a non-exhaustive list of ways I've either read metrics being used or have used/seen them being used personally.
Productivity Metrics
Pull Request (PR) cycle time - the time taken from the creation of a PR to the time it gets merged/closed
Lead time - the time taken from first commit to deploying to production
Time to open PR from first commit
PR review time
Time spent manual testing
PR size
Deployment frequency
Onboarding time to productivity - for example, the time taken for a new engineer to merge their 10th PR
Lines of code committed
Non-Productivity Metrics
Qualitative feedback collected using surveys
Psychological safety
Taking risks
Giving feedback
Interruptions to flow
Context switching
Production incidents
Site up time
Reverts
Error volume
In my first attempt to look at the productivity of my team, I created a dashboard using data from Github for the following metrics:
Lead time
Deployment frequency
PR cycle time
Time to open PR from first commit
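As an illustration, the first two of those GitHub-derived metrics can be computed from pull request timestamps. Here is a minimal sketch over hypothetical PR records; the field names are my own shorthand for illustration, not the actual GitHub API field names:

```python
from datetime import datetime
from statistics import median

# Hypothetical PR records (field names are illustrative, not GitHub's).
prs = [
    {"first_commit_at": "2024-03-01T09:00:00", "opened_at": "2024-03-01T15:00:00",
     "merged_at": "2024-03-02T11:00:00"},
    {"first_commit_at": "2024-03-03T10:00:00", "opened_at": "2024-03-04T10:00:00",
     "merged_at": "2024-03-05T10:00:00"},
]

def hours_between(start: str, end: str) -> float:
    """Elapsed hours between two ISO-8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 3600

# PR cycle time: creation of the PR to the time it gets merged/closed.
cycle_times = [hours_between(p["opened_at"], p["merged_at"]) for p in prs]
# Time to open a PR, measured from the first commit.
open_times = [hours_between(p["first_commit_at"], p["opened_at"]) for p in prs]

print(f"median PR cycle time: {median(cycle_times):.1f}h")
print(f"median time to open PR: {median(open_times):.1f}h")
```

Medians are worth preferring over means here, since a single long-lived PR can otherwise dominate the number.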
While I understood that these numbers didn't truly reflect the overall productivity of my team, they gave me a starting point to bring to my team and dig deeper.
Anti-Patterns
I attempted to use Lines of Code, PR size and Deployment frequency on an individual level for one of my teams and I received some negative feedback. My engineers felt that they were being micromanaged and were fearful of having a slow week or a complex task on their plate reflecting poorly in the numbers.
This was my first experience with negative feedback in how I tracked my team's productivity and I now realise that it is a common anti-pattern. Below is a list of anti-patterns I have read about/been guilty of doing myself:
Measuring the team on an individual level instead of at the team level - Avoid micromanaging individual engineers to create an environment where they feel psychologically safe to do their best work.
Avoid busywork/vanity metrics - The team/management must know why measured metrics are relevant to the team/business. Ensure that you're not tracking things just because they look good but are ultimately worthless. This will cause the team to lose trust in you and management.
Using story points for velocity tracking - At best, story points are useful for sprint planning. However, things change during execution (last-minute requests, new information) that cause the actual effort required to shift.
Weaponising metrics without alignment from the team and management - If I were to track a metric and raise concerns about its decline with one of my engineers out of the blue, they would lose trust in me as a manager.
Using Frameworks with Metrics
Recently, I stumbled across two productivity frameworks popular in the engineering management scene that resonate with my current thinking and previous experience.
DevOps Research and Assessment (DORA)
Small changes with frequent releases
This is a framework developed for DevOps teams to think about their software-driven value delivery. There are 4 main metrics that make up DORA:
Deployment frequency - how often does the team release to production?
Mean time to recovery (MTTR) from a production outage
Change failure rate - of all releases, how many contain defects?
Lead time - how long does it take for a commit to get to production?
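To make the four definitions concrete, here is a minimal sketch that computes them from a deployment log over a hypothetical one-week window. The record shape is an assumption for illustration, not the format of any particular tool:

```python
from datetime import datetime

# Hypothetical deployment log (field names are illustrative assumptions).
deployments = [
    {"at": "2024-03-01T10:00", "commit_at": "2024-03-01T08:00", "failed": False},
    {"at": "2024-03-02T10:00", "commit_at": "2024-03-01T16:00", "failed": True,
     "recovered_at": "2024-03-02T12:00"},
    {"at": "2024-03-04T10:00", "commit_at": "2024-03-04T09:00", "failed": False},
    {"at": "2024-03-05T10:00", "commit_at": "2024-03-05T07:00", "failed": False},
]

def hours(start: str, end: str) -> float:
    """Elapsed hours between two ISO-8601 timestamps."""
    return (datetime.fromisoformat(end)
            - datetime.fromisoformat(start)).total_seconds() / 3600

days_in_window = 7
# How often the team releases to production (deploys per day).
deployment_frequency = len(deployments) / days_in_window
# Of all releases, the share that contained defects.
change_failure_rate = sum(d["failed"] for d in deployments) / len(deployments)
# Mean hours from a commit landing to it reaching production.
lead_times = [hours(d["commit_at"], d["at"]) for d in deployments]
lead_time = sum(lead_times) / len(lead_times)
# Mean hours from a failed deploy to recovery (MTTR).
recoveries = [hours(d["at"], d["recovered_at"]) for d in deployments if d["failed"]]
mttr = sum(recoveries) / len(recoveries)
```

In practice these numbers would come from your CI/CD system and incident tracker rather than a hand-written list, but the arithmetic is this simple.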
These metrics are useful for teams that are not yet practicing Continuous Delivery (CD), but the framework was never intended to measure the productivity of a specific team.
While it's digestible to a wider audience outside of engineering, its main purpose is benchmarking your team across the organisation and industry.
SPACE
Combining self report data with automatic measures
SPACE is made up of five dimensions:
Satisfaction & Wellbeing
How fulfilled engineers feel with their work, team, tools or culture; wellbeing is how healthy and happy they are, and how their work impacts it
Employee attrition
Code review process
Performance
The outcome of a system or process
Number of incidents
Activity
A count of actions or outputs completed in the course of performing work
Code reviews completed
Communication & Collaboration
How people and teams communicate and work together
Efficiency & Flow
The ability to progress with their work with minimal interruptions or delays, whether individually or through a system
Uninterrupted coding time
Code review turnaround time
These dimensions and the metrics within them are designed to be held in tension with one another, which guards against vanity metrics by building accountability into the system of measurement. For example, metrics about speed are balanced by metrics related to quality and satisfaction.
I feel that SPACE provides a more holistic and reflective measurement of the team that goes beyond productivity to include the developer experience.
Surveys
Surveys are an important part of SPACE that could be incorporated into how a manager measures their team's health. They capture the perspectives and attitudes of the engineers themselves, providing fast, accurate data about perceptions and behaviours.
Survey design is a complex art; done poorly, it can create more confusion than clarity. Good surveys are valid and reliable, demonstrating good psychometric properties. In general:
Survey items need to be carefully worded and every question should only ask one thing
If you want to compare results between surveys, you can't change any wording at all
If you change any wording, do rigorous statistical tests
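As one example of a psychometric check, Cronbach's alpha estimates how consistently a group of survey items measures the same underlying construct. A minimal sketch with made-up Likert-scale responses:

```python
from statistics import pvariance

# Hypothetical 1-5 Likert responses: rows are respondents, columns are
# items intended to measure the same construct (e.g. tooling satisfaction).
responses = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]

def cronbach_alpha(rows):
    """Cronbach's alpha: (k / (k-1)) * (1 - sum(item variances) / total variance)."""
    k = len(rows[0])                                   # number of survey items
    items = list(zip(*rows))                           # one tuple per item
    item_vars = sum(pvariance(col) for col in items)   # summed per-item variances
    total_var = pvariance([sum(r) for r in rows])      # variance of total scores
    return (k / (k - 1)) * (1 - item_vars / total_var)

alpha = cronbach_alpha(responses)
```

Values above roughly 0.7 are conventionally taken as acceptable internal consistency; and as the points above note, rewording an item between survey rounds invalidates this kind of comparison.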
How to Get Started?
For small, scaling teams, you could look into:
Deployment frequency
Lead time
Developer onboarding - through both quantitative (e.g. time to first 10 deployments) and qualitative data (surveys)
Number of incidents
Satisfaction with developer tools and working environment
While these metrics aren't exhaustive, they provide a good mix of qualitative and quantitative data to point you in the right direction to ask deeper questions and address pain points.
As the organisation gets larger, you could look into formalising:
DORA metrics
Attrition
Allocation of work (features vs bugs vs maintenance)
Incident frequency
On-time delivery
Lessons Learned
Align on the definition of developer productivity with your team and management to ensure no one is taken by surprise during performance discussions and team health checks
Stop chasing single metrics - the health of the team cannot be boiled down to a single metric. Take a holistic approach to checking in on your team.
Balance self reported data with automatic measurements
Avoid busywork/vanity metrics
Don't let tools make decisions about what is important for your team
Use metrics to measure system efficiency, not individual performance
Use DORA metrics to benchmark performance
Development teams want to be consulted about their productivity and performance and not just informed