Developer Productivity/Experience
There are many ways to approach this, and different approaches will suit different organisations and teams
A team which is highly motivated, satisfied, works with tooling which enables them to ship quickly, and consistently delivers quality software is highly effective.
"Is my team highly effective?" is a question that I've asked myself many times and a lot of the time, it's hard to quantify. Usually, I would just be able to tell whether my team is performing well based on my gut feeling. However, this feeling is difficult to articulate, especially to senior management.
Metrics
Quantitative data is the first place I thought to look at as it's easy to understand. Good numbers go up and bad numbers go down. However, this didn't necessarily paint the most accurate/healthy picture of my team.
Common Metrics
This is a non-exhaustive list of ways I've either read metrics being used or have used/seen them being used personally.
Productivity Metrics
Pull Request (PR) cycle time - the time taken from the creation of a PR to the time it gets merged/closed
Lead time - the time taken from first commit to deploying to production
Time to open PR from first commit
PR review time
Time spent manual testing
PR size
Deployment frequency
Onboarding time to productivity - for example, the time taken for a new engineer to merge their 10th PR
Lines of code committed
Non-Productivity Metrics
Qualitative feedback collected using surveys
Psychological safety
Taking risks
Giving feedback
Interruptions to flow
Context switching
Production incidents
Site up time
Reverts
Error volume
In my first attempt to look at the productivity of my team, I created a dashboard using data from Github for the following metrics:
Lead time
Deployment frequency
PR cycle time
Time to open PR from first commit
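As an illustration, the first two of those GitHub-derived metrics can be computed from pull request timestamps. Here is a minimal sketch over hypothetical PR records; the field names are my own shorthand for illustration, not the actual GitHub API field names:

```python
from datetime import datetime
from statistics import median

# Hypothetical PR records (field names are illustrative, not GitHub's).
prs = [
    {"first_commit_at": "2024-03-01T09:00:00", "opened_at": "2024-03-01T15:00:00",
     "merged_at": "2024-03-02T11:00:00"},
    {"first_commit_at": "2024-03-03T10:00:00", "opened_at": "2024-03-04T10:00:00",
     "merged_at": "2024-03-05T10:00:00"},
]

def hours_between(start: str, end: str) -> float:
    """Elapsed hours between two ISO-8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 3600

# PR cycle time: creation of the PR to the time it gets merged/closed.
cycle_times = [hours_between(p["opened_at"], p["merged_at"]) for p in prs]
# Time to open a PR, measured from the first commit.
open_times = [hours_between(p["first_commit_at"], p["opened_at"]) for p in prs]

print(f"median PR cycle time: {median(cycle_times):.1f}h")
print(f"median time to open PR: {median(open_times):.1f}h")
```

Medians are worth preferring over means here, since a single long-lived PR can otherwise dominate the number.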
While I understood that these numbers didn't truly reflect the overall productivity of my team, they gave me a starting point to bring to my team and dig deeper.
Anti-Patterns
I attempted to use Lines of Code, PR size and Deployment frequency on an individual level for one of my teams and I received some negative feedback. My engineers felt that they were being micromanaged and were fearful of having a slow week or a complex task on their plate reflecting poorly in the numbers.
This was my first experience with negative feedback in how I tracked my team's productivity and I now realise that it is a common anti-pattern. Below is a list of anti-patterns I have read about/been guilty of doing myself:
Measuring the team on an individual level instead of at the team level - Avoid micromanaging individual engineers to create an environment where they feel psychologically safe to do their best work.
Avoid busywork/vanity metrics - The team/management must know why measured metrics are relevant to the team/business. Ensure that you're not tracking things just because they look good but are ultimately worthless. This will cause the team to lose trust in you and management.
Using story points for velocity tracking - At best, story points are useful for sprint planning. However, things change during execution (last-minute requests, new information) that cause the actual effort required to shift.
Weaponising metrics without alignment from the team and management - If I were to track a metric and raise concerns about its decline with one of my engineers out of the blue, they would lose trust in me as a manager.
Using Frameworks with Metrics
Recently, I stumbled across two productivity frameworks popular in the engineering management scene that resonate with my current thinking and previous experience.
DevOps Research and Assessment (DORA)
Small changes with frequent releases
This is a framework developed for DevOps teams to think about their software-driven value delivery. There are 4 main metrics that make up DORA:
Deployment frequency - how often does the team release to production?
Mean time to recovery (MTTR) from a production outage
Change failure rate - of all releases, how many contain defects?
Lead time - how long does it take for a commit to get to production?
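To make the four definitions concrete, here is a minimal sketch that computes them from a deployment log over a hypothetical one-week window. The record shape is an assumption for illustration, not the format of any particular tool:

```python
from datetime import datetime

# Hypothetical deployment log (field names are illustrative assumptions).
deployments = [
    {"at": "2024-03-01T10:00", "commit_at": "2024-03-01T08:00", "failed": False},
    {"at": "2024-03-02T10:00", "commit_at": "2024-03-01T16:00", "failed": True,
     "recovered_at": "2024-03-02T12:00"},
    {"at": "2024-03-04T10:00", "commit_at": "2024-03-04T09:00", "failed": False},
    {"at": "2024-03-05T10:00", "commit_at": "2024-03-05T07:00", "failed": False},
]

def hours(start: str, end: str) -> float:
    """Elapsed hours between two ISO-8601 timestamps."""
    return (datetime.fromisoformat(end)
            - datetime.fromisoformat(start)).total_seconds() / 3600

days_in_window = 7
# How often the team releases to production (deploys per day).
deployment_frequency = len(deployments) / days_in_window
# Of all releases, the share that contained defects.
change_failure_rate = sum(d["failed"] for d in deployments) / len(deployments)
# Mean hours from a commit landing to it reaching production.
lead_times = [hours(d["commit_at"], d["at"]) for d in deployments]
lead_time = sum(lead_times) / len(lead_times)
# Mean hours from a failed deploy to recovery (MTTR).
recoveries = [hours(d["at"], d["recovered_at"]) for d in deployments if d["failed"]]
mttr = sum(recoveries) / len(recoveries)
```

In practice these numbers would come from your CI/CD system and incident tracker rather than a hand-written list, but the arithmetic is this simple.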
These metrics are useful for teams that are not yet practicing Continuous Delivery (CD), but the framework was never intended to measure the productivity of a specific team.
While it's digestible to a wider audience outside of engineering, its main purpose is benchmarking your team across the organisation and industry.
SPACE
Combining self report data with automatic measures
SPACE is made up of five dimensions:
Satisfaction & Wellbeing
How fulfilled engineers feel with their work, team, tools or culture; wellbeing is how healthy and happy they are, and how their work impacts it
Employee attrition
Code review process
Performance
The outcome of a system or process
Number of incidents
Activity
A count of actions or outputs completed in the course of performing work
Code reviews completed
Communication & Collaboration
How people and teams communicate and work together
Efficiency & Flow
The ability to progress with their work with minimal interruptions or delays, whether individually or through a system
Uninterrupted coding time
Code review turnaround time
These dimensions and the metrics within them are designed to be held in tension with one another, which guards against vanity metrics by building accountability into the system of measurement. For example, metrics about speed are balanced by metrics related to quality and satisfaction.
I feel that SPACE provides a more holistic and reflective measurement of the team that goes beyond productivity to include the developer experience.
Surveys
Surveys are an important part of SPACE that could be incorporated into how a manager measures their team's health. They capture the perspectives and attitudes of the engineers themselves, providing fast, accurate data about perceptions and behaviours.
Survey design is a complex art; done poorly, it can create more confusion than clarity. Good surveys are valid and reliable, demonstrating good psychometric properties. In general:
Survey items need to be carefully worded and every question should only ask one thing
If you want to compare results between surveys, you can't change any wording at all
If you change any wording, do rigorous statistical tests
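As one example of a psychometric check, Cronbach's alpha estimates how consistently a group of survey items measures the same underlying construct. A minimal sketch with made-up Likert-scale responses:

```python
from statistics import pvariance

# Hypothetical 1-5 Likert responses: rows are respondents, columns are
# items intended to measure the same construct (e.g. tooling satisfaction).
responses = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]

def cronbach_alpha(rows):
    """Cronbach's alpha: (k / (k-1)) * (1 - sum(item variances) / total variance)."""
    k = len(rows[0])                                   # number of survey items
    items = list(zip(*rows))                           # one tuple per item
    item_vars = sum(pvariance(col) for col in items)   # summed per-item variances
    total_var = pvariance([sum(r) for r in rows])      # variance of total scores
    return (k / (k - 1)) * (1 - item_vars / total_var)

alpha = cronbach_alpha(responses)
```

Values above roughly 0.7 are conventionally taken as acceptable internal consistency; and as the points above note, rewording an item between survey rounds invalidates this kind of comparison.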
How to Get Started?
For small, scaling teams, you could look into:
Deployment frequency
Lead time
Developer onboarding - through both quantitative (e.g. time to first 10 deployments) and qualitative data (surveys)
Number of incidents
Satisfaction with developer tools and working environment
While these metrics aren't exhaustive, they provide a good mix of qualitative and quantitative data to point you in the right direction to ask deeper questions and address pain points.
As the organisation gets larger, you could look into formalising:
DORA metrics
Attrition
Allocation of work (features vs bugs vs maintenance)
Incident frequency
On-time delivery
Lessons Learned
Align on the definition of developer productivity with your team and management to ensure no one is taken by surprise during performance discussions and team health checks
Stop chasing single metrics - the health of the team cannot be boiled down to a single metric. Take a holistic approach to checking in on your team.
Balance self reported data with automatic measurements
Avoid busywork/vanity metrics
Don't let tools make decisions about what is important for your team
Use metrics to measure system efficiency, not individual performance
Use DORA metrics to benchmark performance
Development teams want to be consulted about their productivity and performance and not just informed