In The Structure of Scientific Revolutions, Thomas Kuhn observed that a field of study grows more complex until a key insight radically shifts the paradigm its practitioners use, opening up new applications and research areas.
I believe that developer productivity (or DX, for developer experience) is near a ‘peak complexity’ moment, and is poised for a key insight that radically simplifies our understanding. We’ll know we’ve cracked the nut for a useful DX paradigm when novice engineers can measure and interpret DX metrics, and when leaders outside of engineering seek out those metrics because they connect clearly to business goals (product quality, revenue growth, etc.).
Currently, the DX metrics that are easy to measure (like ticket lead time and deploys per day) are bemoaned as “vanity metrics,” while what truly matters is complicated, nuanced, and hard to quantify.
To be sure, I think research like the DORA Report (with its four key metrics: deployment frequency, lead time for changes, change failure rate, and time to recovery), the SPACE framework (Satisfaction and well-being, Performance, Activity, Communication and collaboration, Efficiency and flow), and services like DX and CodeClimate are incredibly helpful and move the industry forward. I’m bullish on the future of dev tooling and, full disclosure, I have a financial interest in DX the company. Despite their work, this tweet captures the status quo:
At GitHub we tried building a GitHub metrics product... it failed to gain adoption even amongst our internal teams because the metrics weren't actually useful.
— Abi Noda (@abinoda) April 26, 2022
Crazy how much $$$ is being spent on these types of tools just because leaders are desperate to measure *something*.
This tweet is so damning for the current state of dev productivity. A premier company that makes beloved products for developers, with smart and enthusiastic employees eager to improve DX, and whose platform contains a wealth of relevant data, couldn’t figure out how to make DX metrics useful for their own dev teams.
Is it simply impossible? Maybe. But I’m not ready to throw in the towel. To be sure, I don’t know what the new paradigm for DX will be — no silver bullets here — but I think I can describe a vision of what a useful paradigm would look like, and some strategies to work towards it.
Analogy: Core Web Vitals
Let’s examine an analogous domain: front-end browser performance recently went through just such a simplifying paradigm shift with Core Web Vitals.
Before Core Web Vitals, teams working on browser performance had a dizzying number of metrics to drive, some of which could be considered “vanity metrics” like payload size or number of backend requests. (I wrote about Glossier’s journey here.)
I see a few key features that enabled Core Web Vitals to revolutionize browser performance:
1. Metrics are easy to measure and comparable across organizations
Web Vitals focuses on 3 (admittedly complicated) metrics: Largest Contentful Paint, First Input Delay or Total Blocking Time, and Cumulative Layout Shift.
The first question a developer might ask is “how does my site perform on these metrics?”
It’s so easy to find out! Use Chrome DevTools and Lighthouse to analyze any page, or use Google’s PageSpeed Insights. Because there are consistent APIs to gather these performance metrics, analytics products like Datadog and Calibre can incorporate them into reports and dashboards.
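To make “consistent APIs” concrete, here’s a minimal sketch of in-field collection using Google’s open-source web-vitals package. The /analytics endpoint is a placeholder; swap in whatever reporting backend you use.

```ts
// Minimal sketch: collecting Core Web Vitals in the browser with the
// `web-vitals` package (github.com/GoogleChrome/web-vitals).
import { onCLS, onFID, onLCP } from 'web-vitals';

function report(metric: { name: string; value: number; id: string }) {
  const body = JSON.stringify({ name: metric.name, value: metric.value, id: metric.id });
  // `/analytics` is a placeholder endpoint. sendBeacon survives page
  // unload; fall back to fetch with keepalive if it's unavailable.
  if (!navigator.sendBeacon?.('/analytics', body)) {
    fetch('/analytics', { body, method: 'POST', keepalive: true });
  }
}

onCLS(report); // Cumulative Layout Shift
onFID(report); // First Input Delay
onLCP(report); // Largest Contentful Paint
```

Every site reporting through the same browser API is precisely what makes the numbers comparable across organizations.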
Compare this to, say, the DORA metrics and SPACE. Figuring out how to gather the data is an immediate obstacle. How do I design a survey and write good questions? Some metrics are difficult and nuanced to define, like what constitutes a ‘failure’ for change failure rate (see the sketch below). There’s little consistent tooling for how to measure these and make comparisons across organizations.
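To illustrate the definitional problem, here’s a hypothetical sketch; every type and field name is invented for illustration. The arithmetic behind two DORA metrics is trivial, but the judgment call buried in isFailure is not, and each org makes it differently.

```ts
// Hypothetical sketch: computing two DORA metrics from deploy records.
// All names here are invented for illustration.
interface Deploy {
  deployedAt: Date;
  rolledBack: boolean;
  causedIncident: boolean; // who decides this, and how?
}

// Is a rollback a failure? A hotfix? A feature flag flipped off?
// Different answers make cross-org comparisons apples-to-oranges.
const isFailure = (d: Deploy): boolean => d.rolledBack || d.causedIncident;

function deployFrequencyPerDay(deploys: Deploy[], days: number): number {
  return deploys.length / days;
}

function changeFailureRate(deploys: Deploy[]): number {
  if (deploys.length === 0) return 0;
  return deploys.filter(isFailure).length / deploys.length;
}
```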
2. Metrics are highly correlated with successful outcomes
Imagine yourself in an adversarial mindset and try to improve a metric without improving the product. If it’s easy to do so, you’ve got a vanity metric that clever folks will inevitably game to their narrow advantage. Good metrics like Web Vitals are generally hard to trick, avoiding the moral hazard of vanity metrics.
But some DX activity metrics, like deploys per day or tickets closed, are fairly easy to game. Focusing on them can incentivize busywork over quality product improvements.
3. Compel organizations to report their data
In 2018, Google compelled websites to invest in browser performance by making page speed a factor in search and ad rankings. Marketing departments were suddenly more supportive of projects to improve page speed because it directly affected the performance of an important channel.
Switching to the finance world, VC-backed and public companies are required to produce audited statements about their income, cash flow, and balance sheet. Investors and the SEC compel company managers to produce these documents because they’re so valuable to analyze a company’s performance.
Currently, developer productivity has no such compelling pressure. It feels like a mostly insular discussion among software leaders and researchers: which aspects of the developer experience are integral or incidental, how to measure the experience, and how to improve it. We usually find ourselves trying to persuade others to invest in DX with the goal of accelerating product delivery, improving quality, and saving money.
Imagine how rapidly our DX research and practices could improve if there were such external pressure.
For example, corporate boards and investors could ask for employee survey data to show that the company’s workforce is well-supported and engaged. Engineers could self-report productivity data to aggregation services like Glassdoor and Levels.fyi, and high-functioning teams would be rewarded with more and better applicants. At first, only high-functioning teams would report. But soon, not reporting would be seen as a negative signal for potential hires, creating a virtuous cycle for companies to both share and improve their DX.
Prospective employees could better make apples-to-apples comparisons across teams with public quantitative data rather than semi-public qualitative reviews and backchannel advice.
In the language of Meadows’s Leverage Points, this would be changing “the goals of the system”. DX improvements wouldn’t just improve and accelerate product development; they would prove to investors and future team members that the team is managed effectively.
To be sure, I don’t know what data to report. My hunch is that publicly reporting anything, even initial vanity metrics, will better incentivize and accelerate the discovery of more effective metrics and survey questions. Examples of analogous evolving standards are GAAP for financial statements and the CIS Controls in the security community.
In summary, I think we’ll be dramatically better at understanding developer productivity when:
- We have productivity metrics that are easy to measure consistently across orgs and tech stacks.
- We have a simple way to connect those metrics to positive business outcomes.
- We leverage external stakeholders to compel our organization to report on developer productivity (like hiring / review sites and investors).