Can we make faster progress by measuring less?
This past week, I got to do something that brought me great joy: integrating and combining several old blog posts in a novel way that led to a new insight.
In my 2020 wrap-up post, I highlighted “building human connection” as an area of interest of mine for 2021, clarifying further that:
I slot some of the more interesting challenges of distributed work into this category and the maturing DEI space which is finally generating some balanced, evidence-based approaches and practices.
The second half of that sentence is the topic for today. I want to argue that if you are a small-to-midsize company (say, under 1,000 employees) keen on advancing DEI initiatives, you are probably trying to measure too much rather than too little. This inclination to make progress through measurement is deeply encoded in the modern business ethos, and further amplified by imitating larger companies, where quantitative measurement is actually a useful tool. I’ve written more about this broader pattern in ending the tyranny of the measurable — this is just a more specific application.
Allow me to illustrate with an example: suppose a company wants to improve the diversity of its hiring pipeline and reduce bias in its hiring process. Its first inclination would be to analyze its recruiting funnel and see whether different segments of candidates are treated differently.
In theory, that makes total sense. But then reality comes in and bursts our bubble. Legally, candidates cannot be required to provide segment data (gender, race, etc.) as part of their application process, leaving us with data that’s not just partial but also biased. There’s selection bias in who opted to volunteer this information (there’s also selection bias in who opted to apply to our open role to begin with). Even if we figured out how to overcome these challenges, we would run into the “small N” challenge. Because not many people go through our recruiting process, even if our analysis yields extreme or seemingly different results, they’re not likely to be statistically significant (more on this in surveys: exploring statistical significance). Our analysis is likely to suggest bias in the process even when there isn’t any, or to find no evidence of bias even when bias exists.
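To make the “small N” point concrete, here’s a hypothetical sketch in Python (the candidate counts are invented for illustration). Even when one group’s pass rate looks double the other’s — 3 of 10 vs. 6 of 10 advancing past a screen — Fisher’s exact test, a standard significance test for small 2×2 tables, comes back far from significant:

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]],
    computed from the hypergeometric distribution."""
    row1, row2 = a + b, c + d
    col1, n = a + c, a + b + c + d

    def p_table(x):
        # Probability of a table with x in the top-left cell,
        # given the fixed row and column totals
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = p_table(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs + 1e-12)

# Hypothetical funnel: 3/10 of group A vs. 6/10 of group B advanced.
# A 30% vs. 60% pass rate looks alarming, yet:
p = fisher_exact_two_sided(3, 7, 6, 4)
print(round(p, 3))  # → 0.37, nowhere near the conventional 0.05 threshold
```

With samples this small, we simply can’t distinguish a real disparity from noise — and real funnel stages at a sub-1,000-person company often see candidate counts in this range.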
So we’re going to cut through a lot of red tape to get the data, bend over backward to somewhat credibly analyze it, only to come up with results that at best are easy to poke holes in and at worst will be misinterpreted.
There’s got to be a better way. And there is.
I started unpacking an alternative way in if we know in which direction we want to go, does it matter where we are? and a few months later in the score takes care of itself.
All recruiting processes have some bias in them by the sheer fact that humans are involved, and there’s plenty of evidence that humans are biased. If our company thinks that we’re all some special bias-resistant snowflakes, no amount of data and analysis will save us.
The gist of “the score takes care of itself” approach is to hold people accountable for the behaviors that eventually lead to long-term success, rather than for a defined outcome with a fixed time horizon (like a diversity metric, for example). And as outlined in inclusive organizations change their systems, not just train their people, organizations are better off changing their systems to be more bias-resistant. That’s a lot easier than making people more bias-resistant.
Fortunately, again, there’s a substantial body of evidence available to us on how to build bias-resistant processes. The “bias interrupters” model is outlined in the link above, and specifically in the recruiting context, I’ve aggregated additional pieces in inclusive hiring: a short primer.
Back to our example. An alternative approach would be to audit our existing recruiting process against one, or both, of these benchmarks and start implementing a plan to close the gaps. We can be confident that we’re moving in the right direction, since we’re basing our changes on battle-tested interventions.
To wrap up, a few words of caution and important caveats:
First, when picking evidence-based interventions, know how to tell the difference between “popular” and “effective”. There are a lot of popular but ineffective practices out there. “Because Google is doing it” is not a good enough reason to adopt a practice. Do your due diligence.
Second, measurement can be useful when used in the right context and in the right way. I’m not saying “measurement is evil, don’t ever do it”. I’m just inviting you, when quoting Drucker (“you can’t manage what you can’t measure”), to also keep Goodhart in mind (“when a measure becomes a target, it ceases to be a good measure”).