In the months following my most recent post about performance, I’ve been noodling on one key aspect of the challenge: if we take a more outcome (rather than output)-based approach to evaluating performance, how do we separate outcomes caused by luck and outcomes caused by skill?
I started off by reading Annie Duke’s “Thinking in Bets,” which had been sitting in my queue and had received good feedback from colleagues. I finished the book somewhat disappointed: I gained some good insights on pursuing the truth and building a stronger decision-making process, but not much that pertained to assessing performance and untangling luck from skill. However, Duke did mention another book, by Michael Mauboussin, with a rather promising title:
Wary of committing another big chunk of time to a book on this topic, I decided to look for more lightweight formats and was able to find this Talks at Google video for the more audio-visually inclined and this 25iq blog post for the more textually inclined. Both provide good summaries of the major themes covered in the book. There’s a lot there, including interesting tangents in their own right, but I’ll try to focus on one arc that’s relevant to my own area of inquiry.
Different domains of performance fall on a spectrum between pure luck and pure skill, but all of them have a combination of some luck and some skill.
As a domain evolves, outcomes in it become more dependent on luck than on skill. That’s not because skill matters less, but because knowledge dissemination happens more quickly and cheaply, causing skill to be distributed more uniformly. Mauboussin refers to that phenomenon as “The Paradox of Skill”.
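To make the “Paradox of Skill” concrete, here’s a small Monte Carlo sketch (the specific numbers are illustrative choices of mine, not from the book): if we model each outcome as skill plus luck, then as skill becomes more uniformly distributed across competitors, luck’s share of the variance in outcomes grows, even though the amount of luck hasn’t changed at all.

```python
import random
import statistics

random.seed(42)

def variance_share_from_luck(skill_sd, luck_sd, n=100_000):
    """Simulate outcomes = skill + luck and estimate the fraction of
    outcome variance attributable to luck (luck_var / total_var)."""
    outcomes = [random.gauss(0, skill_sd) + random.gauss(0, luck_sd)
                for _ in range(n)]
    total_var = statistics.pvariance(outcomes)
    return (luck_sd ** 2) / total_var

# "Young" domain: skill varies widely across competitors; luck is fixed.
early = variance_share_from_luck(skill_sd=3.0, luck_sd=1.0)

# "Mature" domain: knowledge has spread, so skill is nearly uniform.
mature = variance_share_from_luck(skill_sd=0.5, luck_sd=1.0)

print(f"luck's share of outcome variance, early era:  {early:.0%}")
print(f"luck's share of outcome variance, mature era: {mature:.0%}")
```

With a wide spread of skill, luck explains only about a tenth of the variance in outcomes; once skill converges, the same amount of luck explains the lion’s share. Skill hasn’t stopped mattering; it has stopped differentiating.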
The strategy for “how to get better” also varies depending on where the performance domain falls on the luck-skill spectrum. The closer it is to the skill edge of the spectrum, the better the outcomes a deliberate-practice strategy will yield. The closer it is to the luck edge, the more the emphasis needs to be on a strong decision-making process. The latter helps frame Duke’s book more clearly: since poker falls closer to the luck edge of the spectrum, its heavy emphasis on the decision-making process makes a lot of sense.
It’s worth noting, however, that this point does not seem to be corroborated with a lot of evidence, at least in the resources that I reviewed (it may be treated differently in the book).
Mauboussin offers the following criteria for evaluating the process:
Analytical — finding an edge and figuring out how much to bet on that edge.
Behavioral — understanding the common biases we all tend to fall for, and weaving methods to mitigate and manage them into the process.
Organizational — avoiding “agency costs” (misalignment of incentives). Is the organization helping or impeding the quality of the decision?
So where does all of this leave us with regard to performance management?
It supports the claim that variance in outcomes may have more to do with luck than skill.
This gets compounded in more mature domains where the “Paradox of Skill” is in full effect.
It supports the shift from focusing on the outcome to focusing on the process when evaluating performance.
Did it solve our problem? No. Did it get us closer to a solution? Yes. Baby steps…
Parabol’s team has been fully remote for the last five years, so while many organizations have had to transition to hiring remotely only recently, the Parabol team has a few reps under its belt, and it’s great to learn from their experience.
Jordan is an incredibly sharp thinker, so I’d highly recommend reading his post in its entirety to fully benefit from his deep observations. Below, I’ll only outline Parabol’s hiring process at a high level and offer my perspective on it.
1. Application
The application process is extremely lightweight: contact info, work eligibility, and any materials the candidate thinks attest to their fit for the role. Note that a resume is not required (though it’s an option), which I’m a big fan of, as resumes are often bad predictors of fit. The one tweak I’d offer here would be a fast-track option (quicker application review time) that requires completing a short assignment, demonstrating deeper interest from the candidate.
2. Optional pre-screen
Throughout the process, there’s an intentional effort to not waste either the candidate’s or the team’s time and this is a good example of that. The outcome of reviewing the application doesn’t have to be a definitive pass/fail. If the outcome of the review is inconclusive, the team simply emails the candidate asking a specific question or requesting additional information, rather than forcing a definitive, suboptimal outcome — passing on a candidate who had a shot or wasting time with a borderline candidate.
3. Phone screen
A 30-min phone call (sometimes shorter) where the agenda is optimized to reject a candidate who’s not a fit as quickly as possible by asking the biggest question first. Parabol’s “big question” is very straightforward:
Compared to your previous roles, what would you like to do more of and less of in your next role? And why does Parabol feel like a good fit for you?
However, it packs a lot of insight, allowing the team to get a rough assessment of the candidate’s self-awareness, motivation, alignment of interests, excitement about the opportunity, and verbal communication skills.
At the end of the screen, the baton for driving the process forward is passed to the candidate. If they’d like to move forward, they’re asked to send the team an email with any questions that they didn’t get answered today and want answered as part of upcoming conversations.
I LOVE this little tweak! Not only does it give the team a strong signal on the candidate’s level of interest in the role (and avoid wasting time on candidates who would show up to the interview day just because they were invited), it is also, and perhaps more importantly, a deeply empathetic way to connect with the candidate, acknowledge that this is a two-way evaluation process, and, in a small way, let them co-design the remainder of the process to fit their needs.
4. Skills assessment: 2 months, 2 weeks
A 30–60-minute session in which candidates are asked to look critically at Parabol data and ask questions in order to create their own onboarding plan and scope out about 2 months of work.
Towards the end of the session, they are given a take-home assignment (that’s emailed back to the team once complete) in which they are asked to distill the plan down to:
The 3–5 things they’d like to get done in the first 2 months.
The 3–5 things they’d like to get done in the first 2 weeks.
I’m a big supporter of the overall approach of avoiding brain teasers and various whiteboarding exercises for assessing skills. However, there’s some nuance that’s not fully captured in the description of this step that may or may not cause it to introduce bias of its own.
Most of us are pretty bad at engaging with out-of-context hypothetical scenarios: thinking about how we’d act in a situation we’ve never been in before, or how we’d solve a problem we’ve never solved before. This gets compounded when we have to do that thinking on our feet, without time to fully digest the new situation and pattern-match it to a challenge we have faced before.
The “live” portion of the exercise outlined above runs that risk, though it can be mitigated by teeing up the conversation and sharing the data ahead of time. Recording the interview and making the recording available for the take-home assignment, as well as ensuring that follow-up questions are encouraged, can further mitigate some biases.
Personally, I’d still couple this exercise with a deep dive on a recent project that the candidate was involved with/led. Hearing the candidate truly in their element, speaking about something that they’re an expert on (their own experience) can be a good counterbalance for some of the challenges with the hypothetical exercise.
5. Cultural assessment
This is a 60-minute group session (with a member from each team present) aimed at assessing the candidate’s alignment with Parabol’s three core values: transparency, empathy, and experimentation.
The format uses “tell me about a time…” questions (“Can you think of a time when you last lost your cool?”) and follow-up questions to explore deeper (“if we were to ask the other person what their version of this story would be, what would it sound like?”).
The laser focus on values alignment, rather than broad and fuzzy “culture fit,” is fantastic.
However, the method, as Jordan points out himself, is imperfect in ways that go beyond needing to be mindful that “absence of evidence is not evidence of absence”. “Tell me about a time” questions suffer from the same retrieval/out-of-context challenges as hypotheticals: I don’t keep a running list of times I lost my cool in my head, and it may be difficult for me to think of one on the spot. Yet that has little to do with my actual alignment with the company’s values. Such questions run the risk of assessing preparedness for the particular question more than the essence of the response itself.
An alternative approach would be similar to the one outlined in the previous section: asking broader experience questions and zooming in from there. For example: What did you like/dislike the most in your previous role? What were your greatest strengths/areas of growth in that role? What would your manager say if we asked them? What was your proudest achievement? These are not without faults of their own, but they are better than “tell me about a time” questions, in my opinion.
6. Contract-to-hire “batting practice”
Rather than forcing the team toward a “hire/don’t hire” decision that is expensive to reverse, after the cultural assessment the team answers a different question, consistent with their experimentation/“safe to try” value:
Do we want to put some of our company’s money and more of our team’s time to try working alongside this candidate?
A 20-hour task is picked, often from the onboarding plan the candidate created in the skills-assessment interview, and the candidate is extended a 2–4 week part-time contract (depending on their availability) to complete it. At the end of the project, the candidate reviews the deliverable with the team, they conduct a shared retrospective, and the team then makes a unanimous decision on whether to extend a full-time offer.
Conceptually, I’m a big supporter of this type of contract-to-hire assessment as a way to give both parties a better feel for what it would be like to work together. Practically, it can be a challenging commitment for many candidates with existing full-time jobs and family obligations.
My only other hope is that the team embodies the “safe enough to try” value in the final decision as well, looking for consent rather than consensus.
Taking a step back, the Parabol process is a great blueprint for a highly effective remote hiring process. I’ve outlined the tweaks I’d make to make it even better; you should consider your own. The one big thing I would have liked to see more of is carving out more time for the candidate to assess the company, not just for the company to assess the candidate. While I didn’t see it listed in the post, one way to go about it, which I credit to Jordan, is to have one of the interviews be a session in which the candidate explicitly interviews an employee of the company, rather than the other way around. I’ll let this be my parting thought for this post.
The gist is pretty straightforward. Osman defines conscientiousness:
Conscientious people have a desire to do good work, and are self-motivated to perform well regardless of whether someone is watching over them. They are action-oriented, dutiful, and careful.
He makes a compelling case for why conscientiousness should be an attribute to look for in our hiring process, and then offers a sample set of questions that can help evaluate it.
Osman starts by building on Andy Grove’s framework for “effectiveness”, decomposing it into two main drivers: “skill” and “will”. Skill is decomposed further into a stable, general component — “intelligence” — and a dynamic, specific one — “experience”. The latter can grow over time, with more opportunities to perform the specific task. Similarly, will can be decomposed into a general component — “conscientiousness” — and a specific component — “engagement”. Conscientiousness affects a person’s base level of motivation and how much they care about work, whereas engagement is more context-specific and can vary by the task at hand, the relationship with their manager, the current level of morale, etc. Osman posits that conscientious people may experience times of lower or higher engagement, but as a general rule of thumb, they always care about their work and perform it to the best of their ability.
Finally, and sadly, as somewhat of a disjointed afterthought, Osman highlights the importance of “values alignment”, which he distinguishes from the superficial/erroneous “culture fit”, as an additional hiring criterion but he doesn’t integrate it fully into the framework.
With the full five-attribute criteria in mind (intelligence, experience, conscientiousness, engagement, and values alignment), Osman observes that most strong recruiting processes do a good job evaluating four out of the five attributes but usually do not address conscientiousness. He offers the following questions as jumping-off points for assessing a candidate’s level of conscientiousness:
Ask them to walk you through a past failure — conscientious candidates will often define their failures by their impact on their commitments and will move mountains to avoid (or fix) such failures.
Ask them about a time they weren’t able to meet their commitments — a more specific version of the above question aimed at getting a more nuanced understanding of the way they view their obligations to others.
What motivates them to work, and what does success mean? — Conscientious candidates will have a more outward-facing view on success (impact on others/the company) and can often balance long-term and short-term success, avoiding short-term optimization.
Have them tell you about a time they worked on something they didn’t enjoy — Willingness to do unpleasant work if it’s important to their team or company is a positive sign of conscientiousness.
Look for evidence of side projects or things that go above and beyond.
What triggered them to leave past (or current) jobs, and how did they go about leaving? — Thoughtfulness about what they work on and deliberate regard for transition plans are additional positive signs of conscientiousness.
Personally, I’m not a big fan of out-of-context “tell me about a time when…” questions (#1, #2, and #4), since they often test recall abilities and favor candidates who happened to prepare for the specific question asked. But that can easily be addressed by starting with a broader question like “tell me about your most recent project” and moving into more specific questions while already within that natural, fresh context: What worked well and what didn’t? (#1) Did you have to reset expectations? How? (#2) What parts of the project were unpleasant? (#4)
Since conscientiousness is a Big 5 personality trait, another alternative would be to utilize a scientifically validated method for assessing conscientiousness.
I recently listened to a webinar by the team behind Variance which I found to be highly informative. The first part was an introduction to Product-led Growth (PLG) and Product Qualified Leads (PQLs) which is too far outside of the scope/focus of this publication to cover here but quite interesting for business nerds like myself.
This post focuses on the second part, delivered by Noah Brier and detailed in full here:
There were two highly useful knowledge management nuggets in that section that are worth highlighting:
Nugget #1: Writing is used in the service of four different purposes
Writing to communicate — get ideas across.
Writing to converse — synchronous back-and-forth exchange.
Writing to think — as a way to crystallize and firm up abstract ideas/connections.
Writing to archive/document — to make knowledge explicit and sharable.
It is often the case that writing used to serve one purpose cannot effectively serve a different one. So the next time you’re digging through a long Slack exchange (#2, writing to converse), getting frustrated trying to find that small bit about how to set up the environment variables so the software works correctly (#4, writing to document) — you’ll know why.
Nugget #2: 6 rules of good documentation.
Digging deeper into the fourth purpose, Brier offers the following list as guidance:
Fit for context.
Clearly written and to the point.
Visual where possible.
Skimmable (can easily skip irrelevant sections).
Discoverable.
Tracked.
KM nerds can endlessly debate additions, omissions, and refinements to the list, but I think they’d agree that it’s a pretty great starter list. If your documentation checks the box on those 6 things — you’re in good shape.
I particularly appreciate the inclusion of #5 and #6 on the list, which go beyond how the text is structured to highlight a couple of additional elements that have ended up tripping up many documentation efforts that I’ve seen.
And as a useful double-click on #1 (fit for context), Brier offers an adaptation of a framework developed by Daniele Procida (whose original framing distinguishes tutorials, how-to guides, reference, and explanation), captured in the diagram above, distinguishing between different documentation artifacts depending on whether the documentation is aimed at helping the reader perform an action or understand a concept, and whether consuming the content is self-directed or guided.