The shift towards distributed/remote is causing many companies to revisit their core HR programs with the realization that simply “doing what we did before, but virtually” is not a long-term solution. In particular, engaging, developing and retaining top talent has been top of mind for many leaders since this shift unlocks new opportunities for these individuals, creating a critical “shields down” moment.
Luckily, this is another area where many companies can (should?) take a page out of GitLab’s handbook (pun intended) and consider implementing a CEO shadow program for their top talent:
In typical GitLab fashion, the link above describes their CEO Shadow Program with full transparency and an almost-excruciating level of detail, resulting in an 8,500-word document. Below, I'll do my best to highlight the key design elements worth considering when implementing a similar program elsewhere, using 95% fewer words.
The goal of the program is to help participants globally optimize their work by gaining a deeper understanding of how GitLab works, and what it aims to accomplish, as well as building cross-functional relationships with other members of the program’s cohort. For the CEO, it’s an opportunity to build a personal relationship with team members across the company and learn about challenges and opportunities through their unique perspective.
Eligibility & Application
In addition to a baseline tenure requirement (at least 1 month, preferably 3), the eligibility criteria aim the program at employees on the managerial track with large spans of control, senior individual contributors, cultural leaders, and under-represented groups (both minorities and geographies). The application process is driven by the employee: requesting a particular slot on the schedule, highlighting how they meet the eligibility criteria, and providing confirmation from their manager.
The program runs on an ongoing basis with a few rare exceptions (CEO PTO, etc.). Participants shadow the CEO for two weeks with a one-week overlap: during the first week a participant is trained by the outgoing person, and during the second week, they train the incoming person. This is a good place to highlight the extra consideration that GitLab gives to parents: designating specific “parent-friendly” slots in which full-week participation is not required or the weeks are non-consecutive, and paying for childcare while parents participate in the program.
Participants are expected to prepare for their time in the shadow program by connecting with co-shadows and program alumni, preparing a formal onboarding plan (from a template), and getting up to speed on the work currently in flight (projects and calendar) and on the CEO himself.
The format is pretty straightforward. Participants are expected to perform a set of small administrative/operational/documentation tasks that can be completed during their time in the program, and to be part of almost any conversation the CEO is having, with a very small set of explicit exceptions. Detailed guidance is available on everything from what to wear, through how to present yourself and act during meetings, to how to navigate the CEO’s home office (aka “Mission Control”). Of note is the awesome expectation to “speak up when the CEO displays flawed behavior”, which probably merits a post of its own. The specificity in both describing these behaviors and validating/inviting the ways you can respond to them is truly inspiring.
A CEO shadow program can be a phenomenal opportunity to help retain top talent. If you’re considering starting such a program in your own organization, the GitLab handbook page is a comprehensive jumping-off point that can help ensure you’ve covered all your bases. You can refine, modify, and iterate from there.
Tackling undiscussables, a set of issues that are holding the team back but that the team is reluctant to discuss, is a good example of taking an interpersonal risk in support of the greater benefit of the team. The reluctance to discuss them often stems from fear that doing so will sap the team’s energy, surface unresolvable issues, or expose the person to blame for the part they played in creating the issue. In fact, tackling the undiscussables often brings relief, boosts the team’s energy, and generates goodwill.
Toegel and Barsoux offer a taxonomy with 4 types of undiscussables, differing in their source, the way they should be approached, and the sequence in which they should be tackled:
You THINK but dare not say — risky questions, suggestions, and criticisms that are self-censored out of fear of the consequences of speaking. Often due to past erratic or uncharitable responses from team members. Beginning the fix: leaders can explicitly acknowledge they may unwittingly have created a climate of fear or uncertainty, invite discussion about sensitive issues, draw out concerns, promise immunity to those who share dissenting views, and lighten the weight of their authority in the room.
You SAY but don’t mean — spoken untruths. Discrepancies between what the team says it believes or finds important, and how it behaves. These issues are often left undiscussed not based on fear as much as on an unquestioned and distorted sense of loyalty to the team, its leader, or the organization. The intent is to maintain the team’s cohesion, even if that cohesion is based on a shared illusion. Beginning the fix: leaders need to make the hypocrisy of saying but not meaning explicit and acknowledge their part in the charade. Collecting anonymous examples of empty proclamations (“We say we want to…, but in fact, we….”) and challenging the overprotective mindset that inhibits the airing of criticism can kick-start the fixing process.
You FEEL but can’t name — negative feelings that are difficult for team members to label or express constructively; team members often fail to see the difference between manifesting one’s anger or resentment and discussing it. At a more basic level, these feelings are not discussed because the antagonists experiencing them don’t test their inferences. Based on their own worldviews and self-protective instincts, they presume they know why the other party is acting in a particular way and let that drive their behavior, leading to escalating tensions. Beginning the fix: help the feuding parties investigate the differences — in personality, experience, and identity — that sustain and fuel their apparent incompatibilities, rather than ignore the feud and the negative emotions associated with it. Enable them to share their experience while staying on their side of the net.
You DO but don’t realize — collectively held unconscious behaviors, such as instinctively developed defensive routines to cope with anxiety. Beginning the fix: Warped interaction patterns may be readily discernible to outsiders. A trusted adviser or an external facilitator can be invited to observe the team and give feedback on their communication habits through humble inquiry.
Toegel and Barsoux recommend tackling the “SAY but don’t mean” undiscussables first, since the gap is between two elements that are visible to all team members: the things we say and the things we do. They recommend leaving the “DO but don’t realize” undiscussables for last, as those require outside intervention, which is predicated on enough internal goodwill having been built by tackling the other undiscussables first.
Lastly, Toegel and Barsoux offer a lightweight diagnostic tool, to help identify what type of undiscussables may be present in a certain team, based on the most common symptoms and team patterns:
Basecamp’s new email client is a tour de force of the “jobs to be done” approach in all aspects but one
I rarely cover specific products in this publication, but decided to make an exception this time for the following reasons:
It’s a collaboration product, and collaboration is a core organizational need.
It’s quite transformative in the way it approaches some big challenges with the existing solution (email).
Many of the issues it’s addressing and the approaches that it’s taking to solving them apply to broader collaboration mediums that extend beyond email.
You can check out the feature-by-feature overview here or watch this tutorial video. I’ve decided to take a stab at organizing the features by the challenges they are addressing.
Screen and triage
Emails from new senders arrive at a “screen queue” where they can either be rejected or accepted and triaged to one of a few “work queues”:
Reply later — non-urgent emails that require a response.
The feed — newsletters and other non-urgent recurring communications.
Set aside — short-term reference: information about an upcoming event/meeting, document that needs to be reviewed, etc.
Paper trail — long-term reference: receipts, reservation confirmations, etc.
Read and respond
Each work queue supports a slightly different workflow for handling the emails in it:
Imbox — split between “new for you” (unread) and “already seen”. A “read together” feature opens all unread emails in a single screen, enabling batch review of all of them at once.
Reply later — offers a “focus and reply” feature, with similar UX to “read together” but adding a reply box to the side of each email.
The feed — a news feed view with the most recent email at the top offering a preview of each email and an ability to expand (“show more”) the whole email.
Set aside and Paper trail — a Pinterest-board view with a visual digest of each email.
In addition, email threads offer the following features, which affect only the particular user (not all thread recipients):
Rename the subject line.
Merge threads of the same topic.
Notify when new responses are added to the thread or when an email is received from a specific sender (notifications are off by default).
Clip (save for later) a portion of the email.
Add a “note to self” to the thread — a “response” that only the user can see.
Add a “sticky note” to the thread that shows up under the subject line in the various work queues.
Send large files.
Unfollow (mute) — feature parity with existing solutions.
Labels (tags) — feature parity with existing solutions.
Finally, recognizing that email is also used as an archiving system, HEY added the following features:
“Paper trail” queue (discussed above).
List of all the content clipped from emails.
Contact view showing all correspondence that involved that contact, surfacing files separately.
There’s also a neat privacy feature that proxies all images, preventing unauthorized data collection via tracking pixels.
There is, however, one area where HEY misses the mark and fails to extend the “jobs to be done” approach into a critical element of the user experience: migrating work into the service. It is unlikely that users will rush to leave their native email address and immediately start using their @hey.com address. For many of us, our email address serves as a better unique identifier than our physical address. We change apartments more frequently than we change our email addresses, and therefore this change carries non-trivial switching costs. While HEY does support auto-forwarding of email from your native email address, it does not (as of the writing of these lines) support defining a different “reply from” email address. Therefore, email responses from HEY will go out from the @hey.com address, creating a confusing and discontinuous experience for the email recipient.
In sum, HEY creates a transformational email experience by acknowledging three deep truths and building them into the user experience:
Different emails, different use cases — users engage with different types of emails in different ways. We engage with an email from a good friend, a newsletter, and last month’s internet subscription receipt differently. Therefore, the email client should support different workflows.
Different users, different preferences — traditional email forces complete symmetry in the way two people view the same email thread. HEY breaks that symmetry and allows users to modify and annotate the threads in a way that makes sense to them.
Communicate AND document — building on the framework I covered here, while email is primarily used to communicate, it’s a sufficiently good system of record for documenting/archiving some critical content. Supporting that secondary functionality needs to go beyond good indexing and a search box.
If we zoom out, these truths apply to other non-email collaboration tools such as chat and discussion boards as well. Therefore, considering similar approaches to addressing them in those other mediums opens some thought-provoking opportunities.
One of the beautiful things about HEY is that it’s a “dumb” email client from a technical standpoint. There’s no AI/Machine Learning involved in any of the features I listed above. It screens the emails you tell it to screen, it triages emails to where you tell it to put them, it notifies you about emails you tell it you want to be notified about.
On the one hand, it’s an incredible testament to how far just deeply understanding what your customers are trying to do with your product can take you.
On the other hand, it may also be HEY’s biggest business risk. A HEY subscription currently costs $100/yr. But if HEY starts getting serious traction, how hard would it be for Gmail to catch up?
Through a not-the-most-scientifically-rigorous method (a survey on Twitter), Webber collected ~150 responses about different communities of practice, capturing information about community size, number of leaders, and frequency of meetings. The results are captured in the following graphs:
Business communities of practice mimic natural social communities, sharing a similar fractal distribution of size. The intuition of drawing on insights from natural social communities when addressing issues in business organizations has existed for a while, and this provides some additional evidence for the validity of that analogy.
There’s a notable community threshold at about 40 participants. Below that threshold, it’s more likely to see purely democratic (leaderless) communities that meet fairly infrequently (monthly or less often). Above that threshold, it’s more likely to see more definitive “leader” roles, and meeting frequency increases substantially.
The 40-person threshold was interesting to me, as it’s notably smaller than the Dunbar number, estimated to be around 150. While it’s not discussed in the article, I’d hypothesize that the looser nature of business communities leads to the need to introduce structure sooner (at a smaller scale) in order to maintain them.
The other piece I want to cover today is by FabRiders titled:
It notes the recurring conflation of the terms “community” and “network”, and posits that the short lifespan of many efforts to create self-sustaining peer expertise exchanges is a result of unrealistic expectations that those networks become real communities.
After providing a few “classical” formal definitions of community, they make a minimally compelling case for “network“ being a better framework for thinking about these groups than “community” arguing that:
We should not have expectations that a group of people coming together to share expertise will form a community, and in particular, that it will become self-sustaining… [peer expertise networks] provide an ability to establish connections that can deliver knowledge sharing in ways that strengthen the communities we are aiming to serve.
While there’s definitely some merit in this argument, since the knowledge gained through such a group is often used elsewhere, earlier in the post Paul James’s definition of a community is also covered:
It is a group of people who are connected by durable relations that extend beyond immediate genealogical ties, and who mutually define that relationship as important to their social identity and practice.
For many of us, our professional identity plays a big role in our overall sense of identity. While the criticism of it playing an outsized role is mostly justified, the activity that we spend such a large portion of our waking hours engaged in should play a meaningful role in our identity.
So while other references to “communities”, for example around certain product brands, easily fail both tests, the professional community seems more complex.
The distinction between a peer expertise network and a community is helpful, and I wish that the authors had provided clearer definitions of each. Yet as the article does point out, regardless of how the group is labeled, a clear and focused group purpose will be essential to its success.
Psychological safety is one of the hottest terms in the People field in recent years, yet there’s still a lot of ambiguity about what it means and how to create it. Shane Snow took a good stab at advancing this conversation in:
Snow starts off with Edmondson’s definition: “a shared belief held by members of a team that the team is safe for interpersonal risk-taking.”
A big chunk of the ambiguity around psychological safety stems from the various ways in which “safe for interpersonal risk-taking” can be interpreted, so he offers two powerful distinctions to reduce some of it:
Safety is not comfort (and discomfort is not danger) — you can be safe and uncomfortable. As a matter of fact, those are the required conditions for growth experiences. He illustrates this using the 2×2 above and offers a simple example of working with a personal trainer at the gym: you’re safe, but uncomfortable. Conflating the two terms leads to an overly broad definition of safety, which reduces psychological safety: you view others’ disagreement with you as risking your safety, and/or are afraid to voice your disagreement in order not to jeopardize the safety of others. He references Haidt and Lukianoff’s work, which discusses at length the downsides of mistaking cognitive friction for violence.
Not all interpersonal risk-taking is good — interpersonal risk-taking for its own sake is not helpful. Intentionally not delivering on a commitment, or shouting down someone who says something uncomfortable, requires taking an interpersonal risk, but the risk is not taken in support of the overall benefit of the group, so it’s not helpful. Suggesting a new idea, or voicing your disagreement, also requires taking an interpersonal risk. But that risk is taken in support of the overall benefit of the group.
From there Snow goes to explore the relationship between psychological safety and trust:
I’m not sure I’m bought into this distinction, where trust is an attribute of the relationship between two people and psychological safety is an attribute of the relationships across the whole group, since the group-level relationship is the sum of the relationships between every two people in the group. However, exploring that analogy does lead him from what psychological safety is to how it gets created, and to the critical role that a benevolent or charitable disposition plays in that process.
But as we think about the behaviors that suggest high psychological safety, a charitable disposition seems insufficient. Here, I want to add to and extend Snow’s work; I think the hint for the missing ingredient can be found in Google’s definition of psychological safety: “team members feel safe to take risks and be vulnerable in front of each other”. A charitable disposition gives you guidance on how to respond to others, but doesn’t provide much guidance on how to engage/show up yourself. This is where vulnerability and the importance of personal disclosures come in.
If I were to sum up the behavioral guidance for creating psychological safety, it would be: show up vulnerably; respond benevolently and charitably.
Rephrasing some of Snow’s examples, this is how it’d look:
Admit your mistakes; don’t hold others’ mistakes against them personally.
Speak up if you think something is wrong; don’t use others’ speaking up against them.
Ask for help when you need it; support others when they ask for it.
Confess when you’ve changed your mind about something; applaud others’ intellectual humility when they change theirs.
Weigh the best interests of the group when making a decision; trust that others do the same.
This piece turned out to be trickier to write than I initially envisioned, since it required a delicate balancing act of not throwing the baby out with the bathwater: it’s a highly insightful “baby”, and there’s quite a bit of “bathwater”.
Following one of Mark Murphy’s Forbes articles, I came across this more detailed blog post:
The team at Leadership IQ analyzed ~11,300 responses to uniquely designed engagement surveys, exploring the correlation between an outcome engagement metric and two groups of engagement drivers:
Traditional engagement drivers — which focus on the support provided to the employee by their manager and the organization: “my manager recognizes my accomplishments”, “my job responsibilities are clearly defined”, etc.
Self-engagement drivers — 18 outlooks and attitudes over which employees have direct and personal control: “I expect that more good things will happen to me than bad things” (optimism), “I will succeed if I work hard enough” (internal locus of control), etc.
First, the outcome (dependent) metric they chose really resonates with me: “working at my company inspires me to give my best effort”.
While there’s no perfect way to describe engagement, this definition, which focuses on discretionary effort, maps neatly to the “Will” component in Andy Grove’s equation:
If we are gearing our “working on work” efforts towards a specific end-state, a state in which we are all giving our work our best effort fits the bill a lot more cleanly than an eNPS metric (“How likely are you to recommend your company as a place to work?”).
Second, despite some analytical shortcomings (more on that soon), their results provide some compelling evidence that several self-engagement drivers (optimism, internal locus of control, resilience, assertiveness, and meaning) correlate with the engagement metric more strongly than some traditional drivers (receiving recognition, openness to ideas, supervisor trustworthiness, teamwork, clear job responsibilities).
Third, the team does not view the internal outlooks and attitudes as fixed, but rather as elements that can be honed and changed through training and coaching.
Fourth, the team does not advocate for replacing the traditional view of engagement with the self-engagement one. It advocates for taking a holistic approach that addresses both traditional and self-engagement drivers.
There are a few critical gaps in the way the study was conducted that, in aggregate, reduce its overall level of rigor below the threshold required to automatically receive my stamp of approval.
First, there are places where the distinction between traditional drivers and self-engagement drivers gets murky. For example, “I trust my immediate supervisor” is considered a traditional driver that’s purely a factor of the supervisor’s behavior, ignoring the way the employee’s overall trust disposition may mediate that perception.
Second, the decision to look at the correlation between the engagement metric and each one of the drivers in isolation is rather peculiar. A more insightful and rigorous analysis would regress all drivers against the engagement metric, and ideally, perform some factor analysis on drivers ahead of that to address any shared unobserved factors.
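To make the distinction concrete, here's a minimal sketch contrasting the two approaches: per-driver correlations computed in isolation versus a joint ordinary-least-squares regression that estimates each driver's effect while controlling for the others. The data and driver names below are entirely made up for illustration; the study's actual items and responses are not public.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 respondents, three drivers on a 1-7 scale.
# Driver names are illustrative, not taken from the study.
n = 200
optimism = rng.integers(1, 8, n).astype(float)
recognition = rng.integers(1, 8, n).astype(float)
trust = rng.integers(1, 8, n).astype(float)
# Simulated engagement outcome with known (made-up) driver weights.
engagement = (0.5 * optimism + 0.2 * recognition + 0.1 * trust
              + rng.normal(0, 1, n))

# Approach 1 (what the study reports): one isolated correlation per driver.
for name, x in [("optimism", optimism), ("recognition", recognition),
                ("trust", trust)]:
    r = np.corrcoef(x, engagement)[0, 1]
    print(f"r({name}) = {r:.2f}")

# Approach 2 (more rigorous): regress all drivers jointly, so each
# coefficient is estimated while controlling for the other drivers.
X = np.column_stack([np.ones(n), optimism, recognition, trust])
beta, *_ = np.linalg.lstsq(X, engagement, rcond=None)
print("OLS coefficients (intercept, optimism, recognition, trust):", beta)
```

When drivers are correlated with each other (as survey items usually are), the isolated correlations double-count shared variance, while the joint regression attributes it more honestly; a factor analysis step before the regression would go further and collapse drivers that load on the same unobserved factor.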
Third, the partial presentation of the results, discussing only 10 out of about 30 drivers and constructing a narrative in which a specific traditional engagement driver is compared head-to-head against a specific self-engagement driver, raises concerns about cherry-picking results in a way that best supports the overall narrative.
It’s unfortunate that the shortcomings in the analysis limit the insights that can be drawn from the study. Nonetheless, the following can be stated with a high degree of certainty:
Self-engagement drivers have a significant impact on overall engagement outcomes, namely our willingness to exert discretionary effort in doing our work.
Engagement surveys/reflections that leave out self-engagement drivers will reach partial insights that can only drive sub-optimal actions.
The criticality of self-engagement outlooks and attitudes coupled with their plasticity, creates an opportunity to strategically align learning and development efforts with the overall organizational effort to improve engagement.
Since many of you will not have the patience to read through a 68-page paper, below are the key highlights from this study.
The overall premise
The team starts off by framing the challenge: various empirical studies have found that feedback interventions designed to illuminate employees’ blind spots don’t always yield the desired outcomes. Feedback from others is intended to motivate improvement, but it often has a demotivating impact, even on high-performing employees. One meta-analysis found that a third of feedback interventions actually resulted in lower post-feedback performance.
Academic research into the reasons for the mismatch between intention and outcome has centered around a few mediating factors:
Poor design of the instruments/programs — for example, using performance reviews for both compensation changes and developmental purposes, leading to positively skewed feedback.
Poorly executed feedback — the feedback itself is not captured or delivered effectively: the content is garbled or confusing, numerical ratings come without behavioral guidance on how to improve, etc.
Contextual features affecting the feedback — things happening outside the feedback itself, such as the organizational culture or level of trust, or even things outside the organization, such as an external economic environment that distorts the feedback.
While these three factors do offer paths for improving feedback effectiveness, in aggregate there is little evidence that feedback interventions, even those following best practices on these fronts, have systematically led to organization-level benefits.
The team therefore proposes a fourth driver, which they set out to explore: the discrepancies (blind spots) identified by the feedback are themselves demotivating. These discrepancies, the gaps between the way a person views themselves and the way they are reflected in the feedback received from others, represent threats to recipients’ positive self-concept. Because the self-concept is maintained and evolves through interactions with others, feedback recipients will try to avoid these threats by minimizing their collaboration with peers who provided disconfirming feedback, often in ways that lead to a reduction in performance (which heavily relies on effective collaboration).
The theory can be illustrated as follows:
To explore the theory in more detail, the team formulated the following hypotheses:
1. People are likely to perceive disconfirming feedback as more threatening to their self-concepts than feedback that is not disconfirming.
2. People are more likely to eliminate a discretionary relationship with a person providing disconfirming feedback than with a person providing feedback that is not disconfirming.
3. The perceived threat to one’s self-concept mediates the relationship between disconfirming feedback and the elimination of a discretionary relationship.
4. The greater the number of a person’s obligatory reviews that are disconfirming, the greater the negative change in future constraint.
5. Eliminating discretionary relationships with individuals who provided disconfirming feedback is negatively associated with subsequent performance.
6. Decreases in constraint in response to disconfirming feedback from obligatory relationships are negatively associated with subsequent performance.
For simplicity’s sake, I would describe the “decrease in constraint” mentioned in #4 and #6 as a change to the collaboration pattern between the recipient and the individuals who provided the feedback.
The team ran two experiments to test their hypotheses: a field study and a lab experiment.
The team tested hypotheses 2, 4, 5, and 6 using data collected from “a vertically integrated food manufacturing and agribusiness company located in the Western United States”, which, given additional details about the way the organization runs, is revealed to be The Morning Star Company.
Morning Star uses a fluid organizational structure in which, every season, each employee signs a “Colleague Letter of Understanding” (CLOU) with the employees they’ll be collaborating with during that season. These data, which provide insight into the dynamic changes in collaboration patterns across the organization via Organizational Network Analysis techniques, were the critical piece connecting the more standard inputs (feedback data) and outcomes (bonus allocations as a proxy for change in performance).
The team tested hypotheses 1, 2, and 3 through a well-crafted lab experiment.
Filtering for people who value creativity and view themselves as creatives, the team invited participants to perform an online assignment in which they’d be paired with another participant and randomly assigned to be either the writer or the evaluator of the task. However, all participants were assigned to be writers, with the software playing the role of the evaluator.
In the first task, participants were given 5 minutes to write a creative short story of at least 200 words. They were then assigned to one of two conditions, receiving either confirming or disconfirming feedback. They were then presented with the second task: answering 10 trivia questions under time pressure. If both they and their partner answered correctly, they’d be given a bonus payment for their participation.
BUT they were also given a choice: whether to stick with their current partner or be randomly assigned a new one.
Results and Conclusions
The field study found a strong positive relationship between disconfirming feedback and the likelihood that the individual receiving the negative feedback drops the relationship in the subsequent year (Hypothesis 2). It also found that the greater the number of an employee’s non-discretionary reviews that are disconfirming, the lower that employee’s constraint in subsequent periods (Hypothesis 4). The results suggesting that the more individuals engage in dropping discretionary relationships that provided disconfirming reviews, the lower their performance will be in the subsequent year, were not statistically significant, rejecting Hypothesis 5. The team did find evidence that decreases in constraint in response to disconfirming feedback from obligatory relationships were negatively associated with subsequent performance (Hypothesis 6).
The lab experiment showed that participants found the disconfirming feedback more threatening (Hypothesis 1; an average score of 2.7 vs. 1.5 on a 7-point scale) and were more likely to switch partners for the second task (Hypothesis 2; 30% vs. 9%). Finally, the perceived threat mediated the relationship between disconfirming feedback and the elimination of the discretionary relationship (Hypothesis 3).
The team articulates the conclusion from their study succinctly:
Feedback processes are nearly ubiquitous in modern organizations. Managers employ these processes naively, assuming employees will respond to them with dutiful efforts to improve. But we find that disconfirming feedback shakes the foundation of a core aspect of employees’ self-concept, causing them to respond by reshaping their networks in order to shore up their professional identity and salvage their self-concept. This reshaping of employee networks contributes to lowered performance — a result ironically at odds with the ultimate goal of performance feedback. Our research offers an expanded view of social capital in interpersonal settings, and suggests that organizations must find ways to fulfill employees’ need for a socially bolstered self-concept — that developmental feedback in the absence of this self-confirmation offers little hope for improving performance outcomes.
For the more textually inclined, full transcript below:
What makes for good leadership? That’s a question that’s in the air right now, and rightly so. I’ve got some thoughts that have been running through my mind lately, that I’d like to share. There are more nuanced ways to talk about this, but this is not the time for nuance. I think the kind of leader that you are or that I am has everything to do with the way we envision the ship or the vessel that we’re leading. And we might not have thought of this until now, but it is time, right now, that we do.
Let’s start with the word itself, leader-ship. There’s “leader” and there’s “ship”. In my view, the leader dimension has two elements: direction and connection. However he or she may come to it, a good leader offers guidance or direction for some kind of group journey. He or she points the way forward. “Let’s go this way! This would be a good way to go.” Then, there’s the connection part. A leader acts towards others and encourages people to act towards each other in a way he or she believes would make their journey go better. So while we’re going — let’s relate to each other a certain way, let’s connect with each other a certain way. Our journey will go better if we do, and I’m going to do my best to demonstrate that in my behavior. So that’s the “leader” side of leadership. It’s about direction and connection.
But what about the “ship” side of leadership? This is where it gets really interesting to me. To exercise leadership suggests that someone gives direction, etc. aboard some kind of ship, some kind of vessel. They are the leader of a ship. And when I think “ship”, I think of some kind of water-borne vessel, don’t you? You know, a ship.
Do we envision ourselves as leading a kind of cruise ship, where some people on board, the more privileged ones, can pretty much do whatever they want, whenever they want to do it? And the others are there mainly to serve them. Or do we envision our vessel as more like a small boat? A small boat where each person’s actions directly affect the others on the boat. And also impacts the stability and the seaworthiness of the boat itself.
I’ll leave it to your imagination to extend this metaphor further. And to apply it in some way if you think it’s useful. But this distinction between leadership according to the principles of a cruise line, say the Titanic, perhaps. And leadership according to the principles of a smaller boat, seems relevant to me. And I invite you to consider what makes more sense now. What fits better with the reality of the world as we’re experiencing and seeing it. Perhaps, with fresh eyes, right now.
Setting aside the irksome word-play (leader-ship) and my qualms with the “leader” definition, I find the boat metaphor quite compelling. The cruise ship, in particular, seems to capture many of the ills that sometimes plague large organizations, beyond the leisurely purpose of the journey itself… Specifically, the stratification of membership into two classes: staff and passengers. Though in organizations it is often the “staff” who are the privileged ones: ignoring what the “passengers” can contribute to steering the boat towards its destination, and optimizing solely for the “passengers’” satisfaction/happiness. Often this is a byproduct of failing to evolve “employees as users/customers” from metaphor to analogy.
As Fleming suggests, this is a fun one to play around with, reconciling contradictions, making distinctions, and drawing insights.
In the months following my most recent post about performance, I’ve been noodling on one key aspect of the challenge: if we take a more outcome (rather than output)-based approach to evaluating performance, how do we separate outcomes caused by luck and outcomes caused by skill?
I started off by reading Annie Duke’s “Thinking in Bets”, which had been sitting in my queue and had received good feedback from colleagues. I finished the book somewhat disappointed: I gained some good insights on pursuing the truth and building a stronger decision-making process, but not a lot that pertained to assessing performance and untangling luck from skill. However, Duke did mention another book, with a rather promising title, by Michael Mauboussin:
Wary of committing another big chunk of time to a book on this topic, I decided to look for more lightweight mediums and found this Talks at Google video for the more audio-visually inclined and this 25iq blog post for the more textually inclined. Both provide good summaries of the major themes covered in the book. There’s a lot there, including plenty of interesting tangents in their own right, but I’ll try to focus on one arc that’s relevant to my own area of inquiry.
Different domains of performance fall on a spectrum between pure luck and pure skill, but all of them have a combination of some luck and some skill.
As a domain evolves, outcomes become more dependent on luck than on skill. That’s not because skill matters less, but because knowledge disseminates more quickly and cheaply, causing skill to be distributed more uniformly. Mauboussin refers to this phenomenon as “The Paradox of Skill”.
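The Paradox of Skill can be illustrated with a small simulation (entirely my own construction, not from the book): model each outcome as skill plus luck, and watch luck’s share of the outcome variance rise as the skill gap between competitors narrows.

```python
import random
from statistics import pvariance

random.seed(42)

def outcome_luck_share(skill_spread, luck_spread=1.0, n=50_000):
    """Estimate the fraction of outcome variance attributable to luck
    when outcome = skill + luck, with independent normal components."""
    lucks = [random.gauss(0, luck_spread) for _ in range(n)]
    skills = [random.gauss(0, skill_spread) for _ in range(n)]
    outcomes = [s + l for s, l in zip(skills, lucks)]
    return pvariance(lucks) / pvariance(outcomes)

# Immature field: skill varies widely between players, so skill dominates.
print(f"wide skill gap:   luck share ~ {outcome_luck_share(2.0):.2f}")   # ~0.20
# Mature field: knowledge has spread and skill levels converge, so luck dominates.
print(f"narrow skill gap: luck share ~ {outcome_luck_share(0.3):.2f}")   # ~0.92
```

Note that the absolute amount of luck never changes between the two runs; only the spread of skill does, which is exactly the paradox.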
The strategy for “how to get better” also varies depending on where the performance domain falls on the luck-skill spectrum. The closer it is to the skill edge, the better a deliberate-practice strategy will work. The closer it is to the luck edge, the more the emphasis needs to be on a strong decision-making process. The latter helps frame Duke’s book more clearly: since poker falls closer to the luck edge of the spectrum, its heavy emphasis on the decision-making process makes a lot of sense.
It’s worth noting, however, that this point does not seem to be corroborated with a lot of evidence, at least in the resources that I reviewed (it may be treated differently in the book).
Mauboussin offers the following criteria for evaluating the process:
Analytical — finding an edge and figuring out how much to bet on it.
Behavioral — understanding the common biases we all tend to fall for, and weaving methods into the process to mitigate and manage them.
Organizational — avoiding “agency costs” (misalignment of incentives). Is the organization helping or impeding the quality of the decision?
So where does all of this leave us with regard to performance management?
It supports the claim that variance in outcomes may have more to do with luck than skill.
This gets compounded in more mature domains where the “Paradox of Skill” is in full effect.
It supports the shift from focusing on the outcome to focusing on the process when evaluating performance.
Did it solve our problem? No. Did it get us closer to a solution? Yes. Baby steps…
Parabol’s team has been fully remote for the last 5 years, so while many organizations had to transition to hiring remotely relatively recently, the Parabol team already has a few reps under its belt, and it’s great to learn from their experience.
Jordan is an incredibly sharp thinker, so I’d highly recommend reading his post in its entirety to fully benefit from his deep observations. Below, I’ll only outline Parabol’s hiring process at a high level and offer my perspective on it.
1. Application
The application process is extremely lightweight: contact info, work eligibility, and relevant materials that the candidate thinks attest well to their fit for the role. Note that a resume is not required (but is an option), which I’m a big fan of, as resumes are often bad predictors of fit. The one tweak I’d suggest here would be a fast-track option (quicker application review time) that requires completing a short assignment, demonstrating deeper interest from the candidate.
2. Optional pre-screen
Throughout the process, there’s an intentional effort to not waste either the candidate’s or the team’s time and this is a good example of that. The outcome of reviewing the application doesn’t have to be a definitive pass/fail. If the outcome of the review is inconclusive, the team simply emails the candidate asking a specific question or requesting additional information, rather than forcing a definitive, suboptimal outcome — passing on a candidate who had a shot or wasting time with a borderline candidate.
3. Phone screen
A 30-min phone call (sometimes shorter) where the agenda is optimized to reject a candidate who’s not a fit as quickly as possible, by asking the biggest question first. Parabol’s “big question” is very straightforward:
Compared to your previous roles, what would you like to do more of and less of in your next role? And why does Parabol feel like a good fit for you?
However, it packs a lot of insights, allowing the team to get a rough assessment of the candidate’s self-awareness, motivation, alignment of interests, excitement about the opportunity, and level of verbal communication skills.
At the end of the screen, the baton for driving the process forward is passed to the candidate. If they’d like to move forward, they’re asked to send the team an email with any questions that they didn’t get answered today and want answered as part of upcoming conversations.
I LOVE this little tweak! Not only does it give the team a strong signal on the candidate’s level of interest in the role and avoid wasting time with candidates who would show up to the interview day just because they were invited, it is also, perhaps more importantly, a deeply empathetic way to connect with the candidate, acknowledge that this is a two-way evaluation process, and, in a small way, allow them to co-design the remainder of the process to fit their needs.
4. Skills assessment: 2 months, 2 weeks
A 30–60 minute session in which candidates are asked to look critically at Parabol data and ask questions in order to create their own onboarding plan and scope out about 2 months of work.
Towards the end of the session, they are given a take-home assignment (that’s emailed back to the team once complete) in which they are asked to distill the plan down to:
The 3–5 things they’d like to get done in the first 2 months.
The 3–5 things they’d like to get done in the first 2 weeks.
I’m a big supporter of the overall approach of avoiding brain teasers and various whiteboarding exercises for assessing skills. However, there’s some nuance that’s not fully captured in the description of this step that may or may not cause it to introduce bias of its own.
Most of us are pretty bad at engaging with out-of-context hypothetical scenarios: thinking about how we’d act in a situation we’ve never been in before, or how we’d solve a problem we’ve never solved before. This gets compounded if we have to do that thinking “on our feet”, without time to fully digest the new situation and pattern-match it to a challenge we have been in before.
The “live” portion of the exercise outlined above runs that risk, though it can be mitigated by teeing up the conversation and sharing the data ahead of time. Recording the interview and making the recording available for the take-home assignment, as well as ensuring that follow-up questions are encouraged, can further mitigate some biases.
Personally, I’d still couple this exercise with a deep dive on a recent project that the candidate was involved with/led. Hearing the candidate truly in their element, speaking about something that they’re an expert on (their own experience) can be a good counterbalance for some of the challenges with the hypothetical exercise.
5. Cultural assessment
This is a 60-minute group session (a member from each team is present) aimed at assessing the candidate’s alignment with Parabol’s 3 core values: transparency, empathy, and experimentation.
The format uses “tell me about a time…” questions (“Can you think of a time when you last lost your cool?”) and follow-up questions to explore deeper (“if we were to ask the other person what their version of this story would be, what would it sound like?”).
The laser focus on values alignment, rather than broad and fuzzy “culture fit” is fantastic.
However, the method, as Jordan points out himself, is imperfect in ways that go beyond needing to be mindful that “absence of evidence is not evidence of absence”. “Tell me about a time” questions suffer from the same retrieval/out-of-context challenges as hypotheticals. I don’t keep a running list of times that I lost my cool in my head, and it may be difficult for me to think of one on the spot. Yet that has little to do with my actual alignment with the company’s values. “Tell me about a time” questions run the risk of assessing preparedness for answering the particular question more than the essence of the response itself.
An alternative approach would be similar to the one outlined in the previous section: asking broader experience questions and zooming in from there. For example: what did you like/dislike the most in your previous role? What were your greatest strengths/areas of growth in that role? What would your manager say if we asked them? What was your proudest achievement? These are not without faults of their own, but better than “tell me about a time” questions, in my opinion.
6. Contract-to-hire “batting practice”
Rather than forcing the team towards a “hire/don’t hire” decision that is expensive to reverse, after the cultural assessment the team answers a different question, consistent with their experimentation/safe-to-try value:
Do we want to put some of our company’s money and more of our team’s time to try working alongside this candidate?
A 20-hour task is picked, often from the onboarding plan the candidate created in the skills assessment interview, and the candidate is extended a 2–4 week part-time contract to complete it, depending on their availability. At the end of the project, the candidate reviews the deliverable with the team and they conduct a shared retrospective, after which the team makes a unanimous decision on whether to extend a full-time offer to the candidate.
Conceptually, I’m a big supporter of this type of contract-to-hire assessment as a way to give both parties a better feel for what it would be like to work together. Practically, it can be a challenging commitment for many candidates with existing full-time jobs and family obligations.
My only other hope is that the team is embodying the “safe enough to try” value in their final decision as well, looking for consent, rather than consensus on that final decision.
Taking a step back, the Parabol process is a great blueprint for a highly effective remote hiring process. I’ve outlined the tweaks that I’d make to make it even better; you should consider your own. The one big thing that I would have liked to see more of is carving out time for the candidate to assess the company, not just for the company to assess the candidate. While I didn’t see it listed in the post, one way to go about it, which I still credit in my head to Jordan, is to have one of the interviews be a session in which the candidate explicitly interviews an employee of the company, rather than the other way around. I’ll let this be my parting thought for this post.