Decoupling performance and development

It’s time for another foray into this enjoyable minefield. I’ve been exploring the interesting space of performance and development in multiple posts over the years. More recently, I’ve been noodling on its deeper, philosophical aspects, grappling with questions such as: What is performance? Is the notion of individual performance still relevant in organizations where work is highly collaborative and deeply intertwined? And what is required of someone to fairly assess someone else’s performance?

But today, I’m leaving these deep questions aside and taking a more incremental step, hopefully in the right direction.

Performance reviews/evaluations have been drawing a lot of fire in recent years. The key design flaw in traditional performance review cycles is their tendency to mix up pieces that pertain to a person’s performance and pieces that pertain to their development. The NeuroLeadership Institute has been making a pretty compelling case for why this is a major issue. In a nutshell, there is an inherent trade-off between accurately evaluating performance and accelerating development. When we try to do both in the same program, we do a sub-par job at both.

Yet while the push for change is justified, many organizations have reacted in rather naive ways: from paying lip service and rebranding the same program as “development”, through eliminating the program altogether, to fully replacing it with a developmental program (often a series of coaching conversations).

These responses tend to ignore an important truth: that while development is growing in importance and needs to be deliberately designed for, the original reason for introducing performance evaluations still exists — the need to fairly allocate compensation. Steven Sinofsky, Lori Goler, Janelle Galle, and Adam Grant seem to agree. 

And by fairly allocating compensation, I’m not talking about bonuses or other forms of variable pay which I’m strongly opposed to. I’m talking about the ongoing need for proportionality between a person’s contribution to the org and their compensation, from the extremes (termination, promotion) to the mundane (merit increases). 

Obviously, paying lip service doesn’t solve anything, but neither do the alternatives. Eliminating the program creates a vacuum that’ll most likely lead to a less-fair emergent solution. And a development program will do a terrible job driving fair compensation, just like an evaluation program will do a terrible job driving development.

BOTH performance management AND professional development are critical organizational needs. But they need to be addressed separately using two different programs with some key design differences.

Performance Management Program/Process

  • Purpose: procedural justice in compensation allocation 
  • Driven by the evaluator(s)
  • Feedback is absolute: comparison against an existing bar/benchmark. Ideally, binary (yes/no rather than ratings) evaluation against job level rubrics that span domain mastery, collaborative ability, and company values
  • Evaluates work in the context of the current role

Professional Development Program/Process

  • Purpose: overcoming our human “present bias” (by creating the space for reflecting on the past and envisioning the future)
  • Driven by the individual motivated to develop
  • Feedback is relative: ideally stack-ranking the dimensions/components of performance from the one to focus the most on to the one to focus the least on
  • Considers whether the current role is the best container for development (vs. taking on add’l responsibilities / new role / different company / etc.)

The two programs are mostly decoupled from one another with one caveat: some outcomes of the performance management process constrain the possible pathways for development.

While we should keep looking for better designs to manifest the purpose of each program, we need to keep in mind that both are critical. 


Meeting Modes [da Silva and Bastos]


Paraphrasing the first paragraph of one of my still all-time favorite self-authored posts: the essence of every organization is a synergetic collaborative effort. We deliberately organize because we can create together something better than the sum of what we can separately create on our own. 

Yet “collaboration” is a pretty fuzzy term, so designing structures in support of collaboration requires a more detailed taxonomy, one that allows us to decompose collaboration into its component parts, or modes if you will. I’ve been searching for such a MECE taxonomy for quite a while and was delighted to come across Davi Gabriel da Silva and Rodrigo Bastos’ work, which comes pretty close to the goal:

O2: Organic Organizations — Open-source practices for self-management

While wrapped up in a progressive/”teal”/self-manage-y context, the section about meeting modes stands on its own, and its conceptual applicability doesn’t depend on how “progressive” the organization is. Especially if we take a step back and realize that “meeting” is a label we use to describe a collaborative interaction, so “meeting modes” are synonymous with “collaboration modes”. Da Silva and Bastos identify 5 key modes:

Review Work

Oftentimes also referred to as a “retrospective”, this mode is aimed at building a shared understanding of where we stand.

Sync Efforts

Making peer-to-peer requests to provide information, deliverables or help is an essential part of collaborative efforts. 

Adapt Structure

Since collaboration takes place in a dynamic environment, there needs to be a mechanism for changing the way responsibilities are divided to best meet the changing conditions.

Select People

These dynamic “containers of responsibilities” need to be dynamically filled by individuals as both the needs of the group and the needs of the individuals change.

Care for Relationships

A collaborative effort carried out by humans needs to account for our humanity. This mode aims to develop communication, recognize individual needs and nurture openness among collaborators.

In Target Teal’s open-source “pattern library” you can also find more specific examples for how to facilitate each one of the collaboration modes. 


Service-as-a-Benefit (SaaB)

Breaking the zero-sum game

Over the past few years, the employer benefits ecosystem has experienced exponential growth. The inventory of available benefits goes far beyond the core, non-taxable set of medical insurance, 401(k)s and commuter benefits to include a wide range of additional services. Childcare, financial planning, physical therapy, coaching, fertility treatments, and many more can now be offered as an employer-sponsored benefit.

This trend of packaging a consumer service as an employer benefit, Service-as-a-Benefit, or SaaB for short (corresponding to Software-as-a-Service/SaaS), is fueled by tailwinds on both sides of the marketplace. As the competition for top talent continues to heat up, employers look at unique benefits as a way to tap into employees’ mental accounting, differentiate their employee value proposition and bring their corporate values to life. On the provider side, new entrants, in particular, are looking for effective growth strategies and are drawn to the allure of the corporate channel and its promise of acquiring a large group of users, en masse, while reducing overall acquisition costs, and securing a more reliable source of revenue.

Hitting the “Business Case” wall

But what at first seems like a simple win-win, as providers see good traction with enthusiasts and early adopters, reveals its more complex nature when providers try to establish a stronger foothold in the market.

Benefit administrators without a strong affinity to a particular service find themselves between a rock and a hard place. On the one hand, a growing abundance of services to choose from. On the other hand, a very heterogeneous value proposition to their employees: while parents may find childcare, for example, absolutely essential, childless employees will find it useless. The important and deliberate investments in building more diverse workforces and growing organizational geographical footprints compound the latter even further, as a more diverse and global workforce has a more diverse set of employee needs.

Given a fixed per-employee benefits budget, provider selection becomes a zero-sum game: choosing to offer coaching-as-a-benefit means not offering financial-planning-as-a-benefit. And how is one supposed to compare coaching to financial planning, especially taking the heterogeneity in value into account?

Within this paradigm, the only way to break out of the zero-sum game is by making the business case for an overall increase in the benefits budget given the intrinsic value of a particular service. And that business case often proves incredibly difficult to credibly make.

The speedy early traction grinds to a crawl, if not a complete halt.  

Back to Win-Win(-Win)

There is, however, another path for turning the zero-sum game into a win-win(-win) by redesigning the way benefits management works across providers, administrators, and employees. 

Service-as-a-benefit providers need to shift from lump-sum per-org-size or per-employee pricing schemes to per-activated-employee or per-usage pricing schemes, so they only get paid when an employee has opted to use the service.

Benefits administrators need to both curate a portfolio of service-as-a-benefit providers that’s strategically aligned with their intended positioning, and provide their employee base with transparent individual budgets to allocate across the portfolio. The preliminary curation is essential both to preventing erosion in mental accounting and the perceived value of the benefits, and to avoiding an unreasonable cognitive load on employees forced to choose from a nearly endless array of options.

Employees will then be responsible for constructing a benefits package that best suits their needs, and will have the option of modifying the package on a set cadence (quarterly seems reasonable) to reflect any changes in their personal needs and life circumstances.
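To make these mechanics concrete, here’s a minimal sketch of the proposed flow. All prices, budgets, service names, and employee names are made up for illustration:

```python
# A curated portfolio with monthly per-activated-employee prices (made up):
portfolio = {"coaching": 120, "financial_planning": 80, "childcare": 400}
BUDGET = 450  # transparent individual budget per employee, per month

def package_cost(choices):
    """An employee builds a package from the curated portfolio, within budget."""
    cost = sum(portfolio[service] for service in choices)
    if cost > BUDGET:
        raise ValueError(f"package costs {cost}, over the {BUDGET} budget")
    return cost

# Providers get paid per activated employee, not per org size:
enrollments = {"ana": ["coaching", "financial_planning"], "ben": ["childcare"]}
provider_revenue = {}
for services in enrollments.values():
    for service in services:
        provider_revenue[service] = provider_revenue.get(service, 0) + portfolio[service]
```

The key property of this design is that a provider whom no employee activates costs the administrator nothing, which is exactly what dissolves the zero-sum provider-selection problem.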

This reconfiguring of the ecosystem also poses an interesting business opportunity in the form of a platform for bringing all three parties together and potentially providing the following services: 

  • Enable providers to easily interact with a large group of benefits administrators and streamline the handling of both contracting and payments.
  • Enable benefits administrators on one side to interact with a large group of providers and easily create their service-as-a-benefit portfolios, and on the other side to set individual benefits budgets. 
  • Enable employees to manage their personal benefits budgets, build and modify their benefits package and onboard onto/enroll in the specific service-as-a-benefit that they selected. 

Now all we need is that platform…


The consequences of over-simplification

We’re going deep into science today; fasten your seatbelts.

I came across a couple of really interesting articles recently that call into question, in profound ways, our assumptions about how our world works:

How ergodicity reimagines economics for the benefit of us all by Mark Buchanan

The Flawed Reasoning Behind the Replication Crisis by Aubrey Clayton

While the former looks at decision-making, the latter looks at statistical analysis. Both challenge the assumptions that underlie the common methods. And in both cases, the culprit is an error of omission, ignoring some data about the real world, which causes the method to yield sub-optimal, if not completely flawed, conclusions.

Ergodicity Economics 

Ergodicity Economics highlights the challenges with the assumption that people use an “expected utility” strategy when making decisions under conditions of uncertainty. 

The expected utility strategy posits that given a choice between several options, people should choose the option with the highest expected utility, calculated by weighting the value of each possible outcome by its probability and summing across outcomes.


The challenge with this strategy is that it ignores an important aspect of real life — time. Or more specifically, the fact that life is a sequence of decisions, so each decision is not made in isolation, it takes into account the consequences of the decisions that were already made and the potential consequences of the decisions that will be made in the future. 

This has some profound implications for cooperation and competition and the conditions under which they are beneficial strategies. Expected utility suggests that people or businesses should cooperate only if, by working together, they can do better than by working alone. For example, if the different parties have complementary skills or resources. Without the potential of a beneficial exchange, it would make no sense for the party with more resources to share or pool them together with the party who has less. 

But when we expand the lens to look not just at a single point in time but a period of time in which a series of risky activities must be taken, the optimal strategy changes. Pooling resources provides all parties with a kind of insurance policy protecting them against occasional poor outcomes of the risks they face. If a number of parties face independent risks, it is highly unlikely that all will experience bad outcomes at the same time. By pooling resources, those who do can be aided by others who don’t. Cooperation can be thought of as a “risk diversification” strategy that, mathematically at least, grows the wealth of all parties. Even those with more resources do better by cooperating with those who have less. 
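A quick simulation makes the risk-diversification argument concrete. This is a minimal sketch of the multiplicative coin-toss game often used to illustrate ergodicity economics; the 1.5×/0.6× payoffs are standard illustrative numbers, not taken from Buchanan’s article:

```python
import random

def simulate(n_players, pooling, steps=2000, seed=42):
    """Each step, every player's wealth is multiplied by 1.5 (heads) or
    0.6 (tails), independently. The per-step expected factor is 1.05 > 1,
    but the time-average growth factor is sqrt(1.5 * 0.6) = sqrt(0.9) < 1,
    so a lone player's wealth decays over time. Pooling and splitting
    wealth equally every step diversifies the risk away."""
    rng = random.Random(seed)
    wealth = [1.0] * n_players
    for _ in range(steps):
        wealth = [w * (1.5 if rng.random() < 0.5 else 0.6) for w in wealth]
        if pooling:
            mean = sum(wealth) / n_players
            wealth = [mean] * n_players  # everyone shares, every step
    return sum(wealth) / n_players

solo = simulate(10, pooling=False)    # typically collapses toward zero
pooled = simulate(10, pooling=True)   # typically grows
```

With these parameters every player faces identical odds and a positive expected gain each round, yet the solo players almost surely end up ruined while the pooling group grows: cooperation as risk diversification, in code.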

Bayesian Inference

Consider the following story (paraphrased from Clayton’s piece): 

A woman notices a suspicious lump in her breast and goes in for a mammogram. The report comes back that the lump is malignant. She needs to make a decision on whether to undergo the painful, exhausting and expensive cancer treatment and therefore wants to know the chance of the diagnosis being wrong. Her doctor answers that these scans would find nearly 100% of true cancers and would only misidentify a benign lump as cancer about 5% of the time. Given the relatively low probability of a false positive (5%), she decides to undergo the treatment. 

While the story seems relatively straightforward, it ignores an important piece of data: the overall likelihood that a discovered lump will be cancerous, regardless of whether a mammogram was taken. If we assume, for example, that about 99% of the time a similar patient finds a lump it turns out to be benign, how would that impact her decision?

This is where Bayes’ Rule comes to our rescue: 

We’re trying to find P(A|B) which in our case is P(benign|positive result)

P(A) = P(benign) = 99% (the new data we just added), and therefore P(malignant) = 1 − P(benign) = 1%

P(B|A) = P(positive result|benign) = 5%, the false-positive rate the doctor quoted.

The doctor also told us that P(positive result|malignant) = 100% 

Which then helps us find P(B) = P(positive result), decomposed via the law of total probability: P(positive result|benign)*P(benign) + P(positive result|malignant)*P(malignant).

Now we can plug everything into Bayes’ rule to find that:

P(benign|positive result) = (0.05*0.99)/(0.05*0.99+1*0.01) = approx 83%

So the likelihood of a false positive is about 16 times higher than what we thought it was. Would you still move forward with the treatment?
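The arithmetic above is easy to sanity-check in a few lines, using only the numbers from the story:

```python
def p_benign_given_positive(p_benign=0.99,
                            p_pos_given_benign=0.05,
                            p_pos_given_malignant=1.0):
    """Bayes' rule: P(benign | positive) =
    P(positive | benign) * P(benign) / P(positive)."""
    p_malignant = 1.0 - p_benign
    # Law of total probability for the denominator:
    p_positive = (p_pos_given_benign * p_benign
                  + p_pos_given_malignant * p_malignant)
    return p_pos_given_benign * p_benign / p_positive

print(round(p_benign_given_positive(), 3))  # 0.832
```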

Clayton’s case is that this “error of omission” in the analysis extends beyond life-and-death situations like the one described above and into the broader use of statistical significance as the sole method for drawing statistical conclusions from experiments.

In Clayton’s view, this is one of the root causes of the replication crisis that the scientific community is now facing, beautifully illustrated by the following example:

In 2012, Professor Norenzayan at UBC had 57 college students randomly assigned to two groups, each of which was asked to look at an image of a sculpture and then rate their belief in God on a scale of 1 to 100. The first group was asked to look at Rodin’s “The Thinker” and the second at Myron’s “Discobolus”. Subjects who had been exposed to “The Thinker” reported a significantly lower mean God-belief score of 41.42 vs. the control group’s 61.55, or a 33% reduction in belief in God. The probability of observing a difference at least this large by chance alone was about 3 percent. So he and his coauthor concluded that “The Thinker” had prompted their participants to think analytically and that “a novel visual prime that triggers analytic thinking also encouraged disbelief in God.”

According to the study, the results were about 12 times more probable under an assumption of an effect of the observed magnitude than they would have been under an assumption of pure chance.

Despite the highly surprising result (some may even say “ridiculous” or “crazy”), since it was “statistically significant” the paper was accepted for publication in Science. An attempt to replicate the same procedure with almost ten times as many participants found no significant difference in God-belief between the two groups (62.78 vs. 58.82).

What if instead we took a Bayesian approach, assumed that the general likelihood of viewing Rodin’s “The Thinker” inducing disbelief in God is 0.1% (and therefore a corresponding “no change in beliefs” likelihood of 99.9%), and then figured out what P(disbelief|results) is?

We know from the study that P(results|disbelief) = 12*P(results|no change). Plugging this into Bayes’ Rule we get:

P(disbelief|results) = 12 * 0.001 / (12 * 0.001 + 1 * 0.999) = 0.012 / 1.011 = approximately 1.2%, a far cry from the originally stated 33% reduction…
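The same Bayes’-rule arithmetic in code, where the 0.1% prior is the illustrative assumption made above rather than a measured quantity:

```python
# Bayes' rule with the study's likelihood ratio of 12 and an assumed
# (illustrative) 0.1% prior that viewing the sculpture shifts belief.
prior = 0.001
likelihood_ratio = 12  # P(results | disbelief) / P(results | no change)
posterior = (likelihood_ratio * prior) / (likelihood_ratio * prior + (1 - prior))
print(round(posterior, 4))  # 0.0119
```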


Tooling Collaboration [Kwok]


A friend shared this fascinating read by Kevin Kwok

The Arc of Collaboration

It’s a thoughtful thesis on the state of enterprise/business tools and the apparent tension between productivity and collaboration, as epitomized by the use of Slack. 

Starting with Slack as the jumping-off point, Kwok acutely observes that, currently, Slack usually serves three main functions: 

1. “Else” statement. Slack is the exception handler, when specific productivity apps don’t have a way to handle something. This should decrease in usefulness, as the apps build in handling of these use cases, and the companies build up internal processes.

2. Watercooler. Slack is a social hub for co-workers. This is very important, and full of gifs.

3. Meta-coordination. Slack is the best place for meta-levels of strategy and coordination that don’t have specific productivity apps. This is really a type of ‘else statement’, but one that could persist for a while in unstructured format.

The first one is worth digging into. In essence, the prominence of Slack is a result of existing business tools not supporting some essential collaboration capabilities. Slack fills these collaboration gaps (as well as supports 2 & 3 above). To use Kwok’s eloquent yet slightly hyperbolic metaphor:

Slack is not air traffic control that coordinates everything. It’s 911 for when everything falls apart.

And we are seeing some of those functional gaps starting to get closed by native, in-app capabilities. All “GitHub for X” collaboration-first products fall into that category. Kwok highlights Figma (Design), to which I’d also add Abstract (Design) and GitBook (Docs), to show that this is more than a single-product trend. The ripples in the collaboration pond originated in engineering (GitHub) and permeate outward, starting with engineering-adjacent roles (designers, technical writers), with the first ripples reaching as far out as HR.

But there seems to be a huge missed opportunity here, since the need for collaboration is function/work-product agnostic. By building in-app collaboration capabilities the walls between apps become higher and cross-app interoperability becomes harder. Not to mention the increased cognitive load on the user. 

Kwok looks at Discord as a different potential direction/inspiration: a tool that provides a set of collaboration capabilities (text and voice chat) across a set of functional workflows (games), stemming from a similar collaboration gap caused by poor in-game chat capabilities. While the analogy has its limits, it does highlight an exciting path forward.

The depth of desired collaboration differs between an MMORPG and a business app (if you need a mental image: think a PPT/Keynote/Gslides presentation): while the former often stops at coordination, in the latter we aspire to co-creation. This has a couple of concrete implications:

1. Functionality-wise, lofty “collaboration” can be decomposed into a set of shared cross-product capabilities:

This list is probably incomplete, but the immediate capabilities that come to mind are: 

  • Unique user identity — mostly already solved today via “login-with-Google” type solutions. 
  • Hyper-granular permissions — providing edit/comment/view access to a subset of the work product. Think a single slide in a deck or a set of rows in a spreadsheet. 
  • Synchronous and asynchronous co-creation — I’m making a distinction here between “two people writing in a GDoc at the same time” and “Person A using the “suggest” feature to offer specific changes to Person B”. Both are needed. Both are relevant across all work products. 
  • Versioning — used here as a catch-all for the entire GitHub functionality including but not limited to: fork, branch, pull request, diff, and partial/selective merge. 
  • In-line commenting — having a dialogue about a specific piece of the work-product
  • Decision-making — decision-making is used here in the narrow context of making decisions about changes to the work-product, not in the broad context of making business decisions facilitated by the work product. Today, only a single decision-making mode is supported, by some of the products: a single autocratic “owner” accepting changes made by “collaborators”. 

2. An ideal solution will seamlessly integrate with the existing functional UIs

The breadth of work-products that we want to collaborate on justifies the existence of multiple “creation/authoring” UIs. Any UI that tries to reduce collaboration on code snippets, presentations, and spreadsheets, to name just a few work-products, into a single UI will most likely do so at the expense of the depth of collaboration it can support. An ideal solution will look more like Grammarly’s browser extension, which provides cross-product writing quality support, or like a “collaboration SDK” which multiple products can adopt.
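To ground the capability list, here’s a minimal in-memory sketch of two of those pieces, hyper-granular permissions and asynchronous “suggest”-style co-creation. All class and method names are illustrative, not any real product’s API:

```python
class Fragment:
    """A granular piece of a work product (a single slide, a spreadsheet range)."""

    def __init__(self, fragment_id, content):
        self.id = fragment_id
        self.content = content
        self.permissions = {}   # user -> "view" | "comment" | "edit"
        self.suggestions = []   # pending (author, proposed_content) pairs

    def grant(self, user, level):
        # Hyper-granular permissions: scoped to this fragment, not the whole doc.
        self.permissions[user] = level

    def suggest(self, user, proposed_content):
        # Asynchronous co-creation: propose a change without applying it.
        if self.permissions.get(user) not in ("comment", "edit"):
            raise PermissionError(f"{user} may not suggest changes to {self.id}")
        self.suggestions.append((user, proposed_content))

    def accept(self, index):
        # The single decision-making mode most products support today:
        # an autocratic owner accepting collaborators' changes.
        _, proposed_content = self.suggestions.pop(index)
        self.content = proposed_content

slide = Fragment("deck-7/slide-3", "Q3 revenue: TBD")
slide.grant("ana", "comment")
slide.suggest("ana", "Q3 revenue: $4.2M")
slide.accept(0)
```

A cross-product “collaboration SDK” would expose capabilities like these uniformly, so a slide, a spreadsheet range, or a code snippet could all be shared, suggested on, and merged the same way.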


The score takes care of itself [Walsh/Rekhi]

Continuing the accountability/collaboration arc of the past few months, today I’m building on Sachin Rekhi’s insights from reading Bill Walsh’s The Score Takes Care of Itself.

A key motivation for implementing a traditional organizational goal-setting/OKR system is often to use it as a mechanism to drive accountability. Therefore, the push-back against eliminating or overhauling such a system is often the concern: if we don’t ask people to set goals, what is it that we will hold them accountable for?

Walsh, via Rekhi, offers a compelling alternative: a Standard of Performance.

The Standard of Performance clearly delineates what excellence looks like in each role. It includes each of the skills that someone who is excelling at the role is expected to have. And even beyond job skills, it includes the attitude that’s expected of each individual as well as interpersonal dynamics.

In a sense, I’m thinking about it as an expanded/modified career ladder that focuses on those three elements: skills, attitude, and interpersonal dynamics. Under such a system, people are held accountable to the behaviors that eventually lead to long-term success, rather than to a defined outcome with a fixed time horizon.

The one aspect where my views differ from what Walsh/Rekhi advocate is who defines these Standards of Performance for each role. In their opinion, it should be the coach/manager, but they also acknowledge that this approach poses a real challenge:

Bill expects leaders to be functional experts in the roles on their team in order to develop the Standard of Performance. These leaders are not just people managers. They are the very best at what they do… But Bill admits that to do this well, you need to possess incredible knowledge and develop expert intuition in your domains of expertise. And this takes a lifetime of experience to hone and develop. There are no shortcuts in Bill’s leadership approach.

In my opinion, this falls into the “unicorn manager trap”: we’re looking for people who are not only highly competent and motivated people managers, but also functional domain experts capable of defining the standard of performance for each of the roles that report into them? Good luck finding them…

The good news is that I don’t think this is a hard requirement, and an alternative can actually move us along the path of unbundling the managerial responsibilities package: the skills piece of the Standard of Performance is a function, well, of the function (or role). I believe it should be identical for people doing the same role in different organizations, and there’s no need for each organization to reinvent the wheel here. I do expect inter-org (but not intra-org) variability in the attitude and interpersonal pieces, since those should reflect the company values/culture. However, those should be defined at the company (not team) level and co-created in a participatory process.

Is this the end-all-be-all solution for humanistic accountability? No. But certainly a piece of the puzzle. 


The Silent Meeting Manifesto [Gasca]

Continuing another arc that I’ve explored here in previous posts, first in 2014 and more recently at the end of last year, I came across David Gasca’s The Silent Meeting Manifesto.

To get the meta nerd-out out of the way first: this arc is a really cool example of the evolution/adoption of an organizational practice, from “here’s this unusual practice that Amazon uses and seems to be working really well for them” (2014), through “we’ve tweaked and adopted this practice in our company and it seems to be working well for us as well” (2018), to “here’s the manual/playbook for how to implement this practice in your company” (now).

Since Gasca’s post is a bit verbose (Medium estimates the reading time at 26 minutes), here’s a quick summary/teaser that’ll hopefully convince you to commit the time to read the full thing:

Silent meetings aim to address these 10 challenges with the traditional (“loud”) meeting format: 

  1. No agenda
  2. No shared reading material for the whole group
  3. Unequal time-sharing
  4. Bad presentations — too slow, too fast, and meandering
  5. Most meeting attendees don’t comment
  6. Reading is faster than listening
  7. Favors native speakers
  8. Bad for remote attendees
  9. Rambling questions
  10. Comments from the meeting often get lost and aren’t captured in any doc

Silent meetings address these challenges by making 4 key changes to the way meetings are typically run: 

  1. Ahead of the meeting, create a basic agenda that defines the meeting goals, non-goals, and format, and explicitly appoints a meeting facilitator and a meeting note-taker (two different people). 
  2. Create a “table read” document that will be read during the meeting in order to provide all participants with the shared context needed to accomplish the meeting’s goals. This differs from a “pre-read” in that there’s no expectation that the document will be read ahead of the meeting. 
  3. Read and comment on the “table read” — this is done during the meeting. Participants read the doc and post comments and questions in the doc. Then they read other participants’ comments and questions. The facilitator monitors the process and tags specific individuals that are best equipped to answer particular questions. 
  4. Facilitator synthesizes comments and leads discussion — identifying the themes in the comments, the facilitator triages them and leads a discussion on the themes that will make the best use of the attendees’ time. 

The silent meeting format has its boundaries. It will not work well when the meeting aims to address interpersonal dynamics, be an inspirational talk, or cover a broad/multi-issue agenda. It also doesn’t address more systemic meeting issues, such as whether there’s a true need for a meeting to begin with, whether the right attendees are in the room, and whether there’s a clear decision-making process.

A good “Table Read” is a vertical document (Word/Gdoc) rather than a horizontal one (PPT/Keynote), often covering the following topics: 

  • Meeting agenda
  • Problem/Situation background
  • Solution principles/parameters
  • Options identified that can solve the problem
  • Recommendation
  • Discussion questions
  • FAQs
  • Appendix/ add’l info

Lastly, a silent meeting is not without pitfalls, particularly around the key interventions/format changes that it introduces: bad facilitation, low-quality table read, or ineffective handling of comments (including lack of follow-through) will likely lead to a silent meeting not accomplishing its goals. 
