Build a Performance Review System Your Team Trusts

Host: Nahed Khairallah

Today’s episode is based on a question I received from one of our listeners who asked to remain anonymous, and it is a good question:

"How do you set up a great performance review? I am especially curious about the science of peer feedback as part of a 360. From what I have read, it is not helpful and can be counterproductive. But if that is the case, how do you encourage a culture of feedback in a proactive and positive way?"

There are 2 questions in there. How do you build a review that works? And what do you do about peer feedback if peer feedback is broken?

Here is where I want to start. I have built and rebuilt performance systems for around 50 startups. I have been part of 360 feedback reviews that turn into a polite round of compliments that nobody believes or gets any value from. I have also seen 360 reviews that fundamentally changed how someone worked.

Peer feedback can be very effective. Whether it is comes down to how you build it.

And the research behind the listener's question shows that peer feedback is conditional. It works when a specific set of conditions is in place, yet most companies never build those conditions. So they get the bad version, and then they write off the whole idea.

That is what this episode is about. What makes a review work, why most of them fail, where peer feedback and 360s actually fit, and how to build the feedback culture that has to exist underneath all of it.

What Performance Management Is For

Most companies start with the review template. They download some template, they tweak it to their liking, then they roll it out. That's the wrong place to start.

Performance management exists to answer 3 questions:

  1. Are we hitting the goals we set as a company?
  2. Are the people in their roles contributing to those goals?
  3. Are people showing the behaviors that let us win, given who we are and how we operate?

If your review process doesn't help you answer those 3 questions, it just becomes an administrative ritual.

Hold onto those questions, because every failure I'm about to walk through is due to a review process that lost track of one of them.

Start by Defining Good and Bad

The first real part of a working performance system is also the part companies skip most often. You have to define, in advance, what good performance looks like, what unacceptable performance looks like, and the range in between.

And you have to tie those definitions to things people can see or measure. That means observable behaviors, concrete outcomes, and numbers that aren't up for debate.

Where companies go wrong is that they communicate vague performance standards, get specific only at review time, rely on what’s in the manager's head, and put the whole process under time pressure. That's not how to run an effective performance review.

The research on this topic is some of the most settled in all of organizational psychology. Edwin Locke and Gary Latham spent 35 years on goal-setting, over roughly 400 studies, and the finding holds up about 90% of the time. Specific goals that are challenging produce better performance than vague goals. They provide standards, they cut ambiguity, and people know what they're aiming at.

Now, we can’t talk about performance reviews without discussing the objective versus subjective issue.

You can't remove all judgment from a performance review. Pure objectivity doesn't exist in one, and pure metrics get gamed and miss the collaborative work that never shows up in a number.

Trust breaks when the judgment behind a rating is vague and inconsistent. You solve this by tying the performance standard to behaviors that people can observe and outcomes you can measure. There's a formal tool for this called a behaviorally anchored rating scale, or BARS for short. The purpose of BARS is to anchor manager judgment to a consistent behavioral standard that can be observed on the job.

Let’s take a customer support rep as an example. The ineffective version of a rating is a 1-to-5 scale next to a word like "responsiveness," where a 3 from one manager and a 4 from another don't mean the same thing, and the rep has no idea what their rating means or what it would take to move up. The BARS version anchors the rating in behavior. For example, a 5 looks something like this: the rep acknowledges new tickets within 30 minutes during business hours, resolves or escalates 90% of tier-1 tickets the same business day, and writes handoff notes the next rep can pick up without any additional information. A 3 could look like this: the rep acknowledges new tickets within 2 hours, resolves them within 24 hours, and leaves notes that need some clarification. A 1 could look like this: tickets sit unacknowledged for more than 4 hours, and the next rep has to track down the original rep to understand the context. Every level now points to something specific the manager and the rep can both see, and the rep knows exactly what their rating means and why they received it.
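
One way to keep anchors like these from drifting is to write them down as data, so every rating traces back to a threshold someone can check. Here is a minimal Python sketch of that idea. The names (Anchor, rate_responsiveness) are hypothetical, and it only covers the acknowledgment-time part of the example; resolution time and handoff notes would be added the same way. The example only defines levels 5, 3, and 1, so anything that lands between written anchors comes back as None and is left to the manager to judge and explain.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Anchor:
    level: int               # rating on the 1-to-5 scale
    max_ack_minutes: float   # slowest acknowledgment time that still earns this level
    description: str         # the observable behavior, in plain words


# Thresholds come from the support-rep example; the structure is illustrative.
ANCHORS = [
    Anchor(5, 30, "acknowledges new tickets within 30 minutes"),
    Anchor(3, 120, "acknowledges new tickets within 2 hours"),
]
UNACCEPTABLE_AFTER_MINUTES = 240  # "unacknowledged for more than 4 hours" is a 1


def rate_responsiveness(ack_minutes: float) -> Optional[int]:
    """Return the anchored level, or None when no written anchor applies."""
    for anchor in ANCHORS:  # ordered best-first
        if ack_minutes <= anchor.max_ack_minutes:
            return anchor.level
    if ack_minutes > UNACCEPTABLE_AFTER_MINUTES:
        return 1
    return None  # between anchors: the manager has to judge and explain


print(rate_responsiveness(20))   # 5
print(rate_responsiveness(90))   # 3
print(rate_responsiveness(180))  # None
print(rate_responsiveness(300))  # 1
```

The None case is deliberate. It shows exactly where the written standard runs out, which is a prompt to add another anchor instead of guessing.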

At a startup you don't need a complex BARS system. You need 3 or 4 written examples of what the behavior looks like at each level of performance, specifically for the thing that matters in the role.

When people trust the standard and trust the outcome, your performance system has value. Otherwise, you get the annual song-and-dance ritual that you see at most companies.

Why Most Performance Reviews Fail

Let me walk through the 3 main reasons why most reviews fail.

The first one is that a large share of what ends up in a performance rating is about the manager’s preferences and not the employee’s performance. Research by Scullen, Mount and Goff is widely cited for this finding: if you took the same employee and had 10 different managers rate them, you'd get 10 different scores, and well over half of the gap between those scores would come from the managers' own subjective preferences, not from the employee's actual performance. For example, Manager 1 is a tough grader who rates everyone low. Manager 2 weights communication heavily because that's what they personally care about. Manager 3 is generous if you say yes in meetings, and Manager 4 is generous if you deliver work quickly. In short, the review score tells you more about what the manager values than about the employee’s impact.

The second one is recency. Without a written record across the whole period, a manager remembers the last few weeks clearly and the first 9 months barely at all. The review is skewed before the process even starts. This is an area where I think AI is going to have the biggest positive impact on reviews. Humans are incapable of recalling detailed events over several months, so I think that having a tool that can compile all the relevant data across the review period and present it to reviewers in a digestible format will eliminate this problem over time.

The third one is assuming that any feedback is helpful. Two researchers, Avraham Kluger and Angelo DeNisi, analyzed more than 600 effect sizes drawn from studies on feedback, covering more than 23,000 observations in total. They found that feedback helped on average, but more than a third of the time, it made performance worse. When you look deeper, it turns out that feedback about the work tends to help, but feedback about the person, things like their ego or personality, tends to backfire.

According to a study by Gallup, only 14% of employees strongly agree that their performance reviews inspire them to improve. That means the other 86% are going through the motions and don’t trust that there’s any benefit to the review process.

Adobe is the example I'd point a founder to. Before 2012, Adobe was spending more than 80,000 hours a year of manager time on annual reviews. They scrapped the annual review for lighter, more frequent check-ins, and they reported voluntary turnover dropped by about 30%.

It Starts With Leadership

Every failure I just listed comes from the same place: a missing feedback culture. So I'll back up a step.

You can't bolt a review system onto a company that has no feedback culture underneath it. The culture comes first or else no performance management system will work as intended.

Culture starts with leadership. Specifically, it starts with leaders holding themselves to the same standard they hold everyone else to.

Amy Edmondson's research on psychological safety explains how all this connects together. When people feel safe at work, they do more of the things that make teams better, like asking for help, admitting mistakes, and flagging problems early. And teams that do those things perform better.

There's a part of her research that matters here and often gets overlooked. Safety without high standards just makes people coast. The best performance shows up when you have both: high standards and a place where people feel safe to speak up.

In practice, this is when you have leaders asking for feedback from people at every level, including the hard, critical stuff, and then visibly taking it seriously and changing things in response. The first time a leadership team does that for real, the way the whole company gives and receives feedback changes.

And how often you do it matters more than the formal review. Gallup measured this too in the study I mentioned earlier. Employees who hear feedback from their manager every day are 3.6 times more likely to feel motivated to do great work. Weekly feedback makes people 5.2 times more likely to say the feedback they're getting is meaningful. An annual review can't do that. You don’t need to give direct reports daily feedback to build a performance culture, but you should give it frequently enough for it to make a difference. I generally advise companies to implement a minimum quarterly feedback cadence, which means that every manager is expected to give useful feedback to each of their direct reports at least once per quarter.

Where 360s and Peer Feedback Fit

That brings me to the listener's question around the usefulness of peer feedback.

Here's my honest answer from experience that is also backed up by research. Peer and 360 feedback can work, but only under certain conditions. It's not harmful in every case. The strongest evidence on this comes from researchers named Smither, London, and Reilly. They reviewed a body of studies that tracked feedback over time. What they found is that getting feedback from multiple people does help people improve, but the improvement is small, and only shows up when certain conditions are in place. The person getting the feedback has to see a real reason to change, believe they can change, and take action on it. Without those things, not much happens.

There's a clear reason peer feedback so often turns into a round of compliments, or to put it more bluntly, is utterly useless. People soften their feedback for 2 reasons. They're afraid the person will know who said it. And they're afraid that being honest could cost a colleague a promotion, a raise, their standing, or their livelihood.

When people know they can be identified, and the feedback could affect someone's career, they go easy on each other. The feedback ends up softer than what people really think. Anonymous feedback reduces that softening. But it also takes away the back-and-forth conversation that makes feedback useful in the first place.

This is the single most important design choice for peer feedback. If you only take one thing from this episode, take this one. Use peer feedback to help someone get better at their job. Don’t use it to make decisions on pay or promotion. Pay and promotion decisions should be based on impact and results, measured as objectively as possible against the goals you set.

Fluffy peer feedback shows up when development feedback and pay decisions run through the same process. The moment a colleague knows their honest input could affect your raise or promotion, they protect you. Take the raise or promotion impact out of the conversation and the honesty comes back.

The second piece is the discussion. Harvard Business Review made this point in a January 2026 article. If you collect 360 feedback and hand someone a report, almost nothing happens. If you sit down and discuss the feedback with them in a structured conversation, that's where behavior changes.

The third piece is psychological safety. The safety we talked about earlier has to be in place before any of this works. Without it, none of the rest holds up. I want to take this a step further and say that peer feedback works best when there is a strong culture of psychological safety and the feedback is not anonymous. I say this because peer feedback without the context of a conversation is too surface-level to be convincing or to result in actionable changes. When employees can engage with their reviewers and have an honest and transparent conversation that is built on good intent, then and only then can true development take place.

Let’s be real here: most employees are skeptical of 360 reviews, and there’s research to back it up. SHRM has reported that most employees reject 360-degree reviews because of perceived bias. INSEAD has also published research that reached the same conclusion. Employees aren't wrong about what they've experienced. The bias they're describing comes from how most 360s are built. They’re tied directly to pay, and the results are never openly discussed with the employee.

The Trust Stack

Let me pull all of this together into something you can build. I call it the Trust Stack, because trust is what this whole episode comes down to. A review system works when people trust the standard you're measuring against, and they trust the outcome. The Trust Stack has 4 parts and you need to build them in order.

Part 1 is the feedback culture. Leaders have to give and ask for feedback themselves, including the critical stuff, before anyone else will. When they do it visibly and act on what they hear, feedback starts moving in every direction through the company. Every other part of the Trust Stack depends on this one. You can't skip it.

Part 2 is the standard. Write down what good performance looks like, what unacceptable looks like, and what falls in between. Tie it to behaviors people can see and outcomes you can measure. Get it on paper before the review cycle starts.

Part 3 is how often you do it. Give feedback against that standard on a regular cadence, not annually. Keep it short and lightweight. And keep a written record, so when the formal review comes around, it's not built from memory.

Part 4 is peer input, and it comes last. You only add it once the first 3 parts are working properly. When you do add it, follow the guidelines I talked about earlier: keep development separate from pay, and formalize the discussion requirement that makes the feedback useful and actionable.

Closing

So to wrap things up: The question to sit with is whether you've built the conditions that make peer feedback work in your company. Most companies don't build them. That's why most peer feedback disappoints.

Here's what to do this week. Pick 1 role on your team. Write down what good performance in that role looks like, and focus on behaviors you can see. Be specific. Describe what you'd see someone doing if you watched them work. That one exercise will tell you how much of your current review system is useful, and how much is just useless ritual.

Nahed Khairallah
Organized Chaos