Should you be using AI for performance reviews?
During the last decade, digital innovations have produced a range of recruitment and evaluation tools: whenever you apply for a job today, you are less likely to be judged by humans and more likely to be assessed by AI. Before you even get the opportunity to impress a human interviewer, you will first need to impress the algorithm!
More recently, AI has also been used to help current employees do their jobs, and then to help their employers evaluate how well they are performing in those jobs. In fact, AI adoption is now the norm across knowledge-economy jobs: estimates indicate that at least 70% of people use AI regularly at work (a figure that probably understates reality, since much workplace AI use is clandestine and undisclosed), and an increasing number of organizations are using AI to evaluate employee performance.
Meritocratic or Orwellian?
Traditional performance evaluations (often an onerous annual ritual based on subjective, “noisy,” and unreliable or invalid manager feedback) are indeed being disrupted by algorithms capable of analyzing workflows, communication patterns, and even “relational analytics” (mining the digital footprints of your exchanges with coworkers) in real time, a shift that critics lament as a form of “surveillance capitalism.”
To be sure, these tools put unprecedented power in the hands of organizations to pursue data-driven management decisions. At their best, such decisions can make workplaces fairer and more meritocratic; at their worst, they come uncomfortably close to an Orwellian Big Brother dystopia and can erode trust and morale.
To make sense of AI in performance management, it helps to imagine a simple matrix with four quadrants or scenarios, which echoes the classic negotiation model by Roger Fisher and William Ury on win-win outcomes, as well as decades of behavioral science differentiating integrative from zero-sum approaches to conflict. In one scenario, the company and the employee both win. In another, only the company wins. In a third, employees learn to game the system to their benefit but not to the company’s. And in the worst case, nobody benefits at all.
First scenario: AI helps both the company and the employee. Let’s start with the best quadrant of the matrix. Used well, AI can make feedback fairer and more useful. Anyone who has ever received a vague appraisal knows the problem, and meta-analytic studies show that only one-third of feedback is typically useful, one-third is useless or irrelevant, and one-third actually worsens employees’ performance! Add to this the typical unreliability of performance evaluations, which are usually highly subjective: one manager loves your enthusiasm; another thinks you talk too much; a third simply remembers the last mistake you made; a fourth has no idea who you are, and so on. In other words, performance evaluation has historically been closer to subjective wine tasting than to objective science.
AI, if properly used and validated, can anchor feedback in observable behavior rather than impressions. A sales manager might see which client interactions actually led to repeat business on her sales team. A project leader might learn that delays happen when approvals pile up on his desk. Instead of impatiently waiting for an annual review to learn how their performance may be perceived, employees get real-time feedback and suggestions. The process becomes closer to coaching than judging. This is where the promise of AI is most compelling. It democratizes the collection and distribution of feedback and suggestions. It replaces guesswork with data. It never forgets and it can make employee evaluation performance-driven rather than political.
Second scenario: AI helps the company but harms employees. The same tools can quickly slide into surveillance. Algorithms now analyze workflow, communication patterns, tone of voice, and even what some vendors call “relational analytics.” A decrease in typing speed may be interpreted as disengagement. A change in Slack sentiment might flag someone as “skeptical” or “cynical.” Tracking and penalizing irregular working hours could covertly disadvantage parents or people with health issues. Voice or facial analysis might infer emotional states or physical conditions that employers may actually be legally prohibited from determining or diagnosing. What begins as an effort to measure performance may become a digital panopticon. Employees feel watched rather than supported. Trust erodes over the long term, even if productivity appears to rise in the short term. As is often the case, European countries are among the first to provide legislative protections (https://natlawreview.com/article/ai-news-italy-sets-rules-ai-workplace) for employees to safeguard against this scenario.
Third scenario: Employees benefit but the company loses. People are not passive. When employees realize they are being judged by an algorithm, they learn to reverse-engineer it to their benefit. Anyone who has worked in a call center (or even just called in to one) has seen this dynamic. If the AI rewards a cheerful tone, everyone becomes artificially upbeat, even if the service reps’ saccharine affect creeps out callers. If AI rewards high email volume, outboxes and inboxes fill up with unnecessary messages. Teachers teach to the test. Students memorize without understanding. In offices, people optimize for metrics instead of outcomes. Real collaboration moves to private channels, and official data becomes less truthful than before. AI ends up measuring performative theatre rather than real value added, and employees learn to use AI to generate perfect performance reviews and fake productivity signals that deceive their employers, setting progress back by decades.
Fourth scenario: Nobody benefits. The worst outcome is multilateral mistrust. Managers hide behind dashboards that they cannot explain. Employees treat feedback as noise. Performance reviews become bureaucratic “check the box” exercises completed with minimal attention. “We pretend to work, and they pretend to pay us” was a cynical workers’ slogan decades ago in the Soviet Union. Perhaps “We pretend to evaluate our performance and they pretend to evaluate us” could be the contemporary equivalent when appraisals are essentially “AI slop.” When a manager says, “the system gave you this rating,” leadership has effectively abdicated responsibility. Organizations can collect terabytes of data that predict nothing useful. Employees disengage. Trust and morale decline. We have seen versions of this before with poorly designed assessments or unvalidated tools. Technology does not eliminate bad management; it can scale it. And in this scenario, even as “the system” remembers everything that is fed into it, managers and employees quickly disregard and forget everything that comes out of it.
What to do
What, then, should leaders do? The principles are simple, though not easy. Validate before you automate. Ask whether a metric predicts real performance or just activity. Be transparent about what data is used, how, and why. Ensure that the system can be audited for how it maps inputs into outputs and is not an inscrutable “black box.” Keep humans in the loop so context is not lost. Do not collect or consider private information, even if technology can infer it. Don’t just optimize for operational metrics or output, but also for morale and engagement. And finally, let AI provide feedback not just to employees, but also to managers and HR about what can be done to lay the foundation for greater employee success in the future.
During the last decade, as algorithms and AI have become central to talent decisions and as estimates suggest that the overwhelming majority of people use AI at work, the temptation to measure everything has grown. As the line often attributed to Einstein reminds us, not everything that counts can be counted, and not everything that can be counted should count. AI can make performance management either more like good coaching or more like constant surveillance. The difference lies not in the technology, but in how wisely managers, employees, and organizations choose to use it. AI should not just be used to evaluate employees within an organizational system; it should also evaluate the system in which the employees are working and offer constructive observations and recommendations that can enhance individual, team, departmental, and company success.
Importantly, there is still much to preserve from the art of good performance appraisals, which long predate AI and often work precisely because they are human. When a manager and employee co-create clear, measurable goals at the start of the year, everyone gains clarity about what success looks like (and makes a cognitive and emotional investment in achieving that success) and fewer surprises or disappointments emerge later. When feedback is specific, timely, and anchored in real achievements or failures, such as a difficult client negotiation, a failed product launch, or a junior colleague you coached getting promoted, employees learn what to repeat and what to fix, and managers see capability rather than just output. And when reviews include a forward-looking development plan, perhaps rotating someone into a new market, funding an additional training program, or pairing them with a mentor, the organization invests in future value while the employee sees a credible path for growth. These simple practices succeed not because they are high tech, but because they align incentives, create shared holistic understanding, and turn managers into competent people-leaders. When used properly, AI can enhance and accelerate the successful co-evolution of systems and all of their stakeholders.