Cognitive surrender: how AI is reshaping professional judgement at work

By Mark Williams

New research finds that people using AI have high confidence in its answers, whether those answers are right or wrong.

What is cognitive surrender?

A recent paper by Shaw and Nave (2026) at the Wharton School examines what happens to human reasoning when people work with AI. The dual-process model of cognition, popularised by Daniel Kahneman, distinguishes between two modes of thinking: System 1, fast and intuitive, and System 2, slow and deliberate. Shaw and Nave argue that AI now adds a third system to this account: one that, unlike a calculator or a search engine, participates in reasoning rather than just supporting it.

Across three preregistered experiments with 1,372 participants, the researchers randomised whether the AI assistant gave correct or incorrect answers. When the AI was right, performance rose by 25 percentage points above the no-AI baseline. When it was wrong, performance fell 15 percentage points below that baseline, so participants did worse than they would have with no AI access at all. Access to AI also raised participants’ confidence in their answers, including answers that were incorrect.
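To make the asymmetry concrete, here is a rough back-of-the-envelope sketch in Python. The two effect sizes are the ones reported above. Everything else is an assumption made for illustration: in particular, the idea that the two effects simply combine in proportion to how often the AI is right is not something the paper claims, and the function names are hypothetical.

```python
# Illustrative arithmetic only. The +25 and -15 point effects come from the
# article; the linear mixing rule below is an ASSUMPTION, not a finding.
GAIN_WHEN_RIGHT = 25.0  # percentage points above baseline when the AI is correct
LOSS_WHEN_WRONG = 15.0  # percentage points below baseline when the AI is wrong


def expected_change(ai_accuracy: float) -> float:
    """Expected change versus baseline, in percentage points.

    ai_accuracy is the assumed share of AI answers that are correct (0 to 1).
    """
    if not 0.0 <= ai_accuracy <= 1.0:
        raise ValueError("ai_accuracy must be between 0 and 1")
    return ai_accuracy * GAIN_WHEN_RIGHT - (1.0 - ai_accuracy) * LOSS_WHEN_WRONG


def break_even_accuracy() -> float:
    """AI accuracy at which the expected change is zero (linear assumption)."""
    return LOSS_WHEN_WRONG / (GAIN_WHEN_RIGHT + LOSS_WHEN_WRONG)


if __name__ == "__main__":
    for accuracy in (0.3, 0.5, 0.9):
        print(f"AI accuracy {accuracy:.0%}: {expected_change(accuracy):+.1f} points")
    print(f"Break-even accuracy: {break_even_accuracy():.1%}")
```

Under these assumptions the average effect is positive once the AI is right more than about 37.5% of the time. The average hides the point of the experiments, though: on the occasions when the AI is wrong, participants do worse than they would have done unaided.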

The researchers call this pattern cognitive surrender — the tendency to adopt AI output without critical evaluation, substituting it for reasoning rather than using it to support reasoning.

Cognitive offloading versus cognitive surrender

Shaw and Nave distinguish cognitive surrender from cognitive offloading. Offloading is deliberate: the user delegates a discrete task to a tool while their own thinking remains engaged. Surrender is different — the user accepts an answer without scrutinising it, because it arrived fluently and the threshold for questioning was never reached.

Shaw and Nave found that participants with lower need for cognition and higher trust in AI were most susceptible to cognitive surrender. Those who pushed back on AI output were not the ones who were sceptical of AI in particular — they were the ones with the habit of thinking critically, regardless of the source.

Judgement and decision making at work

The conditions that shape professional judgement have been studied for decades.

Anders Ericsson’s work on deliberate practice established that competence in complex domains is built through sustained, effortful engagement with difficult problems — with feedback, and crucially with failure. The cognitive struggle is not a cost to be minimised. It is the mechanism through which capability is developed and maintained. When AI handles the effortful parts of analytical work — surfacing patterns, drafting interpretations, generating narratives — it removes precisely the kind of engagement through which expertise is built and sustained.

Lisanne Bainbridge identified the same dynamic in industrial automation in 1983. Automating most of a system’s tasks leaves operators responsible for the cases that cannot be automated, while the skills they need at those moments atrophy from disuse. The same pattern is visible in AI-assisted knowledge work. Vicente and Matute (2023), publishing in Scientific Reports, found that exposure to a biased AI led participants to make more diagnostic errors than a control group — and the effect persisted after the AI was removed.

Daniel Wegner’s concept of transactive memory describes the way teams distribute knowledge across members. People have always offloaded cognition to other people, but the AI version of the dynamic is missing a feature that matters: human colleagues hesitate, qualify, and mark their uncertainty in ways that AI typically does not.

The fluency of AI output is not incidental. Brady and colleagues, in a 2025 review in Nature Reviews Psychology, document that large language models are trained through reinforcement learning from human feedback, a process that systematically rewards confident-sounding responses regardless of whether they are correct. The cues that would normally prompt scrutiny — hesitation, qualification, expressed uncertainty — are precisely the cues the training process selects against.

What activates scrutiny of AI

Four strands of research are useful here.

Confidence

A 2025 Microsoft Research study of 319 knowledge workers using generative AI at work found that the more confident a worker was in the AI’s capability, the less critical thinking they applied to its output. Higher confidence in their own ability worked the other way — those workers scrutinised AI output more, not less.

Training

Automation bias — the tendency to favour outputs from automated systems over independent judgement — occurs in naive and expert participants alike. Parasuraman and Manzey’s foundational 2010 review in Human Factors found that it cannot be prevented by general training or instructions. The one method shown to reduce it is direct exposure to automation failures — workers learning, through experience, that the system gets things wrong. The implication for AI is that experiencing the tool’s failures may matter more than being taught about them.

Accountability

Philip Tetlock’s research on expert judgement distinguishes between two kinds of accountability: outcome accountability, where someone is judged on the result, and process accountability, where they are expected to justify how they arrived at it. When the source of the output appears credible, outcome accountability can produce strategic deference: workers reason that the AI is more likely to be right than their modifications, and leave the output untouched. Process accountability, which means having to show the reasoning and not just the conclusion, is what reliably activates effortful deliberation.

Conditions

Shaw and Nave tested several conditions that might shift cognitive surrender. Time pressure made it worse. Real-time feedback combined with performance incentives reduced it — when participants had reason to care about accuracy and could see how they were doing, they were more likely to override faulty AI.

What this means for organisations

The research on automation bias, expertise, and AI use has more to say about what causes cognitive surrender than about what reliably prevents it in the workplace. Most of the evidence cited above comes from controlled experiments and reviews. Three directions for translating it into workplace practice are nonetheless worth considering.

Training design. Workers scrutinise AI less when they are confident in its capability, particularly on tasks where its output sounds fluent. General training has not been shown to reduce automation bias; only direct exposure to automation failures has. It is plausible that the same logic transfers to AI, but this has not yet been directly demonstrated. The implication for training is that running workers through tasks where the AI is known to be wrong may matter more than instructing them in good practice. The aim is to make workers’ confidence track the AI’s actual accuracy.
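One way to check whether confidence is tracking accuracy is to compare the two directly on a set of exercises where the right answer is known. The sketch below is a minimal illustration in Python, not a method taken from any of the studies cited here. The record format, the field names and the example figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Judgement:
    """One logged decision from a training exercise (hypothetical format)."""
    confidence_in_ai: float  # worker's stated confidence that the AI is right, 0 to 1
    ai_was_correct: bool     # known in advance for a seeded training task


def calibration_gap(log: list[Judgement]) -> float:
    """Mean stated confidence minus the AI's observed accuracy.

    Positive values mean confidence exceeds accuracy (over-trust);
    negative values mean confidence falls short of accuracy.
    """
    if not log:
        raise ValueError("log must contain at least one judgement")
    mean_confidence = sum(j.confidence_in_ai for j in log) / len(log)
    accuracy = sum(j.ai_was_correct for j in log) / len(log)
    return mean_confidence - accuracy


if __name__ == "__main__":
    # Made-up example: the AI is right on half the tasks, but the worker
    # reports high confidence throughout.
    example_log = [
        Judgement(0.9, True),
        Judgement(0.9, False),
        Judgement(0.8, True),
        Judgement(0.8, False),
    ]
    print(f"Calibration gap: {calibration_gap(example_log):+.2f}")
```

A single average like this is deliberately crude. It shows whether a worker trusts the AI more than its record warrants, but not whether they can tell its good answers from its bad ones, which would need confidence to be compared separately for the correct and incorrect cases.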

Reviewing reasoning. Where review processes focus only on the final output, the reviewer may not see the points at which the worker deferred to AI rather than tested its output. Asking workers to articulate what the AI suggested, what they changed, and why, may surface the moments where deliberation broke down. This is closer to Tetlock’s process accountability than to outcome review — though how much it changes behaviour in real workflows is an open question.

Workflow design. Time pressure without feedback made cognitive surrender worse in Shaw and Nave’s experiments. The workplace equivalent of real-time feedback is rarely available, but where errors in AI-assisted work do surface — through downstream review, customer complaints, audit findings — getting that information back to the people who made the decisions, quickly, is the closest workplace analogue to what reduced cognitive surrender in the experiments.

The bottom line

None of this is an argument against AI-assisted work. The same paper that documents cognitive surrender also finds substantial performance gains when AI is accurate, and those gains hold under time pressure.

The research makes a narrower case: that AI changes how professional judgement is exercised, in ways the worker may not notice, and that keeping judgement engaged requires specific conditions.

AI does not signal when it is wrong. Working out how to compensate for that is a hard question, and not one with established answers.

References

Bainbridge, L. (1983). Ironies of automation. Automatica, 19(6), 775–779.

Brady, O., Nulty, P., Zhang, L., Ward, T.E. & McGovern, D.P. (2025). Dual-process theory and decision-making in large language models. Nature Reviews Psychology, 4, 777–792. https://www.nature.com/articles/s44159-025-00506-1

Ericsson, K.A., Krampe, R.T. & Tesch-Römer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review, 100(3), 363–406.

Lee, H-P., Sarkar, A., Tankelevitch, L., Drosos, I., Rintel, S., Banks, R. & Wilson, N. (2025). The impact of generative AI on critical thinking: self-reported reductions in cognitive effort and confidence effects from a survey of knowledge workers. Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3706598.3713778

Parasuraman, R. & Manzey, D.H. (2010). Complacency and bias in human use of automation: an attentional integration. Human Factors, 52(3), 381–410.

Shaw, S.D. & Nave, G. (2026). Thinking — Fast, Slow, and Artificial: how AI is reshaping human reasoning and the rise of cognitive surrender. Wharton School Research Paper. https://ssrn.com/abstract=6097646

Tetlock, P.E. (1983). Accountability and the perseverance of first impressions. Social Psychology Quarterly, 46(4), 285–292.

Vicente, L. & Matute, H. (2023). Humans inherit artificial intelligence biases. Scientific Reports, 13, 15737. https://www.nature.com/articles/s41598-023-42384-8

Wegner, D.M. (1987). Transactive memory: a contemporary analysis of the group mind. In Mullen, B. & Goethals, G.R. (Eds.), Theories of Group Behavior.