
After more than two decades working alongside HR teams, we have seen genuine commitment to diversity, equity and inclusion. We have seen organisations invest in training, update their policies, and communicate their commitments openly — yet too often the work fails to translate into tangible differences for the people it is meant to benefit.
The problem is not a lack of commitment. It is that the interventions being used are not the most effective ones available — and the evidence on what works is clearer than many realise.
Unconscious bias training has become one of the most widely used DEI interventions. According to the 2024 Hays Diversity, Equity and Inclusion Report, 54% of UK organisations now provide it to hiring managers and interviewers — up from 34% in 2019.
It is also, according to the best available evidence, among the least effective approaches for producing meaningful change in workplace behaviour.
A UK government review by the Behavioural Insights Team found that there is currently no evidence that unconscious bias training changes behaviour or improves workplace equality, whether measured by the representation of minorities in leadership positions or by reductions in pay inequality. That the intervention is growing in uptake while the evidence for its effectiveness remains this weak is precisely the problem.
A recent meta-analysis by Elaine Costa, published in the Journal of Applied Psychology, examined 70 peer-reviewed studies covering workplace interventions and intergroup bias — producing 208 separate effect sizes for analysis. The findings have direct implications for how organisations choose and evaluate their interventions.
Costa categorised interventions into four types. Three of these are active — they directly target specific dimensions of bias (our thoughts, our feelings, or our behaviours). The fourth, which she called 'educating about bias processes', covers initiatives that raise awareness and provide information about bias without directly targeting any specific dimension. Unconscious bias training, in its most common form as a standalone awareness intervention, falls squarely in this fourth category.
The analysis found that educating about bias is consistently less effective than active interventions across all outcome measures. It produces small effects at best. When used as the primary or sole intervention — which it frequently is — it is unlikely to produce meaningful change in the behaviours that matter most.
This does not mean awareness-raising has no value. It means it is not sufficient on its own, and in many organisations it is being asked to do heavy lifting it was never designed for.
Costa's analysis identified a clear principle: interventions work best when they target the same dimension of bias they are trying to change. Cognitive bias (stereotypes) responds to cognitive interventions. Affective bias (prejudice, negative feelings toward a group) responds to affective interventions. Behavioural discrimination responds to behavioural interventions. Misalignment between what an intervention targets and what you are measuring produces weaker effects — a finding that has significant implications for how organisations design and evaluate their DEI programmes.
Three approaches stand out from the evidence.
Of all the interventions Costa examined, accountability produced the best overall results for reducing discrimination in the workplace — the behavioural manifestation of bias that causes the most direct harm. Accountability mechanisms require evaluators to explain or justify their ratings, decisions, and conclusions. When people know they will need to account for their choices, they are significantly less likely to act on stereotypes or biases.
In practice, this means building justification into evaluation processes as standard — not as an exception or an audit mechanism. Hiring panels that document their reasoning. Promotion decisions that require written rationale. Pay review processes with structured sign-off. These are not bureaucratic additions; they are interventions with a strong evidence base.
Within the category of interventions targeting stereotypes and cognitive bias, structured evaluations were the most effective approach. Structured evaluation means using consistent criteria and anchored rating scales across all candidates or employees being assessed — reducing the ambiguity that allows stereotypes and heuristics to fill the gap. CV anonymisation is a simpler variant of the same principle.
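As a concrete illustration, a structured evaluation can be as simple as a fixed rubric applied identically to every candidate. The sketch below is hypothetical (the criteria, anchors, and function names are invented for illustration), but it captures the principle: same criteria, same anchored scale, no ad-hoc additions.

```python
# Hypothetical structured-evaluation rubric: every candidate is rated on the
# same criteria, using the same anchored 1-5 scale, scored per criterion.
RUBRIC = {
    "technical_skills": "1 = no evidence, 5 = strong, specific evidence",
    "communication":    "1 = no evidence, 5 = strong, specific evidence",
    "leadership":       "1 = no evidence, 5 = strong, specific evidence",
}

def score_candidate(ratings: dict) -> float:
    """Average score across rubric criteria; rejects missing or extra criteria."""
    if set(ratings) != set(RUBRIC):
        raise ValueError("Every candidate must be rated on the same criteria")
    if not all(1 <= v <= 5 for v in ratings.values()):
        raise ValueError("Ratings must use the anchored 1-5 scale")
    return sum(ratings.values()) / len(ratings)

print(score_candidate({"technical_skills": 4, "communication": 3, "leadership": 5}))  # 4.0
```

The point of the validation is the structure itself: an evaluator cannot quietly drop a criterion for one candidate or invent a new one for another, which is exactly the ambiguity that lets stereotypes fill the gap.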
For reducing prejudice — the affective or emotional dimension of bias — the most effective intervention in Costa's analysis was imagined contact. This approach asks evaluators to envision a positive interaction with a member of a target group before they make an assessment. It sounds almost too simple, and it is certainly less visible than a training programme. But the evidence for its effect on reducing prejudice is robust.
Perspective-taking exercises fall in the same category. These are low-cost, low-visibility interventions that have stronger evidence behind them than much of what organisations currently invest in.
The Costa framework does more than identify which interventions to prioritise — it tells you how to evaluate whether they are working. Because each intervention category targets a specific dimension of bias, meaningful evaluation requires measuring that same dimension. This has a direct implication worth stating plainly: a survey will not always be the right evaluation tool.
Accountability mechanisms are designed to reduce discriminatory behaviour in decisions. Behavioural change of this kind is often better evaluated through audit — examining promotion rates, pay review outcomes, or shortlisting data — than through a survey. Structured evaluation targets stereotypes in the cognitive dimension, and here a survey can be useful, but only if it asks specifically about how consistently and fairly evaluation processes are experienced in practice, not just whether employees feel included overall. Affective interventions like imagined contact are designed to reduce prejudice — and for these, attitudinal survey measures are appropriate, provided they are designed to capture the affective dimension rather than behavioural intentions or overall satisfaction.
The practical implication is that designing your evaluation approach means starting with the question 'what dimension are we trying to shift?' and working forward from there — rather than defaulting to an annual employee survey and hoping it captures what you need to know.
When a survey is the right tool, two things determine whether it will tell you anything useful. The first is whether the questions are targeted at the right dimensions — measuring experienced fairness of process, consistency of evaluation, and psychological safety rather than global inclusion sentiment. The second is whether you can disaggregate the data. Inclusion is not experienced uniformly across an organisation, and an overall score may look reassuring while masking significant variation between groups. This means collecting protected characteristic data is not optional if DEI measurement is to mean anything.
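To make the masking effect concrete, here is a minimal sketch using made-up scores (the groups and numbers are purely illustrative): the overall mean looks healthy, while disaggregation reveals that one group's experience lags well behind.

```python
# Illustrative only: invented inclusion scores (1-5 scale) by group.
# A reassuring overall average can hide a significant gap between groups.
scores = {
    "Group A": [4.2, 4.5, 4.1, 4.4, 4.3, 4.6, 4.2, 4.4],  # larger group
    "Group B": [3.1, 2.9, 3.3, 3.0],                       # underrepresented group
}

all_scores = [s for group_scores in scores.values() for s in group_scores]
overall = sum(all_scores) / len(all_scores)
print(f"Overall inclusion score: {overall:.2f}")

for group, vals in scores.items():
    print(f"{group}: {sum(vals) / len(vals):.2f} (n={len(vals)})")
```

Because the larger group dominates the average, the overall score of 3.92 sits comfortably above the smaller group's 3.08 — which is why disaggregation, not the headline number, is what makes the survey informative.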
This is where organisations encounter a practical difficulty that anyone running surveys in this space will recognise. Non-disclosure rates on demographic questions can run as high as 30% in some surveys. The temptation is to treat this as an administrative inconvenience, but it is more usefully read as a signal. Employees who do not trust how their data will be used, or who doubt that disclosure will lead to meaningful action, are disproportionately likely to select 'prefer not to say'. The groups whose experience most needs to be understood may be the least visible in your analysis — and a high non-disclosure rate is itself an inclusion finding, worth investigating rather than working around.
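The signal itself is trivial to compute but easy to overlook. A hedged sketch with invented responses:

```python
# Hypothetical responses to a single demographic survey question.
responses = [
    "Group A", "Group B", "prefer not to say", "Group A", "prefer not to say",
    "Group B", "Group A", "prefer not to say", "Group A", "Group B",
]

# The non-disclosure rate is a finding in its own right, not just missing data.
non_disclosure = responses.count("prefer not to say") / len(responses)
print(f"Non-disclosure rate: {non_disclosure:.0%}")  # Non-disclosure rate: 30%
```

Tracking this rate over time, and comparing it across teams or locations where that is possible without identifying individuals, turns an administrative nuisance into a measure of trust.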
Building trust in the data collection process — being explicit about how data will be stored, who will see it, at what level it will be reported, and what it will be used for — is not a survey administration detail. It is part of the intervention.
If you are reviewing your DEI intervention mix — or evaluating a provider's approach — we would suggest asking two questions: which dimension of bias are you actually targeting, and is your measurement capable of detecting change in that dimension?
Stereotypes, prejudice, and discrimination are different things and respond to different interventions. An organisation with a gender pay gap problem is primarily dealing with a behavioural issue in evaluation and progression decisions — accountability and structured evaluation are the right tools. An organisation with low belonging scores among ethnic minority employees may have a more affective challenge that requires different approaches. Being clear about which dimension you are targeting matters.
If your primary DEI metric is attendance at training, you are measuring input, not outcome. This is more common than it should be — not because organisations are complacent, but because outcomes are harder to measure and take longer to show up.
As the Costa framework makes clear, the right evaluation approach depends on what you are trying to shift. If your focus is on reducing discriminatory behaviour in hiring and progression decisions, the most meaningful evidence will come from auditing outcomes — tracking whether shortlisting rates, promotion rates, and pay review decisions are changing over time across different groups. A survey will not tell you this.
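An audit of this kind need not be elaborate. The sketch below uses hypothetical records (the field names and data are invented) to show the core calculation: promotion rate per group among those eligible, which is the comparison a survey cannot give you.

```python
from collections import defaultdict

# Hypothetical audit data: one record per employee eligible for promotion
# in a review cycle, with demographic group and outcome.
records = [
    {"group": "A", "promoted": True},
    {"group": "A", "promoted": True},
    {"group": "A", "promoted": False},
    {"group": "A", "promoted": True},
    {"group": "B", "promoted": False},
    {"group": "B", "promoted": True},
    {"group": "B", "promoted": False},
    {"group": "B", "promoted": False},
]

def promotion_rates(rows):
    """Promotion rate per group: promoted / eligible."""
    eligible = defaultdict(int)
    promoted = defaultdict(int)
    for r in rows:
        eligible[r["group"]] += 1
        if r["promoted"]:
            promoted[r["group"]] += 1
    return {g: promoted[g] / eligible[g] for g in eligible}

for group, rate in sorted(promotion_rates(records).items()):
    print(f"Group {group}: {rate:.0%} promoted")
```

Run over successive review cycles, the same calculation shows whether an accountability intervention is actually narrowing the gap — the behavioural outcome the intervention targets.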
If your focus is on reducing stereotyping in evaluation processes, a survey can help — but only if it asks specifically about the consistency and fairness of those processes as people experience them, not just whether they feel included in general.
If your focus is on reducing prejudice through affective interventions, attitudinal measures are appropriate — but they need to be designed to capture how people feel toward different groups, not just their overall satisfaction with the workplace.
The honest question to ask is whether your current measurement approach is actually capable of detecting the change you are trying to produce — or whether it would show positive results regardless of whether anything had really shifted.
DEI is not an area where good intentions are enough. The evidence on what works is increasingly clear — and increasingly at odds with how most organisations are spending their DEI budget. Shifting investment from passive awareness-raising toward accountability mechanisms, structured evaluation, and targeted affective interventions will not make for an easy sell internally. But it is what the evidence supports.
Evaluation is the bridge between good intentions and genuine change. Knowing which tool is right for the job — and designing it to surface the signals that actually matter — is where much of the real work lies.
If you are reviewing your approach to DEI measurement, or want to understand how employee survey design can better surface the experiences of underrepresented groups, we would be glad to talk. Contact us for a no-obligation conversation about how employee surveys can help you develop a workplace where people and performance grow together.