Human-AI Interaction & Decision Quality

How effectively are humans and AI actually working together in your context? Benchmark your decision acceptance rates, automation bias exposure, and collaboration quality against global research from McKinsey, BCG, Stanford HAI, and MIT — adjusted for your industry, role, and decision type.

  • Decision Acceptance Rate
  • Automation Bias Index: accepted without review
  • AI Trust Score: composite /100 (Edelman baseline 53)
  • Augmentation Preference: prefer AI as collaborator vs autonomous
  • Decision Quality Lift: vs solo human baseline (McKinsey)
  • Time-to-Decision: faster with AI assistance

Decision Acceptance Funnel

Journey from AI recommendation → meaningful review → accepted → acted on.

Funnel stages: AI recommendations generated; meaningfully reviewed; accepted & acted on; automation bias (auto-accepted).

Automation vs Augmentation Split

Optimal balance for this decision type. The centaur model (human+AI) outperforms either alone.

Split shown: Full Automation vs Human-AI Augmentation.

Decision Quality Dimensions

Five-dimension quality profile vs industry benchmark. Scores reflect improvement vs solo-human baseline.

Series shown: your profile and the industry benchmark.

Acceptance Rate by Industry (BCG / Stanford HAI 2023 · optimal zone 65–85%)

What each metric measures, what the research says, and how to improve your collaboration posture.

Decision Acceptance Rates & Automation Bias

What It Measures

Decision Acceptance Rate tracks the proportion of AI recommendations that humans act on. Automation Bias Index measures how many of those acceptances occurred without meaningful human review — the silent governance failure that most organisations have not yet instrumented. The EEOC and EU AI Act both require documented human oversight for high-risk decisions; automation bias is evidence that oversight is nominal rather than real.

Global Benchmarks
  • Healthcare: 79% acceptance, 31% automation bias — radiologists accepting AI diagnostic flags without independent verification (MIT CSAIL 2023)
  • Financial Services: 68% acceptance, 42% automation bias — highest bias rate across sectors; credit officers approving AI-scored applications at volume without case review (BCG 2023)
  • Legal: 61% acceptance, 23% bias — most conservative sector; liability exposure drives genuine review (Stanford HAI)
  • Optimal zone: 65–85% acceptance with <25% automation bias. Below 50% = undertrust; above 85% = over-reliance
  • DARPA XAI finding: providing explanations with AI recommendations reduces automation bias by 18% — the single most effective intervention
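
The zone thresholds above can be written as a small check. This is a minimal sketch rather than a published scoring method: the function name `classify_posture` and the label strings are hypothetical, and because the benchmarks do not name the band between 50% and 65% acceptance, it is reported here as "below optimal zone" as an assumption.

```python
def classify_posture(acceptance_pct: float, automation_bias_pct: float) -> dict:
    """Compare one team's rates with the zones quoted in the benchmarks."""
    if acceptance_pct < 50:
        acceptance = "undertrust"
    elif acceptance_pct < 65:
        # Assumption: the source does not label the 50-65% band.
        acceptance = "below optimal zone"
    elif acceptance_pct <= 85:
        acceptance = "optimal zone"
    else:
        acceptance = "over-reliance"

    # The benchmark target is an automation bias below 25%.
    bias = "within target" if automation_bias_pct < 25 else "above target"
    return {"acceptance": acceptance, "automation_bias": bias}
```

With the Financial Services figures quoted above, `classify_posture(68, 42)` places acceptance in the optimal zone but flags automation bias as above target.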

How to Improve
  1. Instrument your AI systems to log whether humans accessed the explanation before accepting; the share of acceptances made without opening it is your automation bias rate
  2. Add mandatory explanation display before acceptance for P1/P0 decisions — interface friction that requires acknowledgement, not just click-through
  3. Set a review SLA: for high-stakes decisions, require a logged time-on-task of more than 60 seconds before acceptance
  4. Report acceptance rates by team to leadership monthly — the act of measurement alone reduces automation bias by 12% (MIT Sloan 2022)
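
Steps 1 and 3 could be computed from decision logs roughly as follows. This is a sketch under stated assumptions: the `DecisionEvent` record and its field names are hypothetical, and counting an acceptance as unreviewed when either the explanation was not opened or time-on-task was 60 seconds or less is one possible reading of the steps above, not a definition taken from the cited research.

```python
from dataclasses import dataclass

MIN_REVIEW_SECONDS = 60  # review SLA threshold from step 3


@dataclass
class DecisionEvent:
    accepted: bool            # the human accepted the AI recommendation
    explanation_viewed: bool  # the explanation was opened before deciding
    seconds_on_task: float    # logged time between display and decision


def acceptance_rate(events: list[DecisionEvent]) -> float:
    """Share of AI recommendations that were accepted."""
    if not events:
        return 0.0
    return sum(e.accepted for e in events) / len(events)


def automation_bias_rate(events: list[DecisionEvent]) -> float:
    """Share of acceptances made without meaningful review (assumed definition)."""
    accepted = [e for e in events if e.accepted]
    if not accepted:
        return 0.0
    unreviewed = [
        e for e in accepted
        if not e.explanation_viewed or e.seconds_on_task <= MIN_REVIEW_SECONDS
    ]
    return len(unreviewed) / len(accepted)
```

Grouping the events by team before calling these functions would give the per-team figures that step 4 asks to report monthly.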

Automation vs Augmentation Spectrum

What It Measures

The automation vs augmentation split defines how AI is deployed across a decision portfolio. Automation means AI decides and acts without human involvement. Augmentation (the "centaur model") means AI advises, humans decide. The optimal split is not fixed — it varies critically by decision type, reversibility, regulatory context, and cognitive stakes. Getting this wrong in either direction destroys value: over-automation creates liability and error propagation; under-automation wastes the tool.

Global Benchmarks
  • Centaur model outperformance: Human+AI teams beat solo AI by 23% and solo humans by 31% on complex decisions — BCG 2023 study of 12,000 knowledge workers
  • Routine decisions: 65% automation / 35% augmentation optimal — McKinsey Global Institute 2023
  • Complex decisions: 25% automation / 75% augmentation — Stanford HAI recommendation
  • High-stakes decisions: 8% automation / 92% augmentation; fully automating a high-stakes decision runs counter to the human-oversight practices described in the NIST AI RMF
  • Augmentation preference: 71% of knowledge workers prefer AI as thought-partner (McKinsey 2023); this rises to 84% for healthcare workers and 88% for legal professionals

How to Calibrate
  1. Map every AI deployment to one of three tiers: Automate (routine, reversible, low-stakes), Augment (complex, consequential, regulated), Advise-only (irreversible, high-liability, safety-critical)
  2. Calculate your Return on Employee (RoE): measure hours freed from automated tasks + decision quality improvement per person — this is the centaur dividend
  3. Resist the automation bias in system design — the default should be augmentation, with automation requiring explicit justification and governance sign-off
  4. Survey team augmentation preference quarterly — low preference scores predict adoption failure before it happens
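
The three-tier mapping in step 1, with augmentation as the default from step 3, might be sketched like this. The `Deployment` fields and the rule that any single Advise-only trait is enough to select that tier are assumptions made for illustration; the source lists the traits for each tier but does not say how to combine them.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    name: str
    routine: bool = False
    reversible: bool = True
    low_stakes: bool = False
    high_liability: bool = False
    safety_critical: bool = False


def assign_tier(d: Deployment) -> str:
    """Map a deployment to Automate, Augment, or Advise-only."""
    # Assumption: any one of these traits is enough to require Advise-only.
    if not d.reversible or d.high_liability or d.safety_critical:
        return "Advise-only"
    # Automation needs explicit justification: routine, reversible, low-stakes.
    if d.routine and d.low_stakes:
        return "Automate"
    # Everything else defaults to augmentation.
    return "Augment"
```

For example, a routine, reversible, low-stakes task maps to "Automate", while a deployment with none of the flags set falls through to the "Augment" default.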

Decision Quality & Cognitive Load

What It Measures

Decision quality in human-AI systems is multi-dimensional: accuracy (correctness), speed (time-to-decision), consistency (same decision in the same context), error rate (critical failures), and cognitive load (mental effort required). The HAIS (Human-AI Integration Scale), developed by researchers at MIT and Northeastern, provides a validated 25-item instrument for measuring how well AI integration serves human cognition rather than taxing it.

Global Benchmarks
  • Accuracy lift: +18% average improvement in AI-assisted vs solo human decisions; up to +22% in healthcare (BCG/MIT 2023)
  • Error reduction: −37% critical errors in healthcare AI with human review; −22% in financial services (Stanford HAI 2023)
  • Time-to-decision: −28% faster on average; routine decisions faster by 45%; high-stakes decisions faster by only 12% (appropriate caution)
  • Decision consistency: +31% improvement — AI dramatically reduces "decision fatigue" variance; humans make worse decisions in the afternoon, AI does not
  • Cognitive load: −24% reduction in perceived mental effort when AI provides structured options vs open-ended assistance (HAIS scale validation studies)
  • Trust Score (Edelman 2024): Global average 53/100; healthcare 62; financial services 49; legal 44

How to Measure
  1. Deploy the HAIS scale as a quarterly 25-item survey — it takes 8 minutes and produces a validated composite score across trust, transparency, control, and explainability dimensions
  2. Establish baseline accuracy and error rates before AI deployment — you cannot measure lift without a pre-AI baseline captured in the same period
  3. Track decision consistency using the same case presented to the same person twice over 4 weeks — the variance is your "human inconsistency baseline" that AI should reduce
  4. Monitor cognitive load as a leading indicator of adoption failure — high perceived effort predicts abandonment within 90 days, well before accuracy drops become visible
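
Steps 2 and 3 reduce to two simple calculations, sketched below. The function names are hypothetical, and treating "lift" as a relative change against the pre-AI baseline is an assumption: the benchmarks above do not state whether their percentages are relative changes or percentage-point differences.

```python
def inconsistency_rate(paired_decisions: list[tuple[str, str]]) -> float:
    """Share of repeated cases where the same person changed their decision.

    Each pair is (first_decision, repeat_decision) for one case shown twice.
    """
    if not paired_decisions:
        raise ValueError("need at least one repeated case")
    changed = sum(1 for first, repeat in paired_decisions if first != repeat)
    return changed / len(paired_decisions)


def relative_lift(baseline: float, with_ai: float) -> float:
    """Relative change of a metric against its pre-AI baseline (assumed definition)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (with_ai - baseline) / baseline
```

Running `inconsistency_rate` on pairs collected before deployment gives the human inconsistency baseline from step 3; running it again on AI-assisted pairs shows whether that variance has actually fallen.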

Get the Human-AI Collaboration Assessment Template

The 38-point assessment template used to evaluate human-AI interaction quality across your organisation — covering decision acceptance protocols, automation bias audit, augmentation framework design, and the HAIS survey instrument for measuring cognitive integration.

  • Decision acceptance rate tracker with automation bias audit (12 items)
  • Automation vs augmentation decision matrix for your use-case portfolio
  • HAIS instrument (25-item validated survey, 8 minutes)
  • Return on Employee (RoE) measurement framework for AI-assisted roles
