
The 7 failure modes

Through analysis of documented chatbot failures, customer complaints, and industry research, chatbot-auditor identifies seven specific, detectable failure modes that account for the vast majority of AI customer-service failures.

Each mode is:

  • Named — so teams can discuss and track it
  • Defined — with concrete signals, not vague intuitions
  • Detectable — implementable in code, testable against data
  • Actionable — with a specific recommended response

Death Loop

The customer asks a question, the bot gives an answer, the customer rephrases, the bot gives the same answer. This repeats 3–5 times until the customer gives up. The bot records the conversation as "resolved."

Detection. Pairwise similarity between bot responses exceeds a threshold (default 0.85) for at least min_repeats (default 3) messages. Grouped by connected components so interleaved variants still unify into one loop.
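That grouping can be sketched in plain Python. This is an illustrative stand-in, not chatbot-auditor's implementation: the function name is hypothetical, and difflib's character-level ratio substitutes for whatever similarity measure the real detector uses.

```python
from difflib import SequenceMatcher

def detect_death_loop(bot_messages, threshold=0.85, min_repeats=3):
    """Group near-duplicate bot responses into loops via connected components."""
    n = len(bot_messages)
    # Edge between any two responses whose similarity clears the threshold.
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if SequenceMatcher(None, bot_messages[i], bot_messages[j]).ratio() >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    # Walk components so interleaved variants still unify into a single loop.
    seen, loops = set(), []
    for start in range(n):
        if start in seen or not adj[start]:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        if len(comp) >= min_repeats:
            loops.append(sorted(comp))
    return loops
```

A result of `[[0, 2, 3]]` means the first, third, and fourth bot messages are effectively the same answer, even with an unrelated reply interleaved between them.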

Action. escalate — immediately route to a human agent with full conversation context.

Note. The most common and most detectable failure mode. Estimated to account for 30–40% of all chatbot failures.


Silent Churn

The conversation ends without resolution but is marked as "resolved" by the bot. The customer didn't complain, didn't ask for escalation, didn't get angry — they just left. This is the most dangerous failure mode because it's completely invisible to the company.

Detection. The conversation has substantive user activity (≥2 user messages), the bot responded (≥2 bot messages), no user message contained a confirmation keyword (thanks, got it, perfect, …), and the conversation ended. Severity is raised when reported_resolved=True.
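Because every signal here is a simple count or keyword check, the whole rule fits in a few lines. A minimal sketch, with an illustrative (and deliberately short) confirmation-keyword list and a hypothetical function name:

```python
# Illustrative subset; the real keyword list is longer.
CONFIRMATION_KEYWORDS = ("thanks", "thank you", "got it", "perfect")

def detect_silent_churn(user_msgs, bot_msgs, reported_resolved=False):
    """Flag conversations that ended with activity but no confirmation."""
    substantive = len(user_msgs) >= 2 and len(bot_msgs) >= 2
    confirmed = any(kw in m.lower() for m in user_msgs for kw in CONFIRMATION_KEYWORDS)
    if substantive and not confirmed:
        # A bot that also claimed the ticket was resolved makes this worse.
        return {"detected": True,
                "severity": "high" if reported_resolved else "medium"}
    return {"detected": False}
```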

Action. follow_up — trigger an automated re-engagement message.


Escalation Burial

The customer explicitly asks for a human agent, but the bot deflects them back into the automated flow. "Let me see if I can help with that" instead of transferring. Some systems bury the human option behind 3–4 layers of prompts.

Detection. User message contains explicit escalation language (speak to a human, real person, manager, …). The next bot or agent message is checked — an agent message or a transfer-confirmation phrase counts as handled; anything else is a burial. Multiple burials in one conversation raise severity to critical.
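The "next message decides" logic can be sketched as follows. The regexes are small illustrative subsets, and the `(role, text)` message shape is an assumption, not chatbot-auditor's actual conversation model:

```python
import re

ESCALATION = re.compile(r"speak to a human|real person|manager|human agent", re.I)
TRANSFER_OK = re.compile(r"transferring you|connecting you to an agent", re.I)

def detect_burials(messages):
    """messages: list of (role, text); role in {'user', 'bot', 'agent'}."""
    burials = []
    for i, (role, text) in enumerate(messages):
        if role != "user" or not ESCALATION.search(text):
            continue
        # Look at the next non-user message: an agent reply or a
        # transfer confirmation counts as handled; anything else is a burial.
        nxt = next(((r, t) for r, t in messages[i + 1:] if r != "user"), None)
        handled = nxt is not None and (nxt[0] == "agent" or TRANSFER_OK.search(nxt[1]))
        if not handled:
            burials.append(i)
    severity = "critical" if len(burials) > 1 else ("high" if burials else None)
    return burials, severity
```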

Action. override — the highest-urgency action because the customer already asked for help and is being actively prevented from getting it.


Sentiment Collapse

Customer frustration escalates through the conversation. Messages get shorter, more emotional, contain explicit frustration markers. The bot continues with cheerful, scripted responses, oblivious to the emotional state.

Detection. Each user message is scored on a [-1.0, 1.0] sentiment axis. The average of the first third is compared to the average of the last third — a drop of at least 0.4, combined with a late-third average below -0.2, triggers a detection. Confidence scales with the magnitude of the drop.
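The thirds comparison is simple enough to show directly. A minimal sketch, assuming per-message sentiment scores have already been computed; the function name, return shape, and confidence formula are illustrative:

```python
def detect_sentiment_collapse(scores, min_drop=0.4, late_floor=-0.2):
    """scores: one sentiment value in [-1.0, 1.0] per user message, in order."""
    if len(scores) < 3:
        return None
    k = max(1, len(scores) // 3)
    early_avg = sum(scores[:k]) / k
    late_avg = sum(scores[-k:]) / k
    drop = early_avg - late_avg
    if drop >= min_drop and late_avg < late_floor:
        # Confidence scales with how far the drop exceeds the threshold.
        return {"drop": drop, "confidence": min(1.0, drop / (2 * min_drop))}
    return None
```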

Action. escalate — catch the emotional decline BEFORE the customer explicitly asks for a human.

Note. Uses keyword-based sentiment by default (zero dependencies). A transformer or LLM sentiment scorer can be injected for subtler detection.


Confident Lies

The bot makes commitments outside company policy — promising refunds, discounts, features, or timelines that don't exist.

Real-world examples.

  • Air Canada was legally ordered to honor a refund its chatbot fabricated.
  • Cursor's support bot invented a cancellation policy that triggered mass cancellations before anyone noticed.

Detection. Regex patterns match future-tense commitments (I'll process, you will receive, we guarantee, within X days, free of charge, I've cancelled). With a PolicyBase, each commitment is classified as allowed, disallowed, or unverifiable. Without one, all commitments are flagged for review.
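A pattern-based pass over a single bot message might look like the sketch below. The pattern list is a small illustrative subset, and the `policy.classify` interface is an assumption about what a PolicyBase could expose, not its documented API:

```python
import re

# Illustrative subset of the commitment patterns described above.
COMMITMENT_PATTERNS = [
    re.compile(r"\bI'?ll process\b", re.I),
    re.compile(r"\byou will receive\b", re.I),
    re.compile(r"\bwe guarantee\b", re.I),
    re.compile(r"\bwithin \d+ (business )?days\b", re.I),
    re.compile(r"\bfree of charge\b", re.I),
    re.compile(r"\bI'?ve cancell?ed\b", re.I),
]

def find_commitments(bot_message, policy=None):
    """policy: optional object with .classify(text) ->
    'allowed' | 'disallowed' | 'unverifiable' (hypothetical interface)."""
    hits = []
    for pat in COMMITMENT_PATTERNS:
        m = pat.search(bot_message)
        if m:
            # Without a policy base, every commitment goes to human review.
            status = policy.classify(m.group(0)) if policy else "review"
            hits.append((m.group(0), status))
    return hits
```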

Action. flag — human review before the commitment is honored.

Note. The highest-LIABILITY failure mode.


Brand Damage

The bot says something that could go viral, embarrass the company, or violate brand guidelines.

Real-world example. DPD's chatbot swore at a customer and wrote a poem about how terrible the company was. The screenshots were viewed over a million times on social media.

Detection. The default pattern-based checker scans for four categories:

  • Profanity — explicit language
  • Self-deprecation — bot badmouths the company
  • Competitor endorsement — bot praises or recommends named competitors
  • Off-brand content — jokes, poems, opinions, politics
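The four categories map naturally onto a dict of compiled patterns. Everything below is a placeholder example, including the competitor name AcmeRival; the shipped checker's pattern lists are richer:

```python
import re

# Placeholder patterns, one per category described above.
BRAND_PATTERNS = {
    "profanity": re.compile(r"\b(damn|hell|crap)\b", re.I),
    "self_deprecation": re.compile(
        r"\b(we|our company) (are|is) (terrible|useless|the worst)\b", re.I),
    "competitor_endorsement": re.compile(
        r"\b(try|recommend|switch to) AcmeRival\b", re.I),
    "off_brand": re.compile(r"\b(here'?s a poem|in my opinion|vote for)\b", re.I),
}

def brand_check(bot_message):
    """Return the list of categories the message trips."""
    return [cat for cat, pat in BRAND_PATTERNS.items() if pat.search(bot_message)]
```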

Action. alert — notify brand and support leadership immediately.

Note. Low frequency but catastrophic impact. A custom safety checker (e.g. LLM-based) can be injected for deeper analysis.


Confident Misinformation

The bot states facts about the company's products, pricing, policies, or availability that are wrong. Not hallucinated promises (Mode 5) but factual errors — wrong prices, discontinued products listed as available, incorrect store hours, outdated policies.

Detection. Regex patterns detect factual-claim language (pricing, hours, availability, policy windows). With a FactBase, each claim is cross-checked against ground truth — a statement that touches a known topic but doesn't include the truth is a contradiction. Without one, all confident factual claims are flagged for review.
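That cross-check can be sketched with a plain dict standing in for a FactBase. The claim regex is a tiny illustrative subset, and the topic-substring matching is a simplification of however the real detector links claims to ground truth:

```python
import re

# Illustrative subset of factual-claim language.
CLAIM_PATTERN = re.compile(
    r"(costs? \$\d+(\.\d{2})?|in stock|\d+-day (return|refund))", re.I)

def check_claims(bot_message, facts=None):
    """facts: optional dict of topic -> ground-truth string (FactBase stand-in)."""
    claims = [m.group(0) for m in CLAIM_PATTERN.finditer(bot_message)]
    if facts is None:
        # No ground truth: every confident claim goes to human review.
        return [(c, "review") for c in claims]
    results = []
    for claim in claims:
        status = "review"
        for topic, truth in facts.items():
            # Touches a known topic but omits the truth -> contradiction.
            if topic.lower() in claim.lower():
                status = "ok" if truth.lower() in claim.lower() else "contradiction"
        results.append((claim, status))
    return results
```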

Action. flag — human review with specific discrepancy highlighted.

Note. Most valuable for e-commerce and companies with frequently changing information.


How they co-occur

In real chatbot conversations, failure modes rarely appear alone. A death loop typically ends in silent churn. Sentiment collapse often precedes an escalation burial. chatbot-auditor returns one detection per (conversation, detector) pair — a single conversation may produce several detections.
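The per-(conversation, detector) record shape makes compounding easy to see when results are grouped by conversation. Hypothetical data, illustrative record shape:

```python
from collections import defaultdict

# Hypothetical detection records: (conversation_id, detector, severity).
detections = [
    ("conv-1", "death_loop", "high"),
    ("conv-1", "silent_churn", "medium"),
    ("conv-2", "sentiment_collapse", "high"),
    ("conv-2", "escalation_burial", "critical"),
]

by_conversation = defaultdict(list)
for conv_id, detector, _severity in detections:
    by_conversation[conv_id].append(detector)

# conv-1 compounds a death loop into silent churn; conv-2 pairs a
# sentiment collapse with an escalation burial.
```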

This is not noise. It's an accurate picture of how chatbot failures compound.

