Knowledge bases¶
Two of the seven detectors — ConfidentLiesDetector and ConfidentMisinformationDetector — become dramatically sharper when given a knowledge base describing what the bot SHOULD be saying. Without one they still run and flag candidates for review; with one they can distinguish truth from fiction.
PolicyBase — for Confident Lies¶
A PolicyBase describes the commitments the bot is ALLOWED to make on behalf of the company, and the ones it explicitly is NOT allowed to make.
```python
from chatbot_auditor import PolicyBase, ConfidentLiesDetector

policy = PolicyBase.from_iterables(
    allowed=[
        "within 30 days",
        "full refund for damaged goods",
        "free shipping on orders over $50",
    ],
    disallowed=[
        "lifetime warranty",
        "price match",
        "same-day delivery guarantee",
    ],
)

detector = ConfidentLiesDetector(policy=policy)
```
What happens¶
For each bot message that contains a commitment pattern (I'll process, you will receive, within X days, …):
| Match | Outcome | Severity |
|---|---|---|
| Contains a `disallowed_commitment_topics` substring | Violation | critical |
| Contains an `allowed_commitments` substring | Verified — skipped | — |
| Neither matches | Unverifiable | high |
Without a PolicyBase, every commitment is reported as unverifiable at medium severity. That's useful as a sentinel — you can audit what the bot is promising without having a policy yet.
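The matching rules above can be sketched in plain Python. This is an illustrative re-implementation for clarity, not the library's actual code; the detector's real internals may differ:

```python
def classify_commitment(message, allowed=None, disallowed=None):
    """Sketch of the commitment-matching logic: disallowed substrings win,
    then allowed substrings, then the unverifiable fallback."""
    msg = message.lower()
    if allowed is None and disallowed is None:
        # No PolicyBase at all: every commitment is unverifiable at medium.
        return ("unverifiable", "medium")
    for phrase in (disallowed or []):
        if phrase.lower() in msg:
            return ("violation", "critical")
    for phrase in (allowed or []):
        if phrase.lower() in msg:
            return ("verified", None)  # verified commitments are skipped
    return ("unverifiable", "high")
```

Note that disallowed phrases are checked first, so a message matching both an allowed and a disallowed phrase is still a violation.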
Realistic policies¶
Policies are often more subtle than simple substring matches. For production use, a PolicyBase can be generated from:
- Your help center pages
- A compliance-approved FAQ
- An LLM summarizing your terms of service
- A vector store queried at detection time (advanced — you'd implement a custom detector that subclasses `ConfidentLiesDetector` and overrides the policy-checking logic)
Start simple: populate the obvious allowed/disallowed phrases, then iterate based on what the unverifiable flag catches.
FactBase — for Confident Misinformation¶
A FactBase is a mapping of topic → ground-truth fact. When a bot message mentions a known topic but doesn't include the correct fact, the detector flags a contradiction.
```python
from chatbot_auditor import FactBase, ConfidentMisinformationDetector

facts = FactBase(
    facts={
        "store_hours": "9 AM to 6 PM on weekdays",
        "return_window": "30 days",
        "shipping_cost": "$12 for standard",
        "premium_price": "$29.99 per month",
    },
    aliases={
        "store_hours": ("opening hours", "hours", "when you open"),
        "return_window": ("return policy", "refund window"),
    },
)

detector = ConfidentMisinformationDetector(facts=facts)
```
What happens¶
For each bot message:
| Condition | Outcome | Severity |
|---|---|---|
| Touches a known topic AND contains the ground truth | Verified — skipped | — |
| Touches a known topic but the ground truth is absent | Contradiction | critical |
| Matches a factual-claim pattern but no known topic | Unverifiable | high (with FactBase) or medium (without) |
Aliases matter¶
Customers don't always ask for topics using the exact topic name. `aliases["store_hours"] = ("opening hours", "hours")` lets the detector recognize that a bot message talking about "our opening hours" is on the `store_hours` topic.
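The topic-plus-alias matching can be sketched in plain Python. Again, this is an illustrative re-implementation, not the library's code; assume each topic is recognized by its own name (with underscores read as spaces) or any of its aliases:

```python
def check_fact(message, facts, aliases=None):
    """Sketch of the fact-check logic: find a known topic in the message
    (by name or alias), then verify the ground-truth fact is present."""
    msg = message.lower()
    aliases = aliases or {}
    for topic, truth in facts.items():
        names = (topic.replace("_", " "),) + tuple(aliases.get(topic, ()))
        if any(name.lower() in msg for name in names):
            if truth.lower() in msg:
                return ("verified", None)  # ground truth present: skipped
            return ("contradiction", "critical")
    return (None, None)  # no known topic touched
```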
Detector sharpness scales with KB richness¶
An empty FactBase behaves identically to no FactBase. A FactBase with 5 facts catches contradictions on those 5 topics; one with 500 facts covers 500. Investment-to-value is linear and transparent.
Advanced: live knowledge bases¶
PolicyBase and FactBase are plain dataclasses. Nothing stops you from populating them from a live source at detection time:
```python
from chatbot_auditor import ConfidentMisinformationDetector, FactBase

def current_facts() -> FactBase:
    return FactBase(facts={
        "store_hours": fetch_from_cms("hours"),
        "premium_price": fetch_from_billing_api("premium/price"),
        # ...
    })

detector = ConfidentMisinformationDetector(facts=current_facts())
```
For even more flexibility, write a custom detector that queries a vector store or LLM per claim. See Write a custom detector.
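As a rough sketch of that per-claim pattern, here is a checker with a pluggable `lookup` callable. The names are hypothetical; in a real detector, `lookup` would wrap your vector-store or LLM query:

```python
from typing import Callable, Iterable, Optional

def audit_claims(claims: Iterable[str],
                 lookup: Callable[[str], Optional[str]]):
    """Hypothetical per-claim checker. `lookup` returns the ground truth
    relevant to a claim, or None if nothing is known about it."""
    results = []
    for claim in claims:
        truth = lookup(claim)
        if truth is None:
            results.append((claim, "unverifiable"))
        elif truth.lower() in claim.lower():
            results.append((claim, "verified"))
        else:
            results.append((claim, "contradiction"))
    return results
```

Because `lookup` is just a callable, you can start with a dict-backed stub and swap in retrieval later without changing the audit loop.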