Overview

Hallucination rate: 8.7% (-1.2 pts vs last week)
Faithfulness score: 87 (+0.8 vs last week)
Tests run: 12,340 (+218 this week)
Critical fails: 6 (+1 this week)

Hallucination rate over time

Lower is better · last 7 days
[Line chart: Model A, Model B, Model C; y-axis ticks 0, 7, 14, 21, 28; x-axis Mon to Sun]

By domain

This week's faithfulness
Medical (12 runs): 91
Legal (8 runs): 64
Finance (15 runs): 82
General (23 runs): 88

Recent runs

12 runs
Suite                        | Cases | Run     | Model   | Metric              | Score | Status  | When
Medical · drug interactions  | 218   | run-001 | Model B | Faithfulness        | 94    | Pass    | 2 min ago
Legal · contract clauses     | 156   | run-002 | Model A | Factual consistency | 87    | Pass    | 14 min ago
Finance · earnings Q&A       | 412   | run-003 | Model A | Hallucination rate  | 71    | Warning | 38 min ago
Medical · symptom triage     | 98    | run-004 | Model C | Answer relevance    | 64    | Warning | 1h ago
General · world facts (v3)   | 600   | run-005 | Model B | Citation accuracy   | 91    | Pass    | 1h ago
Legal · case-law lookup      | 240   | run-006 | Model C | Faithfulness        | 38    | Fail    | 2h ago
Finance · 10-K summarization | 312   | run-007 | Model A | Faithfulness        | 82    | Pass    | 3h ago
Medical · dosage calculation | 84    | run-008 | Model B | Factual consistency | 96    | Pass    | 4h ago
General · trivia (long-tail) | 1,024 | run-009 | Model C | Hallucination rate  | 52    | Warning | 5h ago
Legal · GDPR compliance      | 178   | run-010 | Model A | Citation accuracy   | 89    | Pass    | 6h ago
Finance · risk disclosures   | 290   | run-011 | Model B | Answer relevance    |       | Running | running
Medical · patient education  | 144   | run-012 | Model C | Faithfulness        | 33    | Fail    | yesterday