Unmeasured Failure Modes
Failures you suspect exist but haven't observed in traces
"Narrative hallucination rate unknown"
Rare Scenarios
Edge cases not in your trace sample
"No multi-turn sessions > 4 turns in sample"
Logging Blind Spots
System behaviors you CANNOT observe with current logging
"v0 doesn't log retrieved context"