Three judgment scenarios: blocking vs. optimization in practice
Scenario 1
Blocking metric: SQL correctness >95%. A change improves retrieval by 10 percentage points and completeness by 8 percentage points, but SQL correctness drops to 93%. The PM says overall quality is clearly better. Do you ship? Why or why not?
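To make the mechanics concrete, here is a minimal sketch of how a blocking threshold might be checked separately from optimization gains. The `Metric` class, the `failed_blocking` helper, and the metric names are hypothetical and exist only for illustration; the numbers are the ones given in Scenario 1. The sketch only reports what the gate says. Whether that result should be overridden is the judgment question.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metric:
    name: str
    blocking: bool                      # True: must clear its threshold before shipping
    delta_pp: Optional[float] = None    # change in percentage points (optimization metrics)
    value: Optional[float] = None       # observed value in percent (blocking metrics)
    threshold: Optional[float] = None   # value must be strictly greater than this


def failed_blocking(metrics):
    """Return names of blocking metrics that do not exceed their threshold."""
    return [
        m.name
        for m in metrics
        if m.blocking and not (m.value > m.threshold)
    ]


# Numbers from Scenario 1; names are illustrative assumptions.
scenario_1 = [
    Metric("sql_correctness", blocking=True, value=93.0, threshold=95.0),
    Metric("retrieval", blocking=False, delta_pp=10.0),
    Metric("completeness", blocking=False, delta_pp=8.0),
]

failures = failed_blocking(scenario_1)
gate_passed = not failures  # optimization deltas never enter this check
print(gate_passed, failures)
```

Note the design choice: the optimization deltas are recorded but deliberately excluded from the gate, so no amount of improvement elsewhere can offset a blocking failure.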
Scenario 2
Answer completeness is an optimization metric with a target of >=80%; the current value is 78%. Six months later it reaches 83%, but the PM says users aren't happier. What might this signal?
Scenario 3
Threshold: latency p95 <5s, "because that's our SLA." Is this a valid rationale? What additional information would strengthen it, and which rationale source does it use?
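For reference, a minimal sketch of what checking "p95 <5s" could look like, assuming the nearest-rank definition of a percentile (other definitions interpolate and can give slightly different values). The function names and the sample latencies are hypothetical; only the 5-second limit comes from the scenario.

```python
def p95(samples):
    """Nearest-rank 95th percentile: the smallest value with at least 95% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_latency_threshold(samples, limit_s=5.0):
    """True if p95 latency is strictly below the limit (5s in Scenario 3)."""
    return p95(samples) < limit_s


# Hypothetical latencies in seconds, for illustration only.
latencies_s = [1.0] * 18 + [4.0, 6.0]
print(p95(latencies_s), meets_latency_threshold(latencies_s))
```

The check itself says nothing about where the 5s figure came from, which is what the scenario asks you to examine.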
Pause and write your answers
These scenarios test whether you understand blocking metrics, metric tree dependencies, and threshold rationale sources.