Ownership model

Week 6 Lesson 4 · AI Evals for Product Dev
Shane Butler · AI Analyst Lab

Who on your team would sign off on the ship decision, and what evaluation evidence would they need?

Who signs off?
  • PM: owns the ship decision
  • ML Engineer: owns model quality
  • Product Eng: owns deployment
What evidence do they need?
  • Metric specs
  • Experiment results
  • Judge calibration report
  • Monitoring plan

Without clear ownership, evaluation systems decay silently

At Launch
  • Judges calibrated
  • Regression suite complete
  • Monitoring assigned
  • Thresholds documented
Six Months Later
  • Judges uncalibrated
  • Regression suite stale
  • No one watching dashboard
  • Thresholds lost in Slack

Every evaluation activity needs exactly one Accountable owner

Activity            PM   ML Engineer   Domain Expert   QA
Judge calibration   I    A             C               R
Ship decision       A    C             C               I
Critical rule: exactly one 'A' per row
R (Responsible): Does the work
A (Accountable): Owns the outcome
C (Consulted): Provides input
I (Informed): Receives updates
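The "exactly one A per row" rule is easy to check mechanically. A minimal sketch (the matrix contents mirror the two rows above; the function name `validate_raci` is illustrative):

```python
# Hypothetical RACI matrix: activity -> {role: letter}.
RACI = {
    "Judge calibration": {"PM": "I", "ML Engineer": "A", "Domain Expert": "C", "QA": "R"},
    "Ship decision":     {"PM": "A", "ML Engineer": "C", "Domain Expert": "C", "QA": "I"},
}

def validate_raci(matrix):
    """Return the activities that violate the 'exactly one A per row' rule."""
    return [
        activity for activity, roles in matrix.items()
        if list(roles.values()).count("A") != 1
    ]

print(validate_raci(RACI))  # → [] (every activity has exactly one Accountable owner)
```

A row with zero or two A's shows up immediately, which is the failure mode the "Everyone is Responsible" slide warns about.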

The debt register makes invisible work visible and trackable

Item ID  | Description                                            | Risk | Owner       | Target | Status
DEBT-001 | 4 judges uncalibrated (6 months)                       | H    | ML Engineer | Mar 15 | Open
DEBT-002 | Regression suite incomplete (3 query types not tested) | M    | Product Eng | Mar 30 | Open
DEBT-003 | Thresholds only in Slack                               | M    | PM          | Feb 28 | Open
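The register above is just structured data with owners and dates, which means it can be queried instead of re-read. A sketch using the three items from the table (the year on the short dates is assumed from DEBT-001's full target date of 2026-03-15):

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class DebtItem:
    item_id: str
    description: str
    risk: str          # "H" / "M" / "L"
    owner: str
    target: date
    status: str = "Open"

register = [
    DebtItem("DEBT-001", "4 judges uncalibrated (6 months)", "H", "ML Engineer", date(2026, 3, 15)),
    DebtItem("DEBT-002", "Regression suite incomplete", "M", "Product Eng", date(2026, 3, 30)),
    DebtItem("DEBT-003", "Thresholds only in Slack", "M", "PM", date(2026, 2, 28)),
]

def overdue(items, today):
    """Open items whose target date has passed — the monthly deep dive's first question."""
    return [d.item_id for d in items if d.status == "Open" and d.target < today]

print(overdue(register, date(2026, 3, 1)))  # → ['DEBT-003']
```

Running the overdue check at each monthly review is what keeps the register from becoming a graveyard.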

Judges drift — quarterly audit is the minimum for production AI

Week 1
Weekly quality review: metrics, judge drift, failure modes
Month 1
Monthly deep dive: 30-day trends, debt register
Quarter 1
Quarterly audit: judge recalibration, regression coverage
Repeat
Cycle continues

Clear escalation prevents alert fatigue and ensures critical issues reach decision-makers

Medium: Regression suite coverage <80% → escalate within 1 week
High: Judge drift >10% → escalate within 48 hours
High: Primary metric degrades >5% over 7 days → escalate within 24 hours
Critical: Safety violation in production → escalate immediately (within 1 hour)

Your SQL judge was calibrated six months ago — what happened to its false positive rate?

At launch: Judge calibrated for GPT-4 (60% correct SQL)
Two months ago: System upgraded to GPT-4o (78% correct SQL)
Prediction: What happened to false positive rate (flagging correct SQL as incorrect)?
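One way to reason about the prediction: even if the judge's per-class error rates stay exactly where they were at calibration, the base rate shift changes what the judge's flags mean. A sketch with illustrative error rates (the 10% false positive rate and 15% false negative rate are assumptions, not measured values):

```python
def false_positive_share(p_correct, fpr=0.10, fnr=0.15):
    """Among outputs the judge flags as incorrect, what fraction are
    actually correct SQL? Assumes the judge's per-class error rates
    (fpr, fnr — illustrative values) are unchanged since calibration."""
    false_pos = p_correct * fpr                # correct SQL wrongly flagged
    true_pos = (1 - p_correct) * (1 - fnr)     # incorrect SQL correctly flagged
    return false_pos / (false_pos + true_pos)

print(round(false_positive_share(0.60), 3))  # at calibration: 60% correct SQL → 0.15
print(round(false_positive_share(0.78), 3))  # after GPT-4o upgrade: 78% correct SQL → 0.294
```

With these illustrative numbers, the share of flags that are false positives nearly doubles purely from the base rate shift, before accounting for any drift in the judge's behavior on the new model's output style.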

Six months after launch: four uncalibrated judges, stale regression suite, no one watching the dashboard

Component            | Finding                     | Status
4 LLM judges         | Last calibrated at launch   | TPR/FPR unknown
Regression suite     | 80 examples, no updates     | 3 new query types not covered
Monitoring dashboard | Exists, but no one assigned | No one watching
Metric thresholds    | Documented in Slack thread  | New team members can't find them

"Everyone is Responsible" means no one is Accountable

Current State
Judge calibration: R = ?, A = none
Desired State
Judge calibration: R = QA, A = ML Eng
Every row needs exactly one A

DEBT-001: high risk, ML Engineer assigned, target March 15

DEBT-001
Description: 4 LLM judges uncalibrated since launch (6 months ago)
Risk: High
Impact if unaddressed: False positive rate unknown; may be flagging correct outputs; ship decisions based on unreliable evidence
Owner: ML Engineer
Target Date: 2026-03-15
Status: Open
Notes: Schedule 2-day calibration sprint

Weekly quality review, monthly deep dive, quarterly audit

Weekly
Attendees: PM, ML Eng, Domain Expert
Review: Metrics dashboard, signs of judge drift (accuracy changes)
Monthly
Attendees: PM, ML Eng, Product Eng, Domain Expert
Review: 30-day trends, debt register
Quarterly
Attendees: All roles
Review: Judge Report Card, regression coverage, thresholds

Safety violation = critical, escalate immediately to PM + ML Engineer

Trigger                                 | Severity | Escalate From  | Escalate To | Response Time
Safety metric fails (policy violation)  | Critical | On-call Eng    | PM + ML Eng | Immediate (1 hour)
Primary metric degrades >5% over 7 days | High     | ML Engineer    | PM          | 24 hours
Judge drift >10%                        | High     | ML Engineer    | PM          | 48 hours
Experiment shows segment regression     | Medium   | Data Scientist | PM          | Before ship decision
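The trigger table is a fixed priority order: check the most severe condition first, fire the first match. A sketch of that routing logic, with thresholds taken from the slides (the signal names and dict shape are illustrative):

```python
def escalation(signals):
    """Map monitoring signals to (severity, response window).
    Checks triggers in descending severity; returns the first match,
    or None if nothing fires. Signal keys are hypothetical."""
    if signals.get("safety_violation"):
        return ("Critical", "immediate (1 hour)")
    if signals.get("primary_metric_drop_pct_7d", 0) > 5:
        return ("High", "24 hours")
    if signals.get("judge_drift_pct", 0) > 10:
        return ("High", "48 hours")
    if signals.get("regression_coverage_pct", 100) < 80:
        return ("Medium", "1 week")
    return None  # nothing to escalate

print(escalation({"judge_drift_pct": 12}))  # → ('High', '48 hours')
```

Encoding the ladder as code (or alerting-rule config) is what prevents the "dashboard exists, no one watches it" failure: the severity and response window are decided once, not re-debated per incident.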

Design ownership for your AI Data Analyst — RACI, debt register, review cadences, escalation triggers

1. Complete RACI matrix
10 activities, exactly one A per row
2. Identify eval debt
5+ unmaintained components from Weeks 2-5
3. Build debt register
Risk, owner, target date for each item
4. Define review cadences + escalation
3 review types, 3 escalation triggers

RACI matrix + evaluation debt register with 5 items tracked

RACI Matrix
Judge calibration R=QA, A=ML Eng
Monitoring R=Product Eng, A=PM
Ship decision A=PM
Debt Register
H Uncalibrated judges → ML Eng (Mar 15)
M Stale regression suite → Product Eng (Mar 30)
M Thresholds in Slack → PM (Feb 28)

Everyone is Responsible, no one is Accountable — evaluation decays silently

  • Everyone is R, no one is A → judges uncalibrated, regression suite stale, no one owns it
  • Debt register never reviewed → register becomes a graveyard, debt compounds for a year
  • No escalation path for alerts → dashboard exists, no one watches it, issues discovered via customer escalations

Can one person be both R and A? Is uncalibrated judge low risk or high risk? When do you escalate?

Question 1
Your RACI lists ML Engineer as both R and A for judge calibration, and Domain Expert as A for threshold setting. What is wrong?
Question 2
Debt item: Judge uncalibrated for 6 months. Classify as low risk because judge was accurate at launch and metrics look stable. Is this correct?
Question 3
Monitoring shows SQL success rate dropped from 74% to 68% over 5 days. Escalation trigger: >5% degradation over 7 days. Do you escalate now or wait?

Three layers: RACI Matrix, Debt Register, Review & Escalation Flow

1
RACI Matrix
Who does what — exactly one A per activity
2
Evaluation Debt Register
What needs fixing — tracked items with owners and target dates
3
Review & Escalation Flow
When/how it's reviewed — weekly quality review → monthly deep dive → escalation triggers

Next: Evaluation cadence and governance

How do you balance evaluation rigor with iteration speed?
AI Analyst Lab | AI Evals for Product Dev | Week 6 Lesson 4 | aianalystlab.ai