Week 2: Instrumentation and Reliability Engineering · Lesson 2.3

Trace design and reproducibility

If we need to explain or reproduce a behavior, what must be true about our trace design?

Retired course. Due to the fast pace of AI, this course was retired before full release. Exercises, datasets, and videos referenced in this lesson are not available. The slide content and frameworks remain free to study.

Reader Notes

This lesson covers designing a trace schema that makes every failure type measurable. The gap between v0 and v1 instrumentation is enormous. With v0, logs exist but explain nothing: a trace shows "success: true" and gives no insight into which stage actually failed. With v1, every stage is captured separately with its own success indicator and latency measurement, so a claim such as "retrieval recall dropped from 0.89 to 0.67 in the last 24 hours" can be backed by data. That is the gap this lesson closes.

The fields chosen for capture determine which questions can be answered later. If the SQL query isn't logged, SQL correctness can't be measured. If retrieval document IDs aren't logged, retrieval precision can't be computed. What gets recorded is therefore not just a matter of clean data; it is the evidence foundation for confident ship decisions.
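As a rough illustration of per-stage capture, the sketch below records each stage with its own success flag, latency, and payload, then computes retrieval precision and recall from the logged document IDs. It is a minimal sketch, not the course's schema: the names `StageRecord`, `Trace`, `first_failed_stage`, and `retrieval_precision_recall`, the stage names, the payload keys, and all sample values are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass
class StageRecord:
    """One pipeline stage, recorded on its own (hypothetical schema)."""
    name: str
    success: bool
    latency_ms: float
    # Stage-specific evidence, e.g. retrieved doc IDs or the generated SQL.
    payload: Dict[str, object] = field(default_factory=dict)


@dataclass
class Trace:
    """A request-level trace made of separately recorded stages."""
    trace_id: str
    stages: List[StageRecord]

    @property
    def success(self) -> bool:
        # The request succeeds only if every stage succeeded.
        return all(stage.success for stage in self.stages)

    def first_failed_stage(self) -> Optional[str]:
        # Answers the question a bare "success" flag cannot: which stage failed?
        for stage in self.stages:
            if not stage.success:
                return stage.name
        return None


def retrieval_precision_recall(
    retrieved_ids: Sequence[str], relevant_ids: Sequence[str]
) -> Tuple[float, float]:
    """Compute precision and recall from logged IDs and a labeled relevant set."""
    retrieved, relevant = set(retrieved_ids), set(relevant_ids)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative values only; none of this data comes from the lesson.
trace = Trace(
    trace_id="t-001",
    stages=[
        StageRecord("retrieval", True, 42.0, {"doc_ids": ["d1", "d2", "d7", "d9"]}),
        StageRecord("sql_generation", False, 310.5, {"sql": "SELECT 1"}),
    ],
)

failed_stage = trace.first_failed_stage()
precision, recall = retrieval_precision_recall(
    trace.stages[0].payload["doc_ids"], relevant_ids=["d1", "d2", "d3"]
)
print(failed_stage, precision, round(recall, 2))
```

Without the `doc_ids` field in the retrieval payload, the last computation would have nothing to work from, which is the point of deciding what to log before the questions arise.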