Week 5: Pipelines, Experiments, and Continuous Validation · Lesson 5.4

Launch readiness and rollout gates

What must be true before exposure, and what do we do if it degrades?

Retired course. Due to the fast pace of AI, this course was retired before full release. Exercises, datasets, and videos referenced in this lesson are not available. The slide content and frameworks remain free to study.

Reader Notes

This lesson fixes the flaw that biased the experiment's estimate. In Lesson 5.3, an A/B test was designed for the v1-to-v2 change. The result was borderline: +2.9 percentage points, with a confidence interval that included zero, so the decision was "hold for more data." But that estimate was biased. The system has a shared cache: when treatment users populate it with better schema definitions, control users benefit too. The control group was contaminated, which improved its outcomes and shrank the observed difference, so the real effect is larger than the estimate. This lesson covers how to diagnose violations of the independence assumption (cases where one user's treatment affects another user's outcome) and how to choose the right experimental design when a standard A/B test breaks. The starting point is to recall the assumption made when designing the user-level randomization experiment in Lesson 5.3.
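The contamination described above can be illustrated with a small simulation. This is a minimal sketch, not the course's own analysis: the function name `simulate_ab`, the base success rate, the true lift, and the `spillover` fraction are all hypothetical values chosen for illustration, and the cache is modeled crudely as a fixed share of the treatment lift leaking to every control user.

```python
import random


def simulate_ab(n_users=200_000, base_rate=0.60, true_lift=0.06,
                spillover=0.0, seed=0):
    """Simulate a user-randomized A/B test and return the naive estimate.

    All parameters are hypothetical. `spillover` is the fraction of the
    treatment lift that leaks to control users (for example through a
    shared cache). Returns treatment success rate minus control success rate.
    """
    rng = random.Random(seed)
    successes = {"control": 0, "treatment": 0}
    counts = {"control": 0, "treatment": 0}
    for _ in range(n_users):
        # User-level randomization: each user is assigned independently.
        arm = "treatment" if rng.random() < 0.5 else "control"
        if arm == "treatment":
            p_success = base_rate + true_lift
        else:
            # Interference: control users pick up part of the lift.
            p_success = base_rate + spillover * true_lift
        counts[arm] += 1
        if rng.random() < p_success:
            successes[arm] += 1
    treatment_rate = successes["treatment"] / counts["treatment"]
    control_rate = successes["control"] / counts["control"]
    return treatment_rate - control_rate


# Same users and same random draws; only the amount of leakage differs.
clean_estimate = simulate_ab(spillover=0.0)
contaminated_estimate = simulate_ab(spillover=0.5)
print(f"no spillover:   {clean_estimate:+.4f}")
print(f"with spillover: {contaminated_estimate:+.4f}")
```

Under these assumptions the naive difference in means recovers the true lift only when `spillover` is zero. With leakage, the control arm's rate rises and the estimate shrinks toward zero, which is the direction of bias described for the Lesson 5.3 result.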
