Interference-aware design

Week 5 Lesson 4 · AI Evals for Product Dev
Shane Butler · AI Analyst Lab

What assumption does user-level randomization rely on, and when might that assumption break?

INDEPENDENT USERS
Each user's outcome depends only on their own treatment assignment
SHARED RESOURCE POOL
Users connect to a central cache node — one user's treatment affects others

Your cache helps performance but breaks your experiment

User A (v2)
t=0s, queries "Q4 revenue"
->
Cache populated
v2 schema, 250ms latency
->
User B (v1 control)
t=30s, "Q4 revenue by region"
->
Cache hit (v2)
50ms — control got v2 benefit
Control users got v2 benefits — effect looks smaller than it really is
Cache TTL: 60s. Control users benefit from treatment users' improved retrievals without doing new retrieval work.
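The contamination mechanism above can be sketched as a toy simulation. All rates, the spillover size, and the cache-hit probability below are illustrative assumptions, not the lesson's data: control users who hit a cache entry warmed by a v2 user pick up part of the treatment benefit, so the naive difference-in-means lands below the true effect.

```python
import random

random.seed(0)

TRUE_EFFECT = 0.05     # v2 lifts success rate by 5pp (assumed, illustrative)
BASE_RATE = 0.65       # v1 baseline success rate (assumed)
SPILLOVER = 0.03       # lift a control user gets from a warm v2 cache entry
CACHE_HIT_PROB = 0.4   # chance a control query lands inside the 60s TTL window

def simulate(n=100_000):
    """Naive user-level A/B difference-in-means under cache contamination."""
    treat, ctrl = [], []
    for _ in range(n):
        if random.random() < 0.5:                      # treatment: v2
            treat.append(random.random() < BASE_RATE + TRUE_EFFECT)
        else:                                          # control: v1...
            # ...but sometimes served from a cache a v2 user populated
            boost = SPILLOVER if random.random() < CACHE_HIT_PROB else 0.0
            ctrl.append(random.random() < BASE_RATE + boost)
    return sum(treat) / len(treat) - sum(ctrl) / len(ctrl)

naive = simulate()
print(f"naive A/B estimate: {naive:+.3f}  (true effect: {TRUE_EFFECT:+.3f})")
```

In expectation the naive estimate is roughly TRUE_EFFECT − CACHE_HIT_PROB × SPILLOVER — the contaminated control group drags the measured effect down, which is exactly the underestimation pattern this slide describes.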

SUTVA assumes one user's outcome depends only on their own treatment

SUTVA holds
User A's outcome depends on User A's treatment — not on anyone else's assignment
SUTVA violated
User A's outcome depends on whether Users B, C, D are in treatment or control
Stable Unit Treatment Value Assumption (SUTVA)
The assumption that makes standard A/B testing work. A 'unit' is one user in your experiment. SUTVA says each user's outcome depends only on their own treatment, not on what treatment other users got. When SUTVA breaks, standard A/B testing produces biased estimates.

Shared resources, network effects, temporal carryover violate SUTVA

Mechanism | Example | Interference direction
Shared resources | AI Data Analyst cache: v2 users populate cache, v1 users benefit | Control benefits from treatment → you underestimate the effect
Network effects | Recommendation system: treatment improves collaborative filtering for control | Control benefits from treatment → underestimation
Temporal carryover (effects persist over time) | Model retrains nightly using all users' data | Treatment leaks into control over time

Cache spillover underestimates v2's true impact by 1.9pp

Standard A/B
+2.9pp
CI: [-0.3, +6.1] (crosses zero — not significant)
Control contaminated by cache
Switchback
+4.8pp
CI: [+0.8, +8.8] (excludes zero — significant)
Control uncontaminated
Bias magnitude
1.9pp
Cache spillover cost
Decision changes from "hold for more data" to "ship with confidence"
The bias wasn't a rounding error — it flipped the ship decision. Standard A/B failed because SUTVA was violated.

Switchback eliminates shared-resource interference by switching everyone at once

Monday
All users: v1
Tuesday
All users: v2
Wednesday
All users: v1
Thursday
All users: v2
Friday
All users: v1
No simultaneous treatment and control → no cache contamination
On v2 days, everyone sees v2 and the cache reflects v2 retrievals. On v1 days, everyone sees v1 and the cache reflects v1 retrievals. Treatment effect estimate is unbiased.
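The core of a switchback estimate is a day-level comparison, sketched here with invented day labels and success rates (the course's `switchback_analysis()` presumably wraps this computation plus inference): the unit of analysis is the day, and the point estimate is the difference between v2-day and v1-day means.

```python
# Toy daily switchback log: (day, variant, sql_success_rate). All numbers
# are invented for illustration; the unit of analysis is the day, not the user.
days = [
    ("Mon", "v1", 0.683), ("Tue", "v2", 0.731), ("Wed", "v1", 0.690),
    ("Thu", "v2", 0.728), ("Fri", "v1", 0.686), ("Sat", "v2", 0.735),
]

v1_rates = [r for _, v, r in days if v == "v1"]
v2_rates = [r for _, v, r in days if v == "v2"]

# Switchback point estimate: difference between v2-day and v1-day means.
effect = sum(v2_rates) / len(v2_rates) - sum(v1_rates) / len(v1_rates)
print(f"switchback effect: {effect:+.3f}")  # +0.045 → +4.5pp on this toy data
```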

Day-to-day correlation inflates standard errors

30 DAYS OF DATA
Raw time periods observed
30
~18 INDEPENDENT OBSERVATIONS
After accounting for day-to-day correlation (0.42)
18
Why your sample size is smaller than it looks
Consecutive days are correlated (lag-1 autocorrelation ≈ 0.42) — they're not fully independent. This makes your confidence intervals wider. You need to run the experiment longer than a standard A/B test — but the estimate is valid.
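One way to see the shrinkage: simulate an AR(1) daily series with correlation ~0.42, estimate the lag-1 autocorrelation, and apply an AR(1)-based effective-sample-size formula. The formula below is one common choice, used here as an illustrative assumption — the lesson's ~18 figure may come from a different adjustment, but the qualitative point (positive autocorrelation shrinks the effective n below 30) is the same.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate 30 days of a daily metric as an AR(1) process with
# day-to-day correlation ~0.42 (both numbers from the slide;
# the unit noise scale is an arbitrary choice).
n, rho = 30, 0.42
y = np.empty(n)
y[0] = rng.normal()
for t in range(1, n):
    y[t] = rho * y[t - 1] + np.sqrt(1 - rho**2) * rng.normal()

# Estimate lag-1 autocorrelation from the observed series.
r1 = float(np.corrcoef(y[:-1], y[1:])[0, 1])

# One common AR(1)-based formula: n_eff = n * (1 - r1) / (1 + r1).
# With positive autocorrelation, n_eff < n.
n_eff = n * (1 - r1) / (1 + r1)
print(f"lag-1 autocorrelation: {r1:.2f}, effective sample size: {n_eff:.1f} of {n}")
```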

Cluster by how users share resources, not randomly

RANDOM USER GROUPS
Clusters don't align with how users share resources
Doesn't reduce interference
ALIGN WITH SHARED RESOURCES
Clusters by geography capture regional inventory sharing
Example: SF, Chicago, Denver markets
Example: Marketplace with regional inventory → cluster by market
All SF users get v2, all Chicago users get v1. Requires 20-30 clusters for adequate power (DoorDash). If you only have 5 markets, cluster randomization won't work — standard errors too large.
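A sketch of cluster-level assignment — the market names beyond those on the slide and all user counts are hypothetical: randomize whole markets, not users, so everyone sharing a regional resource lands in the same arm.

```python
import random

# Hypothetical marketplace: market → user count (all invented numbers).
markets = {
    "SF": 12_000, "Chicago": 9_000, "Denver": 4_000, "Austin": 6_000,
    "Seattle": 8_000, "Boston": 7_000, "Miami": 5_000, "Portland": 3_000,
}  # in practice you'd want 20-30 clusters, not 8

random.seed(7)
names = sorted(markets)
random.shuffle(names)

# Randomize at the market level: every user in a market gets the same
# variant, so regional inventory sharing can't cross treatment arms.
assignment = {m: ("v2" if i % 2 == 0 else "v1") for i, m in enumerate(names)}

v1_users = sum(markets[m] for m, a in assignment.items() if a == "v1")
v2_users = sum(markets[m] for m, a in assignment.items() if a == "v2")
print(assignment)
print(f"v1 users: {v1_users:,}  v2 users: {v2_users:,}")
```

After assigning, check balance on user counts (and key pre-experiment metrics) across arms — with few, unequal clusters, a single large market can dominate one arm.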

If SUTVA holds, use standard A/B — it's simpler and more powerful

Can one user's treatment
affect another's outcome?
No → Standard A/B
Narrower CIs, easier to detect effects
Yes → Diagnose interference
Spatial, temporal, or both?
Spatial
Switchback or Cluster
Temporal
Switchback + washout
Both
Hybrid design

Will switchback show larger, smaller, or same effect as standard A/B?

Context
The v1-to-v2 change improved retrieval quality. You're about to see results from a switchback experiment with daily switching over 30 days. The standard A/B test from L5.3 showed +2.9pp [95% CI: -0.3, +6.1].
Write your prediction and reasoning before continuing.

Switchback reveals +4.8pp effect — standard A/B missed 1.9pp due to contamination

Design | Effect size | 95% CI | p-value | Valid?
Standard A/B (L5.3) | +2.9pp | [-0.3, +6.1] | 0.07 | No — SUTVA violated
Switchback (this lesson) | +4.8pp | [+0.8, +8.8] | 0.02 | Valid
Cache contamination created 1.9pp downward bias
The valid estimate changes the decision from 'hold for more data' to 'ship with confidence.' What changed is not the system, the users, or the metrics — what changed is the validity of the estimate.

v1 days following v2 show +1.2pp carryover — acceptable for daily switching

v1 days after v1 day
68.3%
SQL success rate
v1 days after v2 day
69.5%
SQL success rate
Carryover (leftover v2 benefit)
+1.2pp
Small vs +4.8pp effect
  • Carryover magnitude: +1.2pp (small relative to +4.8pp effect)
  • Carryover duration: <60s (cache time-to-live is 60 seconds)
  • Decision: Daily switching acceptable
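The carryover test on this slide — split v1 days by the previous day's variant and compare means — can be sketched like this. The daily rates are invented for illustration, chosen to show a small positive carryover:

```python
# Toy switchback log in day order: (variant, sql_success_rate).
# Rates are invented, loosely echoing the lesson's numbers.
log = [
    ("v1", 0.681), ("v2", 0.729), ("v1", 0.695), ("v1", 0.684),
    ("v2", 0.733), ("v1", 0.696), ("v1", 0.683), ("v2", 0.730),
    ("v1", 0.694),
]

# Split v1 days by what ran the day before: leftover v2 benefit
# (e.g., a still-warm cache) shows up as a gap between the groups.
after_v1 = [r for i, (v, r) in enumerate(log)
            if v == "v1" and i > 0 and log[i - 1][0] == "v1"]
after_v2 = [r for i, (v, r) in enumerate(log)
            if v == "v1" and i > 0 and log[i - 1][0] == "v2"]

carryover = sum(after_v2) / len(after_v2) - sum(after_v1) / len(after_v1)
print(f"carryover: {carryover:+.4f}")
```

If the carryover is small relative to the effect (here it is, by construction), daily switching is acceptable; if not, lengthen the switching period or add a washout.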

Run autocorrelation check, compute switchback effect, test carryover, design marketplace clusters

Day-to-day correlation check
Check if consecutive days are correlated (reduces effective sample size) (Base: 5 min)
📊
Switchback analysis
Run switchback_analysis(), record effect + CI (Base: 8 min)
🔄
Carryover test
Filter v1 days by prior treatment, compare means (Base: 7 min)
🗂
Cluster design
Define clusters for marketplace interference, validate quality (Base: 10 min)
Extend version
Implement a resampling method that respects time structure (blocked bootstrap), compute the effective sample size, and test whether carryover differs between v1→v2 and v2→v1 transitions (+15 min)
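For the extend version, a blocked bootstrap resamples contiguous runs of days instead of single days, so each bootstrap replicate keeps the day-to-day correlation. A minimal sketch — block length, replicate count, and the daily effect series are all illustrative choices:

```python
import random

random.seed(1)

def blocked_bootstrap_ci(series, block=3, n_boot=2000, alpha=0.05):
    """Percentile CI for the mean, resampling contiguous blocks of days
    so each bootstrap replicate preserves short-range time structure."""
    n = len(series)
    starts = list(range(n - block + 1))
    means = []
    for _ in range(n_boot):
        sample = []
        while len(sample) < n:
            s = random.choice(starts)          # pick a random block start
            sample.extend(series[s:s + block])  # keep the block contiguous
        means.append(sum(sample[:n]) / n)
    means.sort()
    return means[int(n_boot * alpha / 2)], means[int(n_boot * (1 - alpha / 2)) - 1]

# Invented daily effect series (v2 day minus neighboring v1 day, in pp).
daily_effects = [4.1, 5.6, 3.9, 6.2, 4.4, 5.1, 4.9, 3.7, 5.8, 4.5,
                 5.3, 4.0, 6.0, 4.7, 5.2]
lo, hi = blocked_bootstrap_ci(daily_effects)
print(f"blocked-bootstrap 95% CI for mean effect: [{lo:.2f}, {hi:.2f}]pp")
```

Compared with a naive bootstrap over individual days, the blocked version typically yields wider, more honest intervals when days are positively correlated.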

Interference-Aware Experiment Design Brief changes ship decision from hold to ship

📋
System: AI Data Analyst v1-to-v2
SUTVA Diagnostic: Shared cache violates independence
Recommended Design: Switchback (daily switching)
Treatment Effect: +4.8pp [+0.8, +8.8] | Decision: Ship with confidence
Portfolio-ready artifact — decision changes from hold to ship

Assuming SUTVA holds by default is the costliest mistake

DEFAULT ASSUMPTION
Users are independent — standard A/B without SUTVA check
1.9pp bias → wrong decision
SUTVA DIAGNOSTIC FIRST
System architecture → identify shared resources → choose design
Valid estimate → ship
Before designing any experiment: Can one user's treatment affect another's outcome?
Walk through the system architecture. Identify shared resources, shared models, shared caches, network connections, temporal dependencies. If you find any, SUTVA is suspect. Run the diagnostic. The valid estimate is worth the extra complexity.

Model retrains nightly using all users' data — why does this violate SUTVA?

Scenario
Your AI system has a recommendation model that retrains nightly using all users' click data from the previous day. You run a standard A/B test where treatment users get v2 (new ranking algorithm) and control users get v1. Why does this violate SUTVA?
A. Treatment users' improved clicks feed the model that control users see the next day ✓
B. The model is too complex for A/B testing
C. Control users see v2 rankings by mistake
D. The sample size is too small

2x2 matrix: interference type × randomization design

Interference type | Spatial (users share resources at the same time) | Temporal (effects carry over time)
User-level (standard A/B) | Biased estimate — e.g., marketplace with shared supply | Biased estimate — e.g., online learning model
Time/space-level (switchback, cluster) | Valid estimate — e.g., shared cache with daily switching, or regional supply clustered by geography | Valid estimate — e.g., nightly-retrained model with switchback + washout
SUTVA check first: Can one user's treatment affect another's outcome?
If yes, move to bottom row. Diagnose interference type, choose design that matches mechanism.

Next: Launch Readiness

Pre-launch checklist
Metric validation, guardrail thresholds, rollback triggers
📈
Progressive rollout
Ramp schedule, monitoring plan, decision gates
📊
Production monitoring
What to watch, how often, when to rollback
Build a launch readiness brief your team can use for every AI feature ship
AI Analyst Lab | AI Evals for Product Dev | Week 5 Lesson 4 | aianalystlab.ai