Draft
Preface
I Causal Inference: An Overview
1 Introduction
2 Correlation and Simpson’s Paradox
3 Randomized Experiment
3.1 Complete Randomization
3.2 Independent Randomization
3.3 Clustered Randomization
3.4 Analysis of Randomized Experiments as Two Sample Problem
4 Potential Outcomes Framework
4.1 Naive Estimation
4.2 Randomization and Unconfoundedness
4.2.1 Conditional Unconfoundedness, Matching and Covariates Balancing
4.3 Propensity Score
4.4 SUTVA
4.5 Missing Data and Weighted Samples
4.6 Missing Data Mechanisms and Ignorability
4.7 Importance Sampling
4.8 Inverse Propensity Score Weighting (IPW)
4.9 Doubly Robust Estimation
4.10 Bias-Variance Trade-off and Covariates Overlap
4.11 Other Propensity Score Modeling Methods
5 Causal Graphical Model
5.1 Structural Equation Model, Causal Diagram and d-separation
5.2 The do Operator
5.3 The Back-door Criterion
5.4 Causal Mechanism and the Front-door Criterion
5.5 General Identification Strategy
5.6 RCM vs. CGM
6 Regression-based Methods
II Large Scale Online Controlled Experiments
7 A/B Testing: Beyond Randomized Experiments
7.1 Special Aspects of A/B Tests
7.2 Instrumentation and Telemetry
7.3 Common Pitfalls
8 Statistical Analysis of A/B Tests
8.1 Metric
8.2 Randomization Unit and Analysis Unit
8.3 Inference for Average Treatment Effect of A/B Tests
8.4 Independence Assumption and Variance Estimation
8.4.1 Independence Assumption
8.4.2 Variance Estimation for Average and Weighted Average
8.5 Central Limit Theorem and Normal Approximation
8.6 Confidence Interval and Variance Estimation for Percentile Metrics
8.7 p-Value, Statistical Power, S and M Error
8.7.1 p-Value
8.7.2 Statistical Power
8.7.3 Type S and Type M Error
8.8 Statistical Challenges
9 System Diagnosis and Quality Checks for A/B Tests
9.1 System Validation using A/A Test
9.2 Sample Ratio Mismatch
9.3 Trigger and Filter Condition
9.4 Interaction Detection
9.5 Metric Denominator Mismatch
10 Improving Metric Sensitivity
10.1 Metric Sensitivity Decomposition
10.2 Variance Reduction
10.2.1 Control Variates and CUPED
10.2.2 General Regression Adjustment and Doubly Robust Estimation
10.2.3 Doubly Robust Estimator
11 Misc Topics
11.1 Delta Method
11.2 Random Denominator for Independent Randomization Experiments
11.3 M-Estimator and Z-Estimator
III Appendix
12 Probability Minimum
12.1 Probability
12.1.1 Conditional Independence
Alex Deng
Causal Inference and Its Applications in Online Industry
Chapter 6
Regression-based Methods
Instrumental variables (IV)
Regression discontinuity design
Difference-in-differences and, more generally, synthetic control methods