Stats Lab
Applied statistics from fundamentals through master’s level. Excel, Stata, R, and Julia, side by side.
Open teaching site · Excel · Stata · R · Julia · CC-BY-SA 4.0
Twelve chapters cover probability, distributions, sampling theory, estimation, hypothesis testing, linear models, GLMs, nonparametrics, multivariate methods, Bayesian inference, and time series. Every concept is paired with worked code in Excel first, then Stata, R, and Julia.
How to use this
The book is organized in five parts: Foundations, Inference, Linear Models, Modern Extensions, and Multivariate/Bayesian/Time Series. Within each chapter, the structure is the same: concept → math → worked example in Excel → Stata → R → Julia → common traps → reporting checklist → references.
If you’re learning statistics for the first time, read Chapters 1–6 in order. If you’re filling in gaps before a master’s exam, the diagnostics chapter (7) and the GLM chapter (8) are where most students lose ground, so start there. If you’re a practitioner who needs to brush up on a specific method, the chapter index in the sidebar is the fastest path.
The companion demos page has the browser-side interactives: a distribution sampler, a CLT animator, an OLS-by-hand visualizer, a bootstrap distribution generator, and more. No install required.
The lecture series
1 Foundations
Probability, distributions, and sampling theory. The bedrock that everything else assumes.
2 Inference
Estimation, intervals, and testing. The toolkit for turning a sample into a defensible claim about a population.
3 Linear models
OLS as geometry and as estimator. Diagnostics, robust standard errors, and the practical machinery.
4 Modern extensions
When the outcome is not continuous, the assumptions don’t hold, or the function is unknown.
5 Multivariate, Bayesian, time series
The methods most master’s curricula put in the second semester. PCA and clustering for high-dimensional data; Bayes for honest priors; ARMA for serially correlated outcomes.
D Interactive demos
Distribution samplers, CLT animators, an OLS-by-hand visualizer, bootstrap distributions, and more. Run them in your browser; no install.
Why Excel first, then Stata, R, and Julia
Most master’s students arrive comfortable in Excel, somewhat comfortable in one statistics package, and rarely comfortable in two or more. The path through the four tools in this book reflects that. Excel teaches the concept, because the formula is right there in the cell. Stata gives the production-grade pipeline most employers and journals expect. R is the modern lingua franca of academic statistics and the place where most new methods land first. Julia offers speed and composability, and is included with an eye to where the field is heading. The chapter code is interleaved so you can use whichever tool fits your situation today, while the others are one click away.
Pedagogically, working a method four times in four languages is the single fastest way to internalize what’s mechanical, what’s conceptual, and what’s package-specific.
How this site is built
Quarto book with KaTeX math, copy-to-clipboard code, and a hand-built SCSS theme using Source Serif 4, Inter, and JetBrains Mono. Interactive demos run client-side in Observable JS. Source is on GitHub; pull requests and corrections are welcome.
Sister sites: Inference Lab (applied causal inference), Macro Prep (intermediate macroeconomics with live FRED data), and the main hub (research, datasets, writing).