Stats Lab

Applied statistics from fundamentals through master’s level. Excel, Stata, R, and Julia, side by side.

Author

Dr. Ian Helfrich

Published

May 2026

Open teaching site · Excel · Stata · R · Julia · CC BY-SA 4.0

Twelve chapters cover probability, distributions, sampling theory, estimation, hypothesis testing, linear models, GLMs, nonparametrics, multivariate methods, Bayesian inference, and time series. Every concept is paired with worked code in Excel, then Stata, R, and Julia, in that order.

Dr. Ian Helfrich · PhD Economics, Georgia Tech 2024 · TA for Econ 7023 PhD Econometrics II

Figure: stylized normal bell curve with a right-tail rejection region, a small sampling-distribution histogram, and a regression scatter, in the ink/rust/sage palette.

How to use this

The book is organized in five parts: Foundations, Inference, Linear Models, Modern Extensions, and Multivariate/Bayesian/Time Series. Within each chapter, the structure is the same: concept → math → worked example in Excel → Stata → R → Julia → common traps → reporting checklist → references.

If you’re learning statistics for the first time, read Chapters 1–6 in order. If you’re filling in gaps before a master’s exam, the diagnostics chapter (7) and the GLM chapter (8) are where most students lose ground, so start there. If you’re a practitioner who needs to brush up on a specific method, the chapter index in the sidebar is the fastest path.

The companion demos page hosts the browser-based interactives: a distribution sampler, a CLT animator, an OLS-by-hand visualizer, a bootstrap distribution generator, and more. No install required.

The lecture series

1 Foundations

Probability, distributions, and sampling theory. The bedrock that everything else assumes.

2 Inference

Estimation, intervals, and testing. The toolkit for turning a sample into a defensible claim about a population.

3 Linear models

OLS as geometry and as estimator. Diagnostics, robust standard errors, and the practical machinery.

4 Modern extensions

When the outcome is not continuous, the assumptions don’t hold, or the function is unknown.

5 Multivariate, Bayesian, time series

The methods most master’s curricula put in the second semester. PCA and clustering for high-dimensional data; Bayes for honest priors; ARMA for serially correlated outcomes.

D Interactive demos

Distribution samplers, CLT animators, an OLS-by-hand visualizer, bootstrap distributions, and more. Run them in your browser; no install.

Why Excel first, then Stata, R, and Julia

Most master’s students arrive comfortable in Excel, somewhat comfortable in one stats package, and rarely comfortable in two or more. The order of the four tools in this book reflects that. Excel teaches the concept, because the formula is right there in the cell. Stata gives the production-grade pipeline most employers and journals expect. R is the modern lingua franca of academic statistics and the place most new methods land first. Julia offers speed and composability, and points to where the field is heading. The chapter code is interleaved so you can use whichever tool fits your situation today, while the others are one click away.

Pedagogically, working a method four times in four languages is the single fastest way to internalize what’s mechanical, what’s conceptual, and what’s package-specific.

How this site is built

Quarto book with KaTeX math, copy-to-clipboard code, and a hand-built SCSS theme using Source Serif 4, Inter, and JetBrains Mono. Interactive demos run client-side in Observable JS. Source is on GitHub; pull requests and corrections are welcome.

Sister sites: Inference Lab (applied causal inference), Macro Prep (intermediate macroeconomics with live FRED data), and the main hub (research, datasets, writing).