It has been a while since I last maintained this page. Use at your own peril.
Here is a machine-learning-to-statistics translator I try to maintain. Most concepts are not exactly identical, so the “translations” should be taken as approximate. It is based on the same idea in “All of Statistics” by Larry Wasserman [p. 15]. Many thanks to Ohad Shamir and Saharon Rosset for helping populate the table. Despite my best efforts, it probably still contains errors; drop me a note if you find any.
| Concept | Statistics | Machine Learning |
|---|---|---|
| assumed model | parameter space | hypothesis class |
| | model | hypothesis |
| | misspecified model | agnostic learning |
| | deterministic outcome | realizable learning |
| | sampling distribution | generative model |
| output range | – | improper learning |
| model types | CART | decision tree, axis-parallel rectangles |
| | piecewise constant function | decision list |
| | Bayes net | directed acyclic graph (of conditional probabilities) |
| | latent variable model | model-based collaborative filtering |
| | neighborhood methods | memory-based collaborative filtering |
| | multivariate distribution | graphical model |
| | – | kCNF |
| | – | kDNF |
| | – | k-clause CNF |
| | – | k-term DNF |
| | – | CNF |
| | – | DNF |
| | – | Boolean formula |
| | – | Boolean threshold function |
| | – | Boolean circuit |
| | – | threshold circuit |
| | – | acyclic finite automata |
| tasks / problem setup | estimation | learning |
| | classification | supervised learning |
| | clustering | unsupervised learning |
| | – | transductive learning |
| | frequentist inference | – |
| | – | semi-supervised learning |
| | support estimation | manifold learning |
| | hypothesis | – |
| | fixed design | conditional model, discriminative model |
| | random design | generative model |
| | adaptive design of experiments | active learning |
| | MANOVA, vector regression | structured learning |
| | basis augmentation | feature creation |
| | missing data imputation | collaborative filtering |
| | statistical process control | semi-supervised novelty detection |
| data | data, sample, observations | examples, training sample, instances |
| | – | validation sample |
| | – | test sample |
| | covariates, design, \(X\)-matrix | features, attributes |
| methods | M-estimation | empirical risk minimization (see the sketch after the table) |
| | R-estimation | – |
| | L-estimation | – |
| | moment matching | – |
| | quantile matching | – |
| | U-estimation, V-estimation | generative unsupervised RKHS learning |
| | K-estimation | – |
| | Fisher’s LDA (assuming independence) | Gaussian naive Bayes |
| interval methods | confidence intervals | PAC learnable |
| | credible interval | PAC-Bayes learnable |
| | fiducial interval | – |
| | prediction interval | – |
| error decomposition (see the note after the table) | misspecification error | approximation error |
| | risk | estimation error, expected prediction error, test error |
| | – | optimization error |
| | optimism | test error minus training error |
| | RSS | empirical risk, training error |
| | Jackknife | hypothesis stability |
| | model selection | structural learning |
| problem complexity measures | generalized degrees of freedom | Rademacher complexity |
| | sample size | sample complexity |
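
To make the “M-estimation ↔ empirical risk minimization” and “RSS ↔ empirical risk, training error” rows a bit more concrete, here is a minimal Python sketch. It assumes a linear hypothesis class and squared loss purely for illustration, and the function names are mine rather than standard terminology.

```python
import numpy as np

def empirical_risk(beta, X, y):
    """Average squared loss of the linear predictor X @ beta on the training sample.
    In statistics-speak this is RSS / n; in ML-speak, the training error."""
    return np.mean((y - X @ beta) ** 2)

def erm(X, y):
    """Empirical risk minimization over the class of linear predictors.
    With squared loss this is ordinary least squares, i.e. an M-estimator."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta_hat

# Toy data: n observations ("examples", "instances"), p covariates ("features").
rng = np.random.default_rng(0)
n, p = 200, 3
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

beta_hat = erm(X, y)
print("training error (empirical risk):", empirical_risk(beta_hat, X, y))
```

Swapping the loss and the hypothesis class for other choices turns the same M-estimation template into other empirical risk minimizers.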
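
The “error decomposition” rows can also be written out in the usual learning-theoretic notation. This is only a sketch, and the symbols below are my own shorthand rather than anything in the table: write \(R(\cdot)\) for the risk, \(f^*\) for the best possible predictor, \(f_{\mathcal{H}}\) for the best predictor in the hypothesis class \(\mathcal{H}\), \(\hat{f}\) for the empirical risk minimizer, and \(\tilde{f}\) for the hypothesis the optimizer actually returns. Then

\[
R(\tilde{f}) - R(f^*) =
\underbrace{R(f_{\mathcal{H}}) - R(f^*)}_{\text{approximation (misspecification) error}} +
\underbrace{R(\hat{f}) - R(f_{\mathcal{H}})}_{\text{estimation error}} +
\underbrace{R(\tilde{f}) - R(\hat{f})}_{\text{optimization error}},
\]

and the table’s “optimism” is the gap between the test error and the training error of \(\hat{f}\).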