Occam’s Razor

The simplest model that fits the data is also the most plausible.

Simple Model
simple hypothesis $h$: small $\Omega(h)$, specified by few parameters.
simple model $H$: small $\Omega(H)$, contains a small number of hypotheses.
small $\Omega(h) \Leftarrow$ small $\Omega(H)$: if $H$ contains only $2^\ell$ hypotheses, each $h$ can be specified by $\ell$ bits.
simple: small hypothesis/model complexity.

Simple is Better
in addition to the math proof that you have seen, philosophically:
simple $H$
$\Rightarrow$ smaller $m_H(N)$
$\Rightarrow$ less 'likely' to fit the data perfectly: at most a fraction $\frac{m_H(N)}{2^N}$ of the $2^N$ possible labelings of $N$ points can be fit
$\Rightarrow$ more significant when a fit happens
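
To make the ratio $\frac{m_H(N)}{2^N}$ concrete, here is a minimal sketch. It assumes two illustrative growth functions that this section does not define: positive rays with $m_H(N) = N + 1$, and a model that shatters every $N$ points with $m_H(N) = 2^N$. The function names are hypothetical.

```python
def positive_rays(n):
    """Assumed growth function of positive rays on n points: n + 1."""
    return n + 1


def shattering_model(n):
    """Assumed growth function of a model that shatters any n points: 2^n."""
    return 2 ** n


def fit_fraction(growth, n):
    """Fraction of the 2^n labelings of n points that the model can fit."""
    return growth(n) / 2 ** n


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, fit_fraction(positive_rays, n), fit_fraction(shattering_model, n))
```

For the simple model the fraction shrinks quickly as $N$ grows, so a perfect fit is informative; for the shattering model the fraction stays at 1, so a perfect fit says nothing about the data.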

Sampling Bias

If the data is sampled in a biased way, learning will produce a similarly biased outcome.
$\Rightarrow$ match the test scenario as much as possible.
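
One way to match the test scenario can be sketched as follows, under the assumption (not stated in this section) that the model will be tested on records that come later in time than the training records. In that case the validation set is taken from the latest records rather than sampled uniformly. The helper name `latest_split` and the record layout are hypothetical.

```python
def latest_split(records, n_val, key):
    """Split records so that the n_val latest ones (by key) form the validation set.

    This mimics a test scenario in which predictions are made on data
    that arrives after the training data.
    """
    if not 0 < n_val < len(records):
        raise ValueError("n_val must be between 1 and len(records) - 1")
    ordered = sorted(records, key=key)
    cut = len(ordered) - n_val
    return ordered[:cut], ordered[cut:]


if __name__ == "__main__":
    # Hypothetical records: (timestamp, label)
    data = [(3, "c"), (1, "a"), (5, "e"), (2, "b"), (4, "d")]
    train, val = latest_split(data, n_val=2, key=lambda r: r[0])
    print(train)
    print(val)
```

A uniformly random split would instead validate on records from the same time range as the training data, which can look better than the real test scenario.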

Data Snooping

If a data set has affected any step in the learning process, its ability to assess the outcome has been compromised.
reserve validation data and use it cautiously.
avoid making modeling decisions by looking at the data.
interpret research results with a proper sense of contamination.
keep a careful balance between data-driven modeling (snooping) and validation (no snooping).
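
The effect of snooping can be illustrated with a small simulation. This is a sketch under stated assumptions, not a general result: the labels are pure coin flips, each "hypothesis" is just a random guess for every point, and the best of many hypotheses is picked on a small selection set and then scored on reserved data that played no role in the choice. All names and sizes are hypothetical.

```python
import random


def snooping_demo(n_select=50, n_reserved=2000, n_hypotheses=500, seed=0):
    """Return (accuracy on the selection set, accuracy on reserved data)
    for the hypothesis that looks best on the selection set."""
    rng = random.Random(seed)
    n_total = n_select + n_reserved
    # Coin-flip labels: no hypothesis can truly do better than 50%.
    labels = [rng.getrandbits(1) for _ in range(n_total)]

    def accuracy(pred, lo, hi):
        hits = sum(pred[i] == labels[i] for i in range(lo, hi))
        return hits / (hi - lo)

    best_pred, best_acc = None, -1.0
    for _ in range(n_hypotheses):
        pred = [rng.getrandbits(1) for _ in range(n_total)]
        acc = accuracy(pred, 0, n_select)  # the selection data is used here
        if acc > best_acc:
            best_pred, best_acc = pred, acc

    # The reserved data never influenced the choice, so this estimate is honest.
    reserved_acc = accuracy(best_pred, n_select, n_total)
    return best_acc, reserved_acc


if __name__ == "__main__":
    selected, reserved = snooping_demo()
    print("accuracy on data used for selection:", selected)
    print("accuracy on reserved data:", reserved)
```

The accuracy measured on the data that drove the selection is optimistic, while the reserved data gives an estimate close to the true 50%.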

Power of Three

Three related fields / three bounds / three models / three tools / three principles
Three future directions: more transform / more regularization / less label