Occam’s Razor
The simplest model that fits the data is also the most plausible.
Simple Model
Simple hypothesis $h$: small $\Omega(h)$, specified by few parameters.
Simple model $H$: small $\Omega(H)$, contains a small number of hypotheses.
Connection: small $\Omega(h) \Leftarrow$ small $\Omega(H)$, since picking one hypothesis out of a small set takes little information.
Simple: small hypothesis/model complexity.
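The connection between the two notions of complexity can be sketched numerically. The sketch assumes a finite $H$ whose hypotheses are indexed in binary; that encoding, and the helper name `bits_to_specify`, are illustrative assumptions and not part of the notes.

```python
def bits_to_specify(num_hypotheses: int) -> int:
    """Bits needed to index one hypothesis in a finite set of the given size.

    Assumption of this sketch: hypotheses are numbered 0 .. num_hypotheses - 1
    and written in binary, so a smaller set means a shorter description.
    """
    if num_hypotheses < 1:
        raise ValueError("the hypothesis set must be non-empty")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, using exact integers.
    return (num_hypotheses - 1).bit_length()


small_model_bits = bits_to_specify(2 ** 10)   # a small H
large_model_bits = bits_to_specify(2 ** 30)   # a much larger H
print(small_model_bits, large_model_bits)
```

Under this encoding, a hypothesis from the smaller set needs fewer bits, which is one way to read small $\Omega(H) \Rightarrow$ small $\Omega(h)$.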
Simple is Better
In addition to the mathematical proof seen earlier, a philosophical argument:
simple $H$
$\Rightarrow$ smaller growth function $m_H(N)$
$\Rightarrow$ less ‘likely’ to fit the data perfectly: at most a fraction $\frac{m_H(N)}{2^N}$ of the $2^N$ possible labelings can be fit
$\Rightarrow$ more significant when a fit does happen
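To make the ratio $\frac{m_H(N)}{2^N}$ concrete, here is a minimal sketch using positive rays on the real line, for which $m_H(N) = N + 1$. The choice of positive rays and the function names are assumptions made for illustration.

```python
def positive_rays_growth(n: int) -> int:
    """Growth function of positive rays on the line: m_H(N) = N + 1."""
    return n + 1


def perfect_fit_fraction(growth: int, n: int) -> float:
    """Fraction of the 2^N possible labelings of N points that H can fit."""
    return growth / 2 ** n


for n in (5, 10, 20):
    fraction = perfect_fit_fraction(positive_rays_growth(n), n)
    print(f"N={n:2d}  m_H(N)={positive_rays_growth(n):2d}  fraction={fraction:.2e}")
```

For $N = 10$ the fraction is $11/1024$, roughly 0.011, so a perfect fit by such a simple model would be unlikely if the labels carried no pattern.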
Sampling Bias
If the data is sampled in a biased way, learning will produce a similarly biased outcome.
$\Rightarrow$ Match the test scenario as closely as possible.
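A toy simulation can illustrate the effect. Everything here is an assumption of the sketch: the target is a number drawn uniformly from $[0, 1]$, the "model" is a single constant fit by the sample mean, and the biased sample keeps only values above 0.5.

```python
import random


def mean(values):
    return sum(values) / len(values)


def mse_of_constant(c, values):
    """Mean squared error of always predicting the constant c."""
    return mean([(v - c) ** 2 for v in values])


rng = random.Random(0)  # fixed seed so the run is repeatable

# The test scenario: values drawn uniformly from [0, 1].
population = [rng.random() for _ in range(10_000)]

# Training sample that matches the test scenario.
matched_train = rng.sample(population, 1000)
# Biased training sample: only the upper half is ever observed.
biased_train = [v for v in population if v > 0.5][:1000]

matched_model = mean(matched_train)
biased_model = mean(biased_train)

matched_err = mse_of_constant(matched_model, population)
biased_err = mse_of_constant(biased_model, population)
print(f"matched: model={matched_model:.3f}  test MSE={matched_err:.3f}")
print(f"biased:  model={biased_model:.3f}  test MSE={biased_err:.3f}")
```

The model trained on the biased sample settles near the center of what it saw, not the center of the test scenario, and pays for it in test error.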
Data Snooping
If a data set has affected any step in the learning process, its ability to assess the outcome has been compromised.
Reserve validation data and use it cautiously.
Avoid making modeling decisions by looking at the data.
Interpret research results with a proper sense of how contaminated they may be.
Keep a careful balance between data-driven modeling (snooping) and validation (no snooping).
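The following sketch shows how snooping inflates an estimate. The setup is hypothetical: labels are pure coin flips, each hypothesis is "predict feature $j$" for one of 200 random binary features, and the best feature is chosen on the same 40 points that are then used to score it.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable


def make_data(n, d):
    """n points with d random +/-1 features and coin-flip labels (pure noise)."""
    xs = [[rng.choice((-1, 1)) for _ in range(d)] for _ in range(n)]
    ys = [rng.choice((-1, 1)) for _ in range(n)]
    return xs, ys


def accuracy(j, xs, ys):
    """Accuracy of the hypothesis 'predict feature j' on (xs, ys)."""
    return sum(x[j] == y for x, y in zip(xs, ys)) / len(ys)


D = 200
train_x, train_y = make_data(40, D)     # data used for the modeling decision
fresh_x, fresh_y = make_data(2000, D)   # data that never influenced any choice

# Snooping: choose the feature that looks best on the training data ...
best_j = max(range(D), key=lambda j: accuracy(j, train_x, train_y))
# ... and then score it on the very data that chose it.
snooped_acc = accuracy(best_j, train_x, train_y)
fresh_acc = accuracy(best_j, fresh_x, fresh_y)
print(f"accuracy on snooped data: {snooped_acc:.2f}")
print(f"accuracy on fresh data:   {fresh_acc:.2f}")
```

Since the labels are noise, no feature can truly beat chance; the gap between the two numbers is the contamination that reserved data is meant to expose.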
Power of Three
Three each of: Relatives (related fields) / Bounds / Models / Tools / Principles
More Transform / More Regularization / Less Label