What is Machine Learning

data --> ML --> improved performance measure
machine learning: improving some performance measure with experience computed from data.
e.g. Tree Recognition

Key Essence of ML:

there exists some ‘underlying pattern’ to be learned, so the ‘performance measure’ can be improved.
but there is no programmable (easy) definition of it, so ‘ML’ is needed.
somehow there is data about the pattern, so ML has some ‘inputs’ to learn from.

Components of Learning: Metaphor Using Credit Approval

Applicant Information

unknown pattern to be learned: ‘is approving the credit card good for the bank?’

Formalize the Learning Problem:

Basic Notations:

input: $x \in X$ (customer application)

output: $y \in Y$ (good/bad after approving credit card)

unknown pattern to be learned $\Leftrightarrow$ target function: $f: X\rightarrow Y$ (ideal credit approval formula)

data $\Leftrightarrow$ training examples: $D = \{(x_1,y_1),(x_2,y_2),\ldots,(x_N,y_N)\}$ (historical records in bank)

hypothesis $\Leftrightarrow$ skill with hopefully good performance: $g: X\rightarrow Y$ (‘learned’ formula to be used)

$\{(x_n, y_n)\}$ from $f$ --> ML --> $g$
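
The notation above can be made concrete with a toy sketch. Everything specific here is an assumption for illustration only: the single income feature, the threshold of 50, and the helper name `make_dataset` are hypothetical. In a real problem $f$ is unknown and only the examples in $D$ are observed.

```python
import random
from typing import List, Tuple

# Assumption: each input x is one number (say, annual income in thousands)
# and each output y is +1 (good customer) or -1 (bad customer).
Example = Tuple[float, int]


def f(x: float) -> int:
    """Hypothetical target function; the learner never sees this directly."""
    return 1 if x >= 50.0 else -1


def make_dataset(n: int, seed: int = 0) -> List[Example]:
    """Build D = {(x_1, y_1), ..., (x_N, y_N)} with y_n = f(x_n)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 100.0) for _ in range(n)]
    return [(x, f(x)) for x in xs]


# The "historical records in bank": N = 5 labelled examples.
D = make_dataset(5)
for x_n, y_n in D:
    print(f"x = {x_n:6.2f}  y = {y_n:+d}")
```

The learner's job is then to produce some $g$ from $D$ alone, without access to the body of `f`.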

Practical Definition of Machine Learning

machine learning: use data to compute hypothesis $g$ that approximates target $f$.
learning algorithm $A$ takes data $D$ and hypothesis set $H$ to get $g$.
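
A minimal sketch of this definition, reading $A$ as the learning algorithm and $H$ as the set of candidate hypotheses. The threshold hypotheses, the toy records in `D`, and the rule "pick the hypothesis with the lowest error on $D$" are all assumptions chosen for illustration; they are one simple possibility, not a method prescribed by these notes.

```python
from typing import Callable, List, Tuple

Example = Tuple[float, int]            # (x, y) with y in {+1, -1}
Hypothesis = Callable[[float], int]    # a candidate formula h: X -> Y


def make_threshold(t: float) -> Hypothesis:
    """Hypothetical hypothesis: approve (+1) when x >= t, else reject (-1)."""
    return lambda x: 1 if x >= t else -1


def training_error(h: Hypothesis, D: List[Example]) -> float:
    """Fraction of examples in D on which h disagrees with the recorded label."""
    return sum(h(x) != y for x, y in D) / len(D)


def A(D: List[Example], H: List[Hypothesis]) -> Hypothesis:
    """Toy learning algorithm: return the hypothesis in H with lowest error on D."""
    return min(H, key=lambda h: training_error(h, D))


# Made-up historical records (x = income in thousands, y = good/bad).
D = [(20.0, -1), (35.0, -1), (48.0, -1), (55.0, 1), (70.0, 1), (90.0, 1)]

# Hypothesis set H: five candidate thresholds.
H = [make_threshold(t) for t in (10.0, 30.0, 50.0, 70.0, 90.0)]

g = A(D, H)
print(training_error(g, D))  # 0.0: the threshold-50 hypothesis fits every record
print(g(60.0), g(40.0))      # 1 -1
```

Low error on $D$ only shows that $g$ agrees with $f$ on the recorded examples; whether $g$ approximates $f$ on new inputs is a separate question.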

Machine Learning and Data Mining / Artificial Intelligence / Statistics

data mining: use (huge) data to find properties that are interesting.
artificial intelligence: compute something that shows intelligent behavior.
statistics: use data to make inferences about an unknown process.