What is Machine Learning
data $\rightarrow$ ML $\rightarrow$ improved performance measure
machine learning: improving some performance measure with experience computed from data.
e.g. Tree Recognition
Key Essence of ML:
- there exists some ‘underlying pattern’ to be learned, so the ‘performance measure’ can be improved.
- but there is no easily programmable definition of it, so ‘ML’ is needed.
- there is data about the pattern, so ML has some ‘inputs’ to learn from.
Components of Learning: Metaphor Using Credit Approval
Applicant Information
unknown pattern to be learned: ‘is approving a credit card for this applicant good for the bank?’
Formalize the Learning Problem:
Basic Notations:
- input: $x \in X$ (customer application)
- output: $y \in Y$ (good/bad after approving credit card)
- unknown pattern to be learned $\Leftrightarrow$ target function: $f: X\rightarrow Y$ (ideal credit approval formula)
- data $\Leftrightarrow$ training examples: $D = \{(x_1,y_1),(x_2,y_2),\ldots,(x_N,y_N)\}$ (historical records in bank)
- hypothesis $\Leftrightarrow$ skill with hopefully good performance: $g: X\rightarrow Y$ (‘learned’ formula to be used)
$\{(x_n, y_n)\}$ from $f$ $\rightarrow$ ML $\rightarrow$ $g$
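To make the notation concrete, here is a minimal Python sketch of a toy credit approval setting. Everything specific in it is an assumption for illustration only: the single input feature (an income-like number), the threshold rule standing in for the unknown target $f$, and the names `f` and `make_dataset`. In a real problem $f$ is never available; it is written down here only so that labelled examples can be generated.

```python
import random

# Hypothetical stand-in for the unknown target f: X -> Y.
# X is a single number (say, income in thousands); Y is {+1, -1}.
def f(x):
    return +1 if x >= 50.0 else -1   # +1 = good customer, -1 = bad customer

def make_dataset(n, seed=0):
    """Build D = [(x_1, y_1), ..., (x_N, y_N)] with y_n = f(x_n)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 100.0) for _ in range(n)]
    return [(x, f(x)) for x in xs]

# D plays the role of the bank's historical records.
D = make_dataset(10)
```

The learner only ever sees `D`; the goal is to recover a formula that behaves like `f` without reading its definition.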
Practical Definition of Machine Learning
machine learning: use data to compute hypothesis $g$ that approximates target $f$.
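The learning algorithm $A$ takes the training data $D$ and a hypothesis set $H$ of candidate formulas, and selects from $H$ the final hypothesis $g$.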
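The roles of $A$, $D$, $H$ and $g$ can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not a definitive learning algorithm: the hypothesis set is assumed to be a handful of threshold rules on one numeric feature, the records in `D` are made up, and `A` simply returns the hypothesis with the lowest error on `D`. The names `make_hypothesis` and `training_error` are hypothetical helpers.

```python
def make_hypothesis(threshold):
    """One candidate formula: approve (+1) when x reaches the threshold."""
    def h(x):
        return +1 if x >= threshold else -1
    return h

# Hypothesis set H: threshold rules at 0, 10, ..., 100 (an assumed choice).
H = [make_hypothesis(t) for t in range(0, 101, 10)]

def training_error(h, D):
    """Fraction of examples in D that h labels incorrectly."""
    return sum(1 for x, y in D if h(x) != y) / len(D)

def A(D, H):
    """Learning algorithm: pick the hypothesis in H with the lowest error on D."""
    return min(H, key=lambda h: training_error(h, D))

# Made-up training examples (x_n, y_n), standing in for historical records.
D = [(12.0, -1), (35.0, -1), (48.0, -1), (55.0, +1), (71.0, +1), (90.0, +1)]

g = A(D, H)   # the 'learned' formula to be used on new applicants
```

Here `g` is chosen purely from its agreement with `D`; whether it also approximates $f$ on unseen applicants is a separate question that this sketch does not address.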
Machine Learning and Data Mining / Artificial Intelligence / Statistics
- data mining: use (huge) data to find properties that are interesting.
- artificial intelligence: compute something that shows intelligent behavior.
- statistics: use data to make inferences about an unknown process.