01 When Can Machines Learn? (The Learning Problem)

What is Machine Learning

data $\rightarrow$ ML $\rightarrow$ improved performance measure
machine learning: improving some performance measure with experience computed from data.
e.g. Tree Recognition

Key Essence of ML:

  • there exists some ‘underlying pattern’ to be learned, so the ‘performance measure’ can be improved.
  • but there is no programmable (easy) definition of that pattern, so ‘ML’ is needed.
  • somehow there is data about the pattern, so ML has some ‘inputs’ to learn from.

Components of Learning: Metaphor Using Credit Approval

Applicant Information

unknown pattern to be learned: ‘is approving a credit card good for the bank?’


Formalize the Learning Problem:

Basic Notations:

  • input: $x \in X$ (customer application)
  • output: $y \in Y$ (good/bad after approving credit card)
  • unknown pattern to be learned $\Leftrightarrow$ target function: $f: X\rightarrow Y$ (ideal credit approval formula)
  • data $\Leftrightarrow$ training examples: $D = \{(x_1,y_1),(x_2,y_2),\ldots,(x_N,y_N)\}$ (historical records in bank)
  • hypothesis $\Leftrightarrow$ skill with hopefully good performance: $g: X\rightarrow Y$ (‘learned’ formula to be used)

$\{(x_n, y_n)\}$ from $f$ $\rightarrow$ ML $\rightarrow$ $g$
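
The notation above can be made concrete with a small Python sketch. Everything specific here is an assumption made for illustration: the two features (income, debt), the rule inside `f`, and the helper name `make_training_examples` are all hypothetical. In a real problem $f$ is unknown, and only the examples in $D$ are available.

```python
import random

def f(x):
    """Hypothetical target function f: X -> Y.

    Assumed rule for illustration only: approving is good for the bank (+1)
    when income exceeds debt by more than 20, and bad (-1) otherwise.
    In practice f is unknown and cannot be written down like this.
    """
    income, debt = x
    return 1 if income - debt > 20 else -1

def make_training_examples(n, seed=0):
    """Build D = {(x_1, y_1), ..., (x_N, y_N)} with y_n = f(x_n)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    examples = []
    for _ in range(n):
        x = (rng.uniform(0, 100), rng.uniform(0, 100))  # (income, debt)
        examples.append((x, f(x)))
    return examples

D = make_training_examples(5)
for x, y in D:
    print(x, y)
```

The learner only ever sees the pairs in `D`; the body of `f` plays the role of the bank's hidden "ideal formula".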


Practical Definition of Machine Learning

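machine learning: use data to compute a hypothesis $g$ that approximates the target $f$.
The learning algorithm $A$ takes the data $D$ and the hypothesis set $H$ (the candidate formulas) and outputs a hypothesis $g \in H$.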
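
As a minimal sketch of "$A$ takes $D$ and $H$ to get $g$", the Python below reads $A$ as a learning algorithm and $H$ as a finite set of candidate hypotheses. The single numeric feature, the threshold rules, the toy data, and the choice of "lowest error on $D$" as the selection rule are all assumptions for illustration, not a method prescribed by these notes.

```python
def make_threshold_hypothesis(threshold):
    """Return a hypothesis h: X -> Y that approves (+1) when x > threshold."""
    def h(x):
        return 1 if x > threshold else -1
    return h

def training_error(h, D):
    """Fraction of training examples in D that h labels incorrectly."""
    return sum(1 for x, y in D if h(x) != y) / len(D)

def A(D, H):
    """Assumed learning algorithm: pick the hypothesis in H with the lowest error on D."""
    return min(H, key=lambda h: training_error(h, D))

# Toy data D: (feature value, label), e.g. a hypothetical credit score.
D = [(12, -1), (25, -1), (38, -1), (47, 1), (60, 1), (83, 1)]

# Hypothesis set H: a few candidate threshold rules.
H = [make_threshold_hypothesis(t) for t in (10, 30, 40, 70)]

g = A(D, H)                   # the 'learned' formula
print(training_error(g, D))   # error of g on the training examples
```

Here `g` agrees with every example in `D`, but that alone does not show that `g` approximates $f$ on inputs outside `D`.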


Machine Learning and Data Mining / Artificial Intelligence / Statistics

  • data mining: use (huge) data to find properties that are interesting.
  • artificial intelligence: compute something that shows intelligent behavior.
  • statistics: use data to make inferences about an unknown process.