Linear Regression Algorithm
Linear regression uses squared error:
$E_{in}(w)=\frac{1}{N}\sum_{n=1}^N(w^Tx_n-y_n)^2=\frac{1}{N}\sum_{n=1}^N(x_n^Tw-y_n)^2$
where $w$ is $(d+1)\times 1$ and $x_n$ is $(d+1)\times 1$.
$\Leftrightarrow \frac{1}{N}\left\|Xw-y\right\|^2$, where $X$ is $N\times(d+1)$, $w$ is $(d+1)\times 1$, and $y$ is $N\times 1$.
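As a sanity check, here is a minimal NumPy sketch on synthetic data (the random data, seed, and dimensions are assumptions for illustration) verifying that the per-example sum and the matrix-norm form of $E_{in}$ agree:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, for reproducibility
N, d = 100, 3                   # assumed sizes, for illustration

# X includes the constant feature x_0 = 1, so each x_n is (d+1)-dimensional.
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])
y = rng.normal(size=N)          # arbitrary targets
w = rng.normal(size=d + 1)      # arbitrary weight vector

# Per-example form: (1/N) * sum_n (x_n^T w - y_n)^2
E_sum = np.mean([(X[n] @ w - y[n]) ** 2 for n in range(N)])

# Matrix form: (1/N) * ||Xw - y||^2
E_mat = np.sum((X @ w - y) ** 2) / N

assert np.isclose(E_sum, E_mat)  # the two forms agree
```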
$\min\limits_w E_{in}(w)=\frac{1}{N}\left\|Xw-y\right\|^2$
$E_{in}(w)$ is continuous, differentiable, and convex $\Rightarrow$ a necessary (and, by convexity, also sufficient) condition for the 'best' $w$ is a zero gradient.
TASK: find $w_{LIN}$ such that $\nabla E_{in}(w_{LIN})=0$
$E_{in}(w)=\frac{1}{N}\left\|Xw-y\right\|^2=\frac{1}{N}\left(w^TX^TXw-2w^TX^Ty+y^Ty\right)$
and thus $\nabla E_{in}(w)=\frac{2}{N}\left(X^TXw-X^Ty\right)$. Setting this to zero gives the normal equations $X^TXw=X^Ty$; when $X^TX$ is invertible, $w_{LIN}=(X^TX)^{-1}X^Ty=X^{\dagger}y$, where $X^{\dagger}$ is the pseudo-inverse of $X$.
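A minimal NumPy sketch of this step on synthetic data (the noisy linear data-generating model and sizes are assumptions for illustration): it computes $w_{LIN}$ via the pseudo-inverse and checks that the gradient vanishes there.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, for reproducibility
N, d = 100, 3                   # assumed sizes, for illustration
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])
w_true = rng.normal(size=d + 1)            # hypothetical target weights
y = X @ w_true + 0.1 * rng.normal(size=N)  # noisy linear targets (assumption)

def grad_Ein(w, X, y):
    """Gradient of E_in: (2/N) * (X^T X w - X^T y)."""
    return (2.0 / len(y)) * (X.T @ X @ w - X.T @ y)

# Setting the gradient to zero gives w_LIN = pinv(X) @ y.
w_lin = np.linalg.pinv(X) @ y

# The gradient at w_LIN is zero up to floating-point error.
print(np.linalg.norm(grad_Ein(w_lin, X, y)))
```

Using the pseudo-inverse rather than inverting $X^TX$ directly also handles the case where $X^TX$ is singular.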