Notes for NLP
RNN:https://zhuanlan.zhihu.com/p/28054589
DTW:https://blog.csdn.net/niyanghuahao/article/details/78612157
GMM HMM:https://blog.csdn.net/abcjennifer/article/details/27346787
LSTM:https://www.jianshu.com/p/9dc9f41f0b29
CTC:https://www.cnblogs.com/qcloud1001/p/9041218.html
Paper:The Unreasonable Effectiveness of Recurrent Neural Networks
An Intuitive Explanation of CNN (Chinese translation: CNN的直觀解釋)
General Method of NN Prediction
What distinguishes an NN from a simple linear combination: the nonlinear activation function
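A quick numerical check of the point about nonlinear activations. This is a minimal sketch: the layer sizes, the random weights, and the choice of tanh are assumptions for illustration only. Two stacked layers with no activation collapse into one linear map, while putting tanh between them breaks that equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical layers: 3 inputs -> 4 hidden units -> 2 outputs.
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
x = rng.normal(size=(3,))

# Without an activation, the two layers equal the single linear layer W2 @ W1.
stacked_linear = W2 @ (W1 @ x)
single_linear = (W2 @ W1) @ x
linear_collapses = bool(np.allclose(stacked_linear, single_linear))

# With tanh between the layers, that single matrix no longer reproduces the output.
stacked_nonlinear = W2 @ np.tanh(W1 @ x)
nonlinear_differs = not np.allclose(stacked_nonlinear, single_linear)

print(linear_collapses, nonlinear_differs)
```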
Step1 Random Initialization
Initialize the weights and biases to random numbers.
Step2 Optimization of Cost function
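Feed in all the training data and find the optimal weights and biases by running gradient descent on the cost function J:
J(W, b) = (1/m) * sum_{i=1..m} L(y^hat_i, y_i)
Note: L denotes the loss function, m denotes the number of training examples, y^hat_i is the prediction for the i-th example and y_i is its label.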
Step3 Prediction
Predict on the validation set using the learned weights and biases.
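The three steps above can be sketched for the simplest possible network, a single sigmoid unit. This is a minimal sketch, not the notes' own implementation: the toy data, the choice of cross-entropy as the loss L, the learning rate, and all function names are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(w, b, X, y):
    """J(w, b): mean of the per-example loss L (cross-entropy assumed here)."""
    p = sigmoid(X @ w + b)
    eps = 1e-12  # avoids log(0)
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def train(X, y, lr=0.5, steps=500, seed=0):
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    # Step 1: random initialization of weights and bias.
    w = rng.normal(scale=0.01, size=X.shape[1])
    b = rng.normal(scale=0.01)
    history = []
    # Step 2: gradient descent on the cost function J.
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        grad_w = X.T @ (p - y) / m  # dJ/dw
        grad_b = np.mean(p - y)     # dJ/db
        w = w - lr * grad_w
        b = b - lr * grad_b
        history.append(cost(w, b, X, y))
    return w, b, history

def predict(w, b, X):
    # Step 3: prediction with the learned parameters.
    return (sigmoid(X @ w + b) >= 0.5).astype(int)

# Hypothetical toy data: the label is 1 when x1 + x2 > 0.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
X_train, y_train = X[:150], y[:150]
X_val, y_val = X[150:], y[150:]

w, b, history = train(X_train, y_train)
val_accuracy = np.mean(predict(w, b, X_val) == y_val)
print(f"final cost {history[-1]:.3f}, validation accuracy {val_accuracy:.2f}")
```

A real network has more layers and needs backpropagation to get the gradients, but the loop keeps the same shape: initialize, descend on J, then predict on held-out data.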
Logistic and Softmax regression
Logistic Regression->binary classification:
For a linear division, the decision boundary E(X) can be written as:
E(X) = W^T X
and then we have:
E(X)>0-->class A, E(X)<0-->class B
where W is the learned weight parameter, X is the input feature vector, and h(X) = 1 / (1 + exp(-E(X))) is the hypothesis function (e.g. the probability of class A; the probability of class B is then 1-h(X)).
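The decision rule can be checked numerically. This is a sketch under stated assumptions: E(X) is taken to be the linear form W^T X with no separate bias term, h(X) is taken to be the sigmoid of E(X), and the weights and inputs are made-up values.

```python
import numpy as np

def decision_value(W, X):
    """E(X) = W^T X for each row of X (assumed linear form, no bias term)."""
    return X @ W

def hypothesis(W, X):
    """h(X): sigmoid of E(X), read as the probability of class A."""
    return 1.0 / (1.0 + np.exp(-decision_value(W, X)))

def classify(W, X):
    # E(X) > 0 -> class A, otherwise class B.
    # The notes leave E(X) == 0 unspecified; it falls to B here.
    return np.where(decision_value(W, X) > 0, "A", "B")

W = np.array([1.0, -2.0])               # hypothetical learned weights
X = np.array([[3.0, 1.0], [0.0, 1.0]])  # two hypothetical inputs

E = decision_value(W, X)  # [1.0, -2.0]
h = hypothesis(W, X)
labels = classify(W, X)
print(E, h, labels)
```

Because the sigmoid is above 0.5 exactly when its argument is positive, thresholding h(X) at 0.5 and thresholding E(X) at 0 give the same classes.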
Softmax Regression->multiple classification:
Let z^[L] denote the input to the final softmax layer, which is also the output of the layer before it.
Following the example above, assume 4 classes in total.
We apply an exponential activation function and then normalize the result, which means:
a_i = exp(z_i) / sum_{j=1..4} exp(z_j), where z_i is the i-th element of z^[L]
With "Hardmax", the largest element of the output is set to 1 and all the others to 0; "Softmax" is the soft counterpart that outputs probabilities instead, which is the origin of its name.
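Softmax and hardmax can be compared side by side. This is a minimal sketch: the 4-class vector z is a made-up value standing in for z^[L], and the subtraction of the maximum is a standard numerical-stability step that does not change the result.

```python
import numpy as np

def softmax(z):
    """Exponentiate, then normalize so the outputs sum to 1."""
    shifted = z - np.max(z)  # stability only; cancels out after normalization
    e = np.exp(shifted)
    return e / np.sum(e)

def hardmax(z):
    """Set the largest element to 1 and all the others to 0."""
    out = np.zeros_like(z)
    out[np.argmax(z)] = 1.0
    return out

z = np.array([5.0, 2.0, -1.0, 3.0])  # hypothetical z^[L] for 4 classes

a = softmax(z)   # approximately [0.842, 0.042, 0.002, 0.114]
h = hardmax(z)   # [1, 0, 0, 0]
print(a, h)
```

Both pick out the same winning class; softmax additionally keeps how confident the network is about each alternative.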
Loss function of Softmax
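L(y^hat, y) = -sum_{j} y_j * log(y^hat_j)
Since y is one-hot, only the true class contributes, so the loss gets smaller as y^hat for the true class gets larger.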
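The relation between the loss and y^hat can be seen with two made-up predictions. This sketch assumes the softmax loss is the usual cross-entropy, L(y^hat, y) = -sum_j y_j * log(y^hat_j), with a one-hot label y; the probability vectors are illustrative values only.

```python
import numpy as np

def cross_entropy(y_hat, y):
    """L(y_hat, y) = -sum_j y_j * log(y_hat_j), with y one-hot."""
    eps = 1e-12  # avoids log(0)
    return -np.sum(y * np.log(y_hat + eps))

y = np.array([0.0, 1.0, 0.0, 0.0])          # true class is index 1

confident = np.array([0.1, 0.7, 0.1, 0.1])  # high probability on the true class
unsure = np.array([0.3, 0.2, 0.4, 0.1])     # low probability on the true class

loss_confident = cross_entropy(confident, y)  # -log(0.7), about 0.357
loss_unsure = cross_entropy(unsure, y)        # -log(0.2), about 1.609
print(loss_confident, loss_unsure)
```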
Basic structure of a CNN
Input -> (Conv and pooling) * n1 -> FC * n2 -> (softmax) classifier (probabilities of n classes)
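To make the pipeline concrete, the tensor shape can be traced through each stage. This is only a sketch: the helper names, the single conv block (16 filters, 5*5 kernel, 2*2 pooling), the two 400-unit FC layers, and the 4 classes are hypothetical choices, and the convolutions are assumed to use stride 1 with no padding.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Output length along one spatial axis for a conv or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_cnn(height, width, channels, conv_blocks, fc_units, n_classes):
    """Return (layer name, shape) pairs for Input -> (Conv, pool)*n1 -> FC*n2 -> softmax."""
    shapes = [("input", (height, width, channels))]
    for i, (filters, kernel, pool) in enumerate(conv_blocks, start=1):
        # Convolution: stride 1, no padding (assumed).
        height = conv_output_size(height, kernel)
        width = conv_output_size(width, kernel)
        channels = filters
        shapes.append((f"conv{i}", (height, width, channels)))
        # Pooling: non-overlapping windows, so stride equals the pool size.
        height = conv_output_size(height, pool, stride=pool)
        width = conv_output_size(width, pool, stride=pool)
        shapes.append((f"pool{i}", (height, width, channels)))
    shapes.append(("flatten", (height * width * channels,)))
    for i, units in enumerate(fc_units, start=1):
        shapes.append((f"fc{i}", (units,)))
    shapes.append(("softmax", (n_classes,)))
    return shapes

shapes = trace_cnn(14, 14, 3, conv_blocks=[(16, 5, 2)], fc_units=[400, 400], n_classes=4)
for name, shape in shapes:
    print(name, shape)
```

With these values, n1 = 1 and n2 = 2, and the final softmax layer emits one probability per class.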
Implementation of the sliding-window algorithm
Example: a CNN trained on 14*14*3 inputs (the shape of layer 1, i.e. the size of each training image), with trained classes such as pedestrian, bicycle, vehicle, ...
For object detection, the input image is fed into the trained CNN through a sliding window of shape 14*14*3 with stride = 2.
The input image has shape 16(=14+2)*16*3.
The result matrix of shape 2*2*4 gives the probabilities of each class for each of the 2*2 window positions in the 16*16*3 input image.
For example (columns: pedestrian, bicycle, vehicle, bird; one row per window position):
[[0.01 0.01 0.4 0]
[0.01 0.01 0.3 0]
[0.01 0.01 0.8 0]
[0.01 0.01 0.2 0]]
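The window arithmetic in this example can be sketched as an explicit loop. The function names are hypothetical and `dummy_classifier` is only a stand-in for the trained CNN (it returns uniform probabilities); what the sketch is meant to show is how a 16*16*3 image, a 14*14 window and stride 2 produce a 2*2*4 result.

```python
import numpy as np

def sliding_window_scores(image, classify, window=14, stride=2):
    """Run `classify` on every window position and stack the class probabilities."""
    height, width, _ = image.shape
    rows = (height - window) // stride + 1  # (16 - 14) // 2 + 1 = 2
    cols = (width - window) // stride + 1
    scores = []
    for r in range(rows):
        row_scores = []
        for c in range(cols):
            top, left = r * stride, c * stride
            crop = image[top:top + window, left:left + window, :]
            row_scores.append(classify(crop))
        scores.append(row_scores)
    return np.array(scores)

def dummy_classifier(crop, n_classes=4):
    """Hypothetical stand-in for the trained CNN: uniform class probabilities."""
    assert crop.shape == (14, 14, 3)
    return np.full(n_classes, 1.0 / n_classes)

image = np.zeros((16, 16, 3))  # placeholder input image
scores = sliding_window_scores(image, dummy_classifier)
print(scores.shape)  # (2, 2, 4): window rows, window columns, classes
```

Cropping and classifying each window separately repeats most of the computation on the overlapping pixels; the loop is written this way only to make the indexing explicit.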