Andrew Ng Machine Learning Assignment Series: Table of Contents

1 Multi-class Classification (multiple logistic regressions)

We will extend the logistic regression implementation we wrote in Exercise 2 and apply it to one-vs-all classification (more than two classes).

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.io import loadmat

1.1 Dataset

First, load the dataset. The data is stored in MATLAB format, so we use the loadmat function from scipy.io.

def load_data(path):
    data = loadmat(path)
    X = data['X']
    y = data['y']
    return X,y

X, y = load_data('ex3data1.mat')
print(np.unique(y))  # check how many distinct labels there are
# [ 1  2  3  4  5  6  7  8  9 10]
X.shape, y.shape
# ((5000, 400), (5000, 1))

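There are 5000 training examples, each a 20×20-pixel grayscale image of a handwritten digit. Each pixel is represented by a floating-point number giving the grayscale intensity at that location. The 20×20 grid of pixels is unrolled into a 400-dimensional vector. Each example becomes one row of our data matrix X, which gives a 5000×400 matrix in which every row is one training example of a handwritten digit image.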
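
To make the unrolling concrete, here is a minimal sketch on a made-up 20×20 array (the values are illustrative, not taken from ex3data1.mat). It shows a grid being flattened into a 400-dimensional row and reshaped back.

```python
import numpy as np

# A made-up 20x20 "image" whose pixel values simply count upward.
image = np.arange(400, dtype=float).reshape(20, 20)

# Unroll the grid into a 400-dimensional vector, like one row of X.
row = image.reshape(-1)

# Reshaping the row restores the original grid exactly.
restored = row.reshape(20, 20)

print(row.shape)       # (400,)
print(restored.shape)  # (20, 20)
```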

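The first task is to modify our logistic regression implementation to be fully vectorized (that is, with no "for" loops). Vectorized code is not only more concise; it can also take advantage of optimized linear algebra routines and is usually much faster than iterative code.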
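
As a rough illustration of what "vectorized" means here, the sketch below computes the logistic hypothesis for a handful of made-up samples twice: once with an explicit loop and once with a single matrix-vector product. The names X_demo and theta_demo are hypothetical and exist only for this example.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(5, 3))   # 5 made-up samples with 3 features each
theta_demo = rng.normal(size=3)    # made-up parameter vector

# Iterative version: one dot product per sample.
h_loop = np.array([sigmoid(theta_demo @ x) for x in X_demo])

# Vectorized version: a single matrix-vector product for all samples.
h_vec = sigmoid(X_demo @ theta_demo)

print(np.allclose(h_loop, h_vec))  # True
```

Both versions give the same numbers; the vectorized one leaves the loop to NumPy's compiled routines.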

1.2 Visualizing the data

def plot_an_image(X, y):
    """
    Display one randomly chosen digit and print its label.
    """
    pick_one = np.random.randint(0, X.shape[0])
    image = X[pick_one, :]
    fig, ax = plt.subplots(figsize=(1, 1))
    ax.matshow(image.reshape((20, 20)), cmap='gray_r')
    plt.xticks([])  # hide the ticks for a cleaner look
    plt.yticks([])
    plt.show()
    print('this should be {}'.format(y[pick_one]))

def plot_100_image(X):
    """
    Plot 100 randomly chosen digits in a 10x10 grid.
    """
    sample_idx = np.random.choice(np.arange(X.shape[0]), 100)  # pick 100 random samples (with replacement)
    sample_images = X[sample_idx, :]  # (100, 400)
    
    fig, ax_array = plt.subplots(nrows=10, ncols=10, sharey=True, sharex=True, figsize=(8, 8))

    for row in range(10):
        for column in range(10):
            ax_array[row, column].matshow(sample_images[10 * row + column].reshape((20, 20)),
                                   cmap='gray_r')
    plt.xticks([])
    plt.yticks([])        
    plt.show()

1.3 Vectorizing Logistic Regression

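We will use multiple one-vs-all logistic regression models to build a multi-class classifier. Since there are 10 classes, we need to train 10 separate classifiers. To make training efficient, vectorization is important. In this section we implement a vectorized version of logistic regression that does not use any for loops.

First, prepare the data (the intercept column is added to X and y is flattened in section 1.4, just before training).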

1.3.1 Vectorizing the cost function

We start with the vectorized cost function. Recall that the cost function of regularized logistic regression is:
$J\left( \theta \right)=\frac{1}{m}\sum\limits_{i=1}^{m}{[-{{y}^{(i)}}\log \left( {{h}_{\theta }}\left( {{x}^{(i)}} \right) \right)-\left( 1-{{y}^{(i)}} \right)\log \left( 1-{{h}_{\theta }}\left( {{x}^{(i)}} \right) \right)]}+\frac{\lambda}{2m}\sum^n_{j=1}\theta^2_j$

For each sample $i$ we first need to compute $h_{\theta}(x^{(i)})$, where $h_{\theta}(x^{(i)})=g(\theta^Tx^{(i)})$ and $g(z)=\frac{1}{1+e^{-z}}$ is the sigmoid function.

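In fact, we can compute this quickly for all samples at once with matrix multiplication. Define $X$ and $\theta$ as follows:

$X=\begin{bmatrix} -\left(x^{(1)}\right)^T- \\ -\left(x^{(2)}\right)^T- \\ \vdots \\ -\left(x^{(m)}\right)^T- \end{bmatrix}$ and $\theta=\begin{bmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_n \end{bmatrix}$

Then, by computing the matrix product $X\theta$, we obtain:

$X\theta=\begin{bmatrix} \left(x^{(1)}\right)^T\theta \\ \left(x^{(2)}\right)^T\theta \\ \vdots \\ \left(x^{(m)}\right)^T\theta \end{bmatrix}=\begin{bmatrix} \theta^Tx^{(1)} \\ \theta^Tx^{(2)} \\ \vdots \\ \theta^Tx^{(m)} \end{bmatrix}$

In the last equality we used the fact that $a^Tb=b^Ta$ whenever $a$ and $b$ are vectors. This lets us compute $\theta^Tx^{(i)}$ for all samples in a single line of code.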

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def regularized_cost(theta, X, y, l):
    """
    Regularized logistic regression cost; theta_0 is not penalized.
    args:
        theta: parameter vector, (n+1, )
        X: feature matrix, (m, n+1)  # with the intercept column x0=1 inserted
        y: target vector, (m, )
        l: lambda constant for regularization
    """
    thetaReg = theta[1:]
    h = sigmoid(X @ theta)  # predictions for all m samples at once
    first = -y * np.log(h) - (1 - y) * np.log(1 - h)
    reg = (thetaReg @ thetaReg) * l / (2 * len(X))
    return np.mean(first) + reg
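
A quick sanity check of the cost function can be sketched as follows. It restates the cost so that the block runs on its own and uses made-up data (X_demo and y_demo are hypothetical names). With all parameters at zero every prediction is 0.5, so the cost should equal log(2), and changing only theta_0 should not change the penalty term.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def regularized_cost(theta, X, y, l):
    h = sigmoid(X @ theta)
    first = -y * np.log(h) - (1 - y) * np.log(1 - h)
    reg = (theta[1:] @ theta[1:]) * l / (2 * len(X))
    return np.mean(first) + reg

rng = np.random.default_rng(0)
# Made-up data: intercept column of ones plus 2 random features.
X_demo = np.insert(rng.normal(size=(6, 2)), 0, 1, axis=1)
y_demo = np.array([0, 1, 0, 1, 1, 0])

# With theta = 0 every prediction is 0.5, so the cost is log(2) for any data.
cost_at_zero = regularized_cost(np.zeros(3), X_demo, y_demo, l=1)
print(np.isclose(cost_at_zero, np.log(2)))  # True

# Only theta_0 is non-zero here, so the penalty is zero whatever lambda is.
theta_bias_only = np.array([5.0, 0.0, 0.0])
cost_no_reg = regularized_cost(theta_bias_only, X_demo, y_demo, l=0)
cost_big_reg = regularized_cost(theta_bias_only, X_demo, y_demo, l=100)
print(np.isclose(cost_no_reg, cost_big_reg))  # True
```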

1.3.2 Vectorizing the gradient

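Recall the gradient descent update for the regularized logistic regression cost function. Because $\theta_0$ is not penalized, there are two cases:

$\theta_0:=\theta_0-\alpha\frac{1}{m}\sum\limits_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)x_0^{(i)}$
$\theta_j:=\theta_j-\alpha\left[\frac{1}{m}\sum\limits_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)x_j^{(i)}+\frac{\lambda}{m}\theta_j\right]$ for $j\geq 1$

So the gradient itself is:

$\frac{\partial J(\theta)}{\partial\theta_0}=\frac{1}{m}\sum\limits_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)x_0^{(i)}$
$\frac{\partial J(\theta)}{\partial\theta_j}=\frac{1}{m}\sum\limits_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)x_j^{(i)}+\frac{\lambda}{m}\theta_j$ for $j\geq 1$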

def regularized_gradient(theta, X, y, l):
    """
    don't penalize theta_0
    args:
        l: lambda constant
    return:
        a vector of gradient
    """
    thetaReg = theta[1:]
    first = (1 / len(X)) * X.T @ (sigmoid(X @ theta) - y)
    # prepend a 0 so that theta_0 is not penalized; this keeps the computation vectorized
    reg = np.concatenate([np.array([0]), (l / len(X)) * thetaReg])
    return first + reg
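
One common way to gain confidence in a hand-written gradient is to compare it with a finite-difference approximation of the cost. The sketch below does that on made-up data. The helper numerical_gradient is a hypothetical name introduced for this example, and the cost and gradient are restated so the block runs on its own.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def regularized_cost(theta, X, y, l):
    h = sigmoid(X @ theta)
    first = -y * np.log(h) - (1 - y) * np.log(1 - h)
    reg = (theta[1:] @ theta[1:]) * l / (2 * len(X))
    return np.mean(first) + reg

def regularized_gradient(theta, X, y, l):
    first = X.T @ (sigmoid(X @ theta) - y) / len(X)
    reg = np.concatenate([[0.0], (l / len(X)) * theta[1:]])  # no penalty on theta_0
    return first + reg

def numerical_gradient(f, theta, eps=1e-6):
    """Central-difference approximation of the gradient of f at theta."""
    grad = np.zeros_like(theta)
    for j in range(len(theta)):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(1)
X_demo = np.insert(rng.normal(size=(8, 3)), 0, 1, axis=1)  # made-up data with intercept
y_demo = rng.integers(0, 2, size=8)                        # made-up binary labels
theta_demo = rng.normal(size=4)

analytic = regularized_gradient(theta_demo, X_demo, y_demo, 1.0)
numeric = numerical_gradient(lambda t: regularized_cost(t, X_demo, y_demo, 1.0), theta_demo)
print(np.allclose(analytic, numeric, atol=1e-6))
```

The loop inside numerical_gradient is only for checking; training still uses the vectorized gradient.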

1.4 One-vs-all Classification

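In this part we implement one-vs-all classification by training multiple regularized logistic regression classifiers, one for each of the K classes in the dataset.

For this task we have 10 possible classes, and since logistic regression can only separate two classes at a time, each classifier decides between "class i" and "not class i". We wrap the training in a function that computes the final weights of each of the 10 classifiers and returns them as an array of shape (K, n+1), where n is the number of features (here 400).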

from scipy.optimize import minimize

def one_vs_all(X, y, l, K):
    """generalized logistic regression
    args:
        X: feature matrix, (m, n+1) # with intercept x0=1
        y: target vector, (m, )
        l: lambda constant for regularization
        K: number of labels
    return: trained parameters
    """
    all_theta = np.zeros((K, X.shape[1]))  # (10, 401)
    
    for i in range(1, K+1):
        theta = np.zeros(X.shape[1])
        y_i = (np.ravel(y) == i).astype(int)  # 1 where the label is class i, otherwise 0
    
        ret = minimize(fun=regularized_cost, x0=theta, args=(X, y_i, l), method='TNC',
                        jac=regularized_gradient, options={'disp': True})
        all_theta[i-1,:] = ret.x
                         
    return all_theta

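A few points to note here. First, we added a column of ones to X to account for the intercept term. Second, we convert y from class labels into binary values for each classifier (either class i or not class i). Finally, we use SciPy's newer optimization API, scipy.optimize.minimize, to minimize each classifier's cost function. The API takes an objective function, an initial set of parameters, an optimization method, and, if specified, a Jacobian (gradient) function. The parameters found by the optimizer are then stored in the corresponding row of the parameter array.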
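
The label conversion for a single classifier can be sketched on a few made-up labels (y_demo is a hypothetical name used only here). Stacking the binary targets for all classes shows that every sample is positive for exactly one classifier.

```python
import numpy as np

y_demo = np.array([1, 3, 10, 3, 2])  # made-up labels in the range 1..10

# Binary targets for the class-3 classifier: 1 where the label is 3, else 0.
y_3 = (y_demo == 3).astype(int)
print(y_3)  # [0 1 0 1 0]

# Binary targets for all 10 classifiers, one row per class.
K = 10
Y_all = np.array([(y_demo == k).astype(int) for k in range(1, K + 1)])
print(Y_all.shape)        # (10, 5)
print(Y_all.sum(axis=0))  # [1 1 1 1 1]
```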

One of the more challenging parts of writing vectorized code is getting all the matrix operations right so that the dimensions match.

def predict_all(X, all_theta):
    # compute the class probability for each class on each training instance   
    h = sigmoid(X @ all_theta.T)  # note that all_theta must be transposed: (m, n+1) @ (n+1, K) -> (m, K)
    # create array of the index with the maximum probability
    # Returns the indices of the maximum values along an axis.
    h_argmax = np.argmax(h, axis=1)
    # because our array was zero-indexed we need to add one for the true label prediction
    h_argmax = h_argmax + 1
    
    return h_argmax

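Here h has 5000 rows and 10 columns: each row is a sample and each column is the predicted probability of the corresponding class. The index of the highest probability, plus 1, is the class our classifier predicts. The returned h_argmax is an array holding the predictions for all 5000 samples.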
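
The argmax step is easy to check by hand on a small made-up probability matrix (h_demo is a hypothetical name; the real h has shape (5000, 10)).

```python
import numpy as np

# Made-up probabilities for 3 samples and 4 classes (labels 1..4).
h_demo = np.array([
    [0.10, 0.70, 0.15, 0.05],
    [0.80, 0.05, 0.05, 0.10],
    [0.20, 0.10, 0.30, 0.40],
])

# Column index of the largest value in each row, shifted to 1-based labels.
pred = np.argmax(h_demo, axis=1) + 1
print(pred)  # [2 1 4]
```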

raw_X, raw_y = load_data('ex3data1.mat')
X = np.insert(raw_X, 0, 1, axis=1) # (5000, 401)
y = raw_y.flatten()  # drop the extra dimension to simplify later computations; .reshape(-1) also works, giving (5000,)

all_theta = one_vs_all(X, y, 1, 10)
all_theta  # 每一行是一個分類器的一組參數

y_pred = predict_all(X, all_theta)
accuracy = np.mean(y_pred == y)
print ('accuracy = {0}%'.format(accuracy * 100))

Tip: in Python, True equals 1 and False equals 0, so taking the mean of the boolean array y_pred == y gives the fraction of correct predictions.
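
A small sketch of why this works, using made-up predictions (y_true and y_hat are hypothetical names):

```python
import numpy as np

print(True == 1, False == 0)  # True True
print(isinstance(True, int))  # True

y_true = np.array([1, 2, 3, 4])
y_hat = np.array([1, 2, 0, 4])

correct = (y_hat == y_true)   # boolean array, True where the prediction matches
print(correct.sum())          # 3
print(np.mean(correct))       # 0.75
```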

2 Neural Networks

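Above we used multi-class logistic regression. However, logistic regression cannot form more complex hypotheses, because it is only a linear classifier.

Next we try a neural network, which can represent very complex non-linear models. We will make predictions using weights that have already been trained.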

def load_weight(path):
    data = loadmat(path)
    return data['Theta1'], data['Theta2']

theta1, theta2 = load_weight('ex3weights.mat')

theta1.shape, theta2.shape

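Note that the provided parameters were trained on the original, untransposed data. If the data-loading function transposed each image, the transposed data would be incompatible with these parameters. So, to apply the given parameters, we use the original data without transposing it.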

X, y = load_data('ex3data1.mat')
y = y.flatten()
X = np.insert(X, 0, values=np.ones(X.shape[0]), axis=1)  # intercept

X.shape, y.shape

a1 = X
z2 = a1 @ theta1.T
z2.shape

a2 = sigmoid(z2)
a2 = np.insert(a2, 0, 1, axis=1)  # add the bias unit after the sigmoid so that it stays exactly 1
a2.shape

z3 = a2 @ theta2.T
z3.shape

a3 = sigmoid(z3)
a3.shape

y_pred = np.argmax(a3, axis=1) + 1 
accuracy = np.mean(y_pred == y)
print ('accuracy = {0}%'.format(accuracy * 100))  # accuracy = 97.52%
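
The step-by-step forward pass above can also be wrapped in one function. The sketch below is an illustration only: feed_forward is a hypothetical helper name, the inputs and weights are random stand-ins rather than the contents of ex3weights.mat, and the hidden layer size of 25 is an assumption made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def feed_forward(X, theta1, theta2):
    """Forward pass through one hidden layer; X has no intercept column."""
    a1 = np.insert(X, 0, 1, axis=1)                       # add the input bias unit
    a2 = np.insert(sigmoid(a1 @ theta1.T), 0, 1, axis=1)  # hidden activations plus bias unit
    a3 = sigmoid(a2 @ theta2.T)                           # output activations, one column per class
    return a3

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(7, 400))                # 7 made-up samples with 400 features
theta1_demo = 0.1 * rng.normal(size=(25, 401))    # assumed hidden layer of 25 units
theta2_demo = 0.1 * rng.normal(size=(10, 26))     # 10 output classes

a3 = feed_forward(X_demo, theta1_demo, theta2_demo)
pred = np.argmax(a3, axis=1) + 1                  # 1-based class labels
print(a3.shape, pred.shape)  # (7, 10) (7,)
```

Note that the bias unit is inserted after the sigmoid, so it stays exactly 1 instead of being squashed to sigmoid(1).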

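Although artificial neural networks are very powerful models, accuracy on the training data does not perfectly predict accuracy on real data, and it is easy to overfit here.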
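
One common way to estimate performance on unseen data is to hold out part of the samples and measure accuracy only on them. The sketch below only builds the index split. The helper train_test_indices and the 20% test fraction are assumptions made for illustration, not part of the original exercise.

```python
import numpy as np

def train_test_indices(m, test_fraction=0.2, seed=0):
    """Return disjoint (train, test) index arrays covering range(m)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(m)               # shuffled indices 0..m-1
    n_test = int(round(m * test_fraction))
    return perm[n_test:], perm[:n_test]

train_idx, test_idx = train_test_indices(5000)
print(len(train_idx), len(test_idx))  # 4000 1000
```

A model would then be fitted on X[train_idx], y[train_idx] and evaluated on X[test_idx], y[test_idx]; a large gap between the two accuracies suggests overfitting.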

Andrew Ng Machine Learning Assignments in Python (3): Multi-class Classification and Feedforward Neural Networks
