A BP Neural Network in Python (Vectorized Operations, Tested on Iris Classification)

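Introduction

There are many kinds of artificial neural network models. Classified by the direction in which data flows through the network, they fall into feedforward networks, feedback networks, and self-organizing networks.

After working through Andrew Ng's deep learning course, I summarize here the key points of its chapter on shallow neural networks. The focus is not on deriving formulas but on a simple implementation of the algorithm; for the theory, see the Deep Learning Engineer course (深度學習工程師). This article introduces the BP neural network, a typical feedforward network, and implements it in Python.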

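Structure

A BP neural network has three kinds of layers: an input layer, a hidden layer, and an output layer. The number of hidden layers can be extended, and the number of neurons in each layer can be increased or decreased. Each neuron is connected to the neurons in the layers before and after it, but neurons within the same layer are not connected to each other. See the schematic diagram below.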

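Principle

When we use a BP neural network to classify or predict data, each pair of connected neurons has a weight, denoted w, and there is also a bias, denoted b. Each neuron also has an activation function, denoted σ(x). Note that σ is not one fixed function: several functions can serve as the activation, such as sigmoid, tanh, and ReLU.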
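The three activation functions named above can be sketched in NumPy as follows. This is only an illustration: the helper names `sigmoid` and `relu` are chosen here, while `np.tanh` is NumPy's built-in hyperbolic tangent.

```python
import numpy as np


def sigmoid(z):
    """Logistic function: squashes any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def relu(z):
    """Rectified linear unit: keeps positive values, zeroes out the rest."""
    return np.maximum(0.0, z)


z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))   # values in (0, 1), with sigmoid(0) = 0.5
print(np.tanh(z))   # values in (-1, 1), with tanh(0) = 0
print(relu(z))      # negatives become 0, positives pass through
```

All three are applied element-wise, which is what lets the rest of the implementation work on whole matrices at once.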

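In each iteration, the forward pass (starting from the input layer) computes an output value and then the error between that value and the target value; the backward pass (starting from the output layer) adjusts each connection weight in the direction that reduces the error. The iterations repeat until the error falls to a given small threshold, at which point training is complete. (The implementation in this article simply runs a fixed number of iterations.)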
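To make the iterate-until-the-error-is-small idea concrete, here is a minimal sketch with a single sigmoid neuron rather than a full network. The toy data (logical OR), the learning rate, the tolerance, and the iteration cap are all made-up values for illustration.

```python
import numpy as np

# Toy data: the logical OR of two inputs (4 examples, one per column).
X = np.array([[0.0, 0.0, 1.0, 1.0],
              [0.0, 1.0, 0.0, 1.0]])
Y = np.array([[0.0, 1.0, 1.0, 1.0]])

w = np.zeros((1, 2))    # weights
b = 0.0                 # bias
learning_rate = 1.0     # illustrative value
tolerance = 0.1         # stop once the cost drops below this (illustrative)
max_iterations = 10000  # safety cap so the loop always ends

m = X.shape[1]
costs = []
for i in range(max_iterations):
    # Forward pass: compute the output and its error against the targets.
    A = 1.0 / (1.0 + np.exp(-(np.dot(w, X) + b)))
    cost = float(-np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m)
    costs.append(cost)
    if cost < tolerance:
        break
    # Backward pass: move w and b in the direction that reduces the error.
    dZ = A - Y
    w = w - learning_rate * np.dot(dZ, X.T) / m
    b = b - learning_rate * float(np.sum(dZ)) / m

print("stopped after %d iterations, cost %.4f" % (len(costs), costs[-1]))
```

The first cost is ln 2 (about 0.693), because with zero weights the neuron outputs 0.5 for every example.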

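Implementation Approach in Python

Implementing a BP neural network in Python involves the following main steps:

Determine the network structure
Initialize the weight and bias parameters
Compute the forward propagation
Compute the cost function
Compute the backward propagation
Update the weight and bias parameters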

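Determining the Network Structure

This function reads the number of rows of the input matrix X and of the label matrix Y, which give the sizes of the input layer and the output layer.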

def layer_size(X, Y):
    """
    :param X: input dataset of shape (input size, number of examples)
    :param Y: labels of shape (output size, number of examples)
    :return:
    n_x: the size of the input layer
    n_y: the size of the output layer
    """
    n_x = X.shape[0]
    n_y = Y.shape[0]

    return (n_x, n_y)
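A quick way to see what this returns is to call it on arrays with the same layout as the Iris data used later: 4 attributes and 1 label per example, 150 examples stored as columns. The zero-filled arrays below are placeholders, and the function is repeated in compact form so the snippet runs on its own.

```python
import numpy as np


def layer_size(X, Y):
    """Return (n_x, n_y): the number of input features and of output units."""
    return X.shape[0], Y.shape[0]


# Placeholder data: 4 features and 1 label per example, 150 examples in columns.
X = np.zeros((4, 150))
Y = np.zeros((1, 150))

n_x, n_y = layer_size(X, Y)
print(n_x, n_y)  # 4 1
```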

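Initializing the Weights and Biases

This function initializes the connection weights w and the biases b. Take care that the parameter matrices have the correct shapes.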

import numpy as np


def initialize_parameters(n_x, n_h, n_y):
    """
    Initialize the parameters.
    :param n_x: size of the input layer
    :param n_h: size of the hidden layer
    :param n_y: size of the output layer
    :return:
    W1: weight matrix of shape (n_h, n_x)
    b1: bias vector of shape (n_h, 1)
    W2: weight matrix of shape (n_y, n_h)
    b2: bias vector of shape (n_y, 1)
    """
    # np.random.seed(2)  # uncomment to fix the random seed for reproducible initialization

    # small random weights, zero biases
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))

    parameters = {
        'W1': W1,
        'b1': b1,
        'W2': W2,
        'b2': b2,
    }

    return parameters
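The shape requirement can be checked by pushing a few made-up examples through the two matrix products. The sketch below uses the sizes from the Iris test later in the article (4 inputs, 10 hidden neurons, 1 output); the seed and the 5 random examples are assumptions for illustration only.

```python
import numpy as np

n_x, n_h, n_y = 4, 10, 1   # sizes used in the Iris test later in the article

np.random.seed(0)          # fixed seed only to make this sketch reproducible
W1 = np.random.randn(n_h, n_x) * 0.01
b1 = np.zeros((n_h, 1))
W2 = np.random.randn(n_y, n_h) * 0.01
b2 = np.zeros((n_y, 1))

X = np.random.rand(n_x, 5)           # 5 made-up examples, one per column
Z1 = np.dot(W1, X) + b1              # (n_h, n_x) . (n_x, 5) + (n_h, 1) -> (n_h, 5)
Z2 = np.dot(W2, np.tanh(Z1)) + b2    # (n_y, n_h) . (n_h, 5) + (n_y, 1) -> (n_y, 5)
print(Z1.shape, Z2.shape)            # (10, 5) (1, 5)

# With all-zero weights every hidden unit would compute the same value;
# small random weights make the rows of W1 differ from one another.
print(np.unique(W1, axis=0).shape[0] == n_h)
```

The bias vectors have a single column and are broadcast across all examples, which is why they are shaped (n_h, 1) and (n_y, 1) rather than matching the number of examples.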

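Forward Propagation

This function performs the forward propagation. Note that the hidden layer uses tanh as its activation function and the output layer uses sigmoid.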

def sigmoid(z):
    # sigmoid is called below but does not appear in the listing;
    # this is the standard logistic function
    return 1.0 / (1.0 + np.exp(-z))


def forward_propagation(X, parameters):
    """
    Forward propagation.
    :param X: input data of size (n_x, m)
    :param parameters: python dictionary containing your parameters (output of initialization function)
    :return:
    A2: The sigmoid output of the second activation
    cache: a dictionary containing "Z1", "A1", "Z2" and "A2"
    """
    W1 = parameters['W1']
    b1 = parameters['b1']
    W2 = parameters['W2']
    b2 = parameters['b2']

    Z1 = np.dot(W1, X) + b1
    A1 = np.tanh(Z1)            # layer 1 uses tanh as its activation
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)            # layer 2 uses sigmoid as its activation

    # fail fast if A2 does not have shape (1, m); this assumes a single output unit
    assert (A2.shape == (1, X.shape[1]))

    cache = {
        'Z1': Z1,
        'A1': A1,
        'Z2': Z2,
        'A2': A2,
    }

    return A2, cache
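Written out, the forward pass above corresponds to the following equations, where X holds one example per column. The bracketed layer superscripts are a notation borrowed from the course and are an assumption here; the code itself only uses the names W1, b1, Z1, A1 and so on.

```latex
\begin{aligned}
Z^{[1]} &= W^{[1]} X + b^{[1]}, & A^{[1]} &= \tanh\left(Z^{[1]}\right),\\
Z^{[2]} &= W^{[2]} A^{[1]} + b^{[2]}, & A^{[2]} &= \sigma\left(Z^{[2]}\right) = \frac{1}{1 + e^{-Z^{[2]}}}.
\end{aligned}
```

Both activations act element-wise, so each column of A^{[2]} is the network's output for the corresponding column of X.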

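Computing the Cost Function

This function computes the cost function. Note the distinction between the two terms: the loss function measures the error between the expected output and the actual output for a single sample, while the cost function averages that loss over all samples, which is the form used in the vectorized computation here. The code below uses the cross-entropy cost. See the Deep Learning Engineer course (深度學習工程師) for the detailed definitions.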

def compute_cost(A2, Y, parameters):
    """
    Compute the cross-entropy cost.
    :param A2: The sigmoid output of the second activation, of shape (1, number of examples)
    :param Y: "true" labels vector of shape (1, number of examples)
    :param parameters: python dictionary containing your parameters W1, b1, W2 and b2 (not used in the computation)
    :return:
    cost: cross-entropy cost
    """
    m = Y.shape[1]  # number of examples

    cost = - np.sum(np.multiply(np.log(A2), Y) + np.multiply(np.log(1. - A2), 1. - Y)) / m
    # equivalent form: cost = np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / (-m)

    # squeeze() removes the axes of length 1 from a shape, e.g. (5, 1) becomes (5,);
    # float() then guarantees a plain Python float
    cost = float(np.squeeze(cost))

    assert (isinstance(cost, float))  # fail fast if cost is not a float

    return cost
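A small worked example may help separate the per-sample loss from the averaged cost. The two outputs and labels below are made up; the arithmetic is the same cross-entropy expression used in `compute_cost`.

```python
import numpy as np

A2 = np.array([[0.9, 0.1]])   # made-up network outputs for two examples
Y = np.array([[1.0, 0.0]])    # their true labels
m = Y.shape[1]

# Per-example loss: -(y * log(a) + (1 - y) * log(1 - a))
losses = -(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))
# Cost: the mean of the per-example losses
cost = float(np.sum(losses) / m)

print(losses)  # both entries equal -log(0.9), about 0.105
print(cost)    # about 0.105
```

Both predictions are off by 0.1 from their label, so both losses are equal and the cost is just that common value.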

Backward Propagation

This function performs the backward propagation.

def backward_propagation(parameters, cache, X, Y):
    """
    Backward propagation.
    :param parameters: python dictionary containing our parameters
    :param cache: a dictionary containing "Z1", "A1", "Z2" and "A2"
    :param X: input data of shape (n_x, number of examples)
    :param Y: "true" labels vector of shape (1, number of examples)
    :return:
    grads: python dictionary containing your gradients with respect to different parameters
    """
    m = X.shape[1]

    W2 = parameters['W2']

    A1 = cache['A1']
    A2 = cache['A2']

    dZ2 = A2 - Y                                # sigmoid output combined with cross-entropy
    dW2 = np.dot(dZ2, A1.T) / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = np.dot(W2.T, dZ2) * (1 - A1 ** 2)     # 1 - A1**2 is the derivative of tanh
    dW1 = np.dot(dZ1, X.T) / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m

    grads = {
        'dW1': dW1,
        'db1': db1,
        'dW2': dW2,
        'db2': db2,
    }

    return grads
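One way to gain confidence in these gradient formulas is a numerical gradient check: nudge each parameter by a small amount, measure how the cost changes, and compare with the analytic gradient. The sketch below is not part of the original program; it restates the forward pass, cost, and gradients in compact form (the helper names `forward`, `cost_fn`, and `backward` are made up here) and runs on random data.

```python
import numpy as np


def forward(X, p):
    A1 = np.tanh(np.dot(p["W1"], X) + p["b1"])
    A2 = 1.0 / (1.0 + np.exp(-(np.dot(p["W2"], A1) + p["b2"])))
    return A1, A2


def cost_fn(X, Y, p):
    _, A2 = forward(X, p)
    return float(-np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / Y.shape[1])


def backward(X, Y, p):
    # Same formulas as backward_propagation above.
    m = X.shape[1]
    A1, A2 = forward(X, p)
    dZ2 = A2 - Y
    dZ1 = np.dot(p["W2"].T, dZ2) * (1 - A1 ** 2)
    return {
        "W1": np.dot(dZ1, X.T) / m,
        "b1": np.sum(dZ1, axis=1, keepdims=True) / m,
        "W2": np.dot(dZ2, A1.T) / m,
        "b2": np.sum(dZ2, axis=1, keepdims=True) / m,
    }


np.random.seed(1)
X = np.random.randn(3, 6)                        # 3 features, 6 made-up examples
Y = (np.random.rand(1, 6) > 0.5).astype(float)   # random 0/1 labels
p = {"W1": np.random.randn(4, 3) * 0.5, "b1": np.zeros((4, 1)),
     "W2": np.random.randn(1, 4) * 0.5, "b2": np.zeros((1, 1))}

grads = backward(X, Y, p)
eps = 1e-6
max_diff = 0.0
for name in p:
    for idx in np.ndindex(p[name].shape):
        old = p[name][idx]
        p[name][idx] = old + eps
        cost_plus = cost_fn(X, Y, p)
        p[name][idx] = old - eps
        cost_minus = cost_fn(X, Y, p)
        p[name][idx] = old                       # restore the parameter
        numeric = (cost_plus - cost_minus) / (2 * eps)
        max_diff = max(max_diff, abs(numeric - grads[name][idx]))

print("largest difference:", max_diff)
```

If the analytic gradients are right, the largest difference should be many orders of magnitude smaller than the gradients themselves.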

Updating the Weights and Biases

This function updates the weights and biases.

def update_parameters(parameters, grads, learning_rate):
    """
    Update the weights and biases.
    :param parameters: python dictionary containing your parameters
    :param grads: python dictionary containing your gradients
    :param learning_rate: step size of the gradient descent update
    :return:
    parameters: python dictionary containing your updated parameters
    """
    W1 = parameters['W1']
    b1 = parameters['b1']
    W2 = parameters['W2']
    b2 = parameters['b2']

    dW1 = grads['dW1']
    db1 = grads['db1']
    dW2 = grads['dW2']
    db2 = grads['db2']

    # one gradient descent step for every parameter
    W1 = W1 - learning_rate * dW1
    b1 = b1 - learning_rate * db1
    W2 = W2 - learning_rate * dW2
    b2 = b2 - learning_rate * db2

    parameters = {
        "W1": W1,
        "b1": b1,
        "W2": W2,
        "b2": b2,
    }

    return parameters
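In formula form, the update above is one plain gradient descent step per layer. Here α stands for `learning_rate`, and the bracketed layer index is a notation assumed for this note rather than something that appears in the code.

```latex
W^{[l]} \leftarrow W^{[l]} - \alpha \, dW^{[l]}, \qquad
b^{[l]} \leftarrow b^{[l]} - \alpha \, db^{[l]}, \qquad l = 1, 2.
```

A larger α takes bigger steps and can overshoot, while a smaller α is steadier but needs more iterations.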

The BP Neural Network

Now we combine the functions above to obtain a two-layer BP neural network model.

def nn_model(X, Y, n_h, num_iterations, learning_rate, print_cost=False):
    """
    Feedforward neural network model.
    :param X: input dataset of shape (input size, number of examples)
    :param Y: labels of shape (output size, number of examples)
    :param n_h: size of the hidden layer
    :param num_iterations: Number of iterations in gradient descent loop
    :param learning_rate: step size of the gradient descent update
    :param print_cost: if True, print the cost every 1000 iterations
    :return:
    parameters: parameters learnt by the model. They can then be used to predict
    cost_list: the cost recorded at every iteration
    """

    # np.random.seed(4)
    n_x, n_y = layer_size(X, Y)

    parameters = initialize_parameters(n_x, n_h, n_y)

    cost_list = []
    for i in range(0, num_iterations):

        A2, cache = forward_propagation(X, parameters)

        cost = compute_cost(A2, Y, parameters)

        cost_list.append(cost)

        grads = backward_propagation(parameters, cache, X, Y)

        parameters = update_parameters(parameters, grads, learning_rate)

        if print_cost and i % 1000 == 0:
            print("Cost after iteration %i: %f" % (i, cost))

    return parameters, cost_list
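Before moving to real data, the whole training loop can be tried on a tiny synthetic problem. The sketch below inlines the same forward pass, cost, gradients, and update in one loop so that it runs on its own. The data (two random features, label 1 when their sum is positive), the 4 hidden neurons, the learning rate, and the iteration count are all assumptions made for this illustration.

```python
import numpy as np

np.random.seed(0)                               # fixed seed so the run is repeatable
m = 200
X = np.random.randn(2, m)                       # 2 features, 200 made-up examples
Y = (X[0:1, :] + X[1:2, :] > 0).astype(float)   # label is 1 when x0 + x1 > 0

n_h, learning_rate, num_iterations = 4, 0.25, 1000
W1 = np.random.randn(n_h, 2) * 0.01
b1 = np.zeros((n_h, 1))
W2 = np.random.randn(1, n_h) * 0.01
b2 = np.zeros((1, 1))

cost_list = []
for i in range(num_iterations):
    # forward propagation
    A1 = np.tanh(np.dot(W1, X) + b1)
    A2 = 1.0 / (1.0 + np.exp(-(np.dot(W2, A1) + b2)))
    # cross-entropy cost
    cost = float(-np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m)
    cost_list.append(cost)
    # backward propagation
    dZ2 = A2 - Y
    dW2 = np.dot(dZ2, A1.T) / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = np.dot(W2.T, dZ2) * (1 - A1 ** 2)
    dW1 = np.dot(dZ1, X.T) / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    # parameter update
    W1 = W1 - learning_rate * dW1
    b1 = b1 - learning_rate * db1
    W2 = W2 - learning_rate * dW2
    b2 = b2 - learning_rate * db2

# A2 holds the outputs of the last forward pass
accuracy = float(np.mean((A2 > 0.5) == (Y > 0.5)))
print("first cost %.4f, last cost %.4f, training accuracy %.2f"
      % (cost_list[0], cost_list[-1], accuracy))
```

The cost starts near ln 2 because the tiny initial weights make every output close to 0.5, and it falls as the weights grow.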

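Iris Classification Test

Now that the BP neural network model is complete, we can try it out. Here we use iris classification to check that the model works.

Introduction

The Iris dataset is a commonly used dataset for classification experiments, compiled by Fisher in 1936. Also known as the iris flower dataset, it is a dataset for multivariate analysis. It contains 150 samples in 3 classes, with 50 samples per class and 4 attributes per sample. The four attributes (sepal length, sepal width, petal length, and petal width) can be used to predict which of the three species (Setosa, Versicolour, Virginica) an iris belongs to.

Attributes:

Sepal.Length (sepal length), in cm;
Sepal.Width (sepal width), in cm;
Petal.Length (petal length), in cm;
Petal.Width (petal width), in cm;

Species:

Iris Setosa (represented by the number '0' in this example)
Iris Versicolour (represented by the number '1' in this example)
Iris Virginica (represented by the number '2' in this example)

Iris data download

Test Program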

#!/usr/bin/env python  
# _*_ coding:utf-8 _*_  
#  
# @Version : 1.0  
# @Time    : 2018/6/6  
# @Author  : 圈圈烴
# @File    : User_BPNN
import numpy as np
import matplotlib.pyplot as plt
from Forward_NeuralNetwork import *


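def data_process():
    """Preprocess iris.txt: replace the species names with numeric labels.

    Note: this writes iris1.txt, while load_csv() reads iris.csv; the step
    that turns one into the other is not shown in this program.
    """
    with open("iris.txt", 'r') as f:
        data = f.read()
        data = data.replace('Iris-setosa', '0,')
        data = data.replace('Iris-versicolor', '1,')
        data = data.replace('Iris-virginica', '2,')
    # the with block closes the file on exit, so no explicit close() is needed
    with open("iris1.txt", 'w') as fw:
        fw.write(data)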


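def load_csv():
    """Load the preprocessed data stored in CSV format."""
    # np.str and np.float were removed in NumPy 1.24; the built-in str and float work everywhere
    tmp = np.loadtxt("iris.csv", dtype=str, delimiter=",")
    data = tmp[0:, 0:4].astype(float)
    label = tmp[0:, 4].astype(float)
    label = label.reshape(150, 1)
    return data.T, label.T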


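def normalized(X):
    """
    Min-max normalization using the minimum and maximum of the whole array.
    :param X: the data to normalize
    :return:
    XN: the normalized data
    """
    Xmin, Xmax = X.min(), X.max()
    XN = (X - Xmin) / (Xmax - Xmin)
    return XN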
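Note that `normalized` takes one minimum and one maximum over the whole array, so all attributes share the same scale factor. A common alternative, shown here only as a sketch and not as what the program above does, is to scale each attribute (row) separately. The helper name `normalize_per_feature` and the sample values are made up, and the sketch assumes no attribute is constant (which would divide by zero).

```python
import numpy as np


def normalize_per_feature(X):
    """Min-max scale each row (feature) of X to [0, 1] independently."""
    X_min = X.min(axis=1, keepdims=True)
    X_max = X.max(axis=1, keepdims=True)
    return (X - X_min) / (X_max - X_min)


# Made-up data: 2 features (rows) on very different scales, 3 examples (columns).
X = np.array([[1.0, 2.0, 3.0],
              [100.0, 200.0, 300.0]])

global_scaled = (X - X.min()) / (X.max() - X.min())   # what normalized(X) computes
feature_scaled = normalize_per_feature(X)

print(global_scaled[0])    # the small-scale feature ends up squeezed close to 0
print(feature_scaled[0])   # 0, 0.5, 1
print(feature_scaled[1])   # 0, 0.5, 1
```

For the Iris attributes, which are all measured in centimetres and lie within a few units of each other, the difference between the two approaches is much less dramatic than in this toy example.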


def main():

    X, Y = load_csv()
    X = normalized(X)
    Y = normalized(Y)  # the labels 0, 1, 2 become 0, 0.5, 1
    # training set: 90 samples (columns 0-29, 50-79 and 100-129)
    train_x = np.hstack((X[:, 0:30], X[:, 50:80], X[:, 100:130]))
    train_y = np.hstack((Y[:, 0:30], Y[:, 50:80], Y[:, 100:130]))
    # test set: 60 samples (columns 30-49, 80-99 and 130-149)
    test_x = np.hstack((X[:, 30:50], X[:, 80:100], X[:, 130:150]))
    test_y = np.hstack((Y[:, 30:50], Y[:, 80:100], Y[:, 130:150]))
    # train with 10 hidden neurons, 10000 iterations, learning rate 0.25
    n_h = 10
    parameter, cost_list = nn_model(train_x, train_y, n_h, num_iterations=10000, learning_rate=0.25, print_cost=True)
    # test: run the test set through the trained network
    A2, cache = forward_propagation(test_x, parameters=parameter)
    # bucket the outputs: above 0.8 -> 1, below 0.2 -> 0, everything else -> 0.5
    TY = A2
    TY[TY > 0.8] = 1
    TY[TY < 0.2] = 0
    TY[(TY >= 0.2) & (TY <= 0.8)] = 0.5
    # print(A2, TY)
    count = 0
    for i in range(0, 60):
        if TY[0, i] == test_y[0, i]:
            count += 1
    print("Accuracy: %f %%" % (100 * count / 60))
    # plot the gradient descent cost curve
    plt.plot(cost_list)
    plt.show()


if __name__ == '__main__':
    main()

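Test Results

In the test, the 150 samples were split into 90 training samples and 60 test samples. The hidden layer has 10 neurons, the number of iterations is 10000, and the learning rate is 0.25. The data must be normalized for both training and testing, and this includes the label data Y: I originally labeled the three iris classes 0, 1, and 2, which become 0, 0.5, and 1 after normalization. The outputs on the test set are then bucketed: values below 0.2 become 0, values above 0.8 become 1, and everything else becomes 0.5. The final classification accuracy is 98.3%.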

Cost after iteration 0: 0.693152
Cost after iteration 1000: 0.280715
Cost after iteration 2000: 0.275627
Cost after iteration 3000: 0.274676
Cost after iteration 4000: 0.274162
Cost after iteration 5000: 0.273742
Cost after iteration 6000: 0.273368
Cost after iteration 7000: 0.273018
Cost after iteration 8000: 0.272678
Cost after iteration 9000: 0.272336
Accuracy: 98.333333 %
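The bucketing rule and the accuracy count used in the test can also be written without an explicit loop. The sketch below is an alternative formulation on six made-up outputs, not the code that produced the result above; `np.select` picks the first matching condition for each element.

```python
import numpy as np

# Made-up network outputs and normalized labels for 6 test examples.
A2 = np.array([[0.05, 0.15, 0.45, 0.70, 0.85, 0.95]])
test_y = np.array([[0.0, 0.0, 0.5, 0.5, 1.0, 0.5]])

# Bucket each output: above 0.8 -> 1, below 0.2 -> 0, anything else -> 0.5.
predicted = np.select([A2 > 0.8, A2 < 0.2], [1.0, 0.0], default=0.5)

# Fraction of examples whose bucket matches the label, as a percentage.
accuracy = 100.0 * np.mean(predicted == test_y)
print(predicted)
print("Accuracy: %f %%" % accuracy)
```

Here the last example is bucketed as 1 while its label is 0.5, so 5 of the 6 examples count as correct.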

Final Notes

The complete program is available for download.

There is still room for improvement, and feedback is welcome.
