CNN Airplane Recognition with the PaddlePaddle High-Level API (with a Detailed Code Walkthrough)

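Preface

After finishing the handwritten digit recognition exercise with PaddlePaddle, I started a new hands-on project: airplane recognition.

Unlike the previous project, this one uses the framework's higher-level API. The code is more compact, but there is a catch: the course does not explain these APIs.

I spent a little over two hours reading the source code and finally worked out how they behave. The notes below share what I found.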

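Network structure of the airplane-recognition CNN

Dataset and project description

The dataset contains 7,897 images: 5,897 for training and 2,000 for testing. Each image is 32×32 pixels with 3 colour channels, so its array shape is (32, 32, 3).

This project is Day 2 of Baidu's official seven-day introductory deep learning video course.
Course link: Day 2 hands-on: airplane recognition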

Code walkthrough

1. Loading the data

import numpy as np
import paddle
import paddle.fluid as fluid

testdata_orgin = np.load('data/plane/testdata.npy')
testlabel_orgin = np.load('data/plane/testlabel.npy')
traindata_orgin = np.load('data/plane/traindata.npy')
trainlabel_orgin = np.load('data/plane/trainlabel.npy')

Shapes of the loaded arrays:
testdata_orgin.shape == (2000, 32, 32, 3)
testlabel_orgin.shape == (2000, 1)
traindata_orgin.shape == (5897, 32, 32, 3)
trainlabel_orgin.shape == (5897, 1)

2. Converting the data layout

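# Move the channel axis forward: (N, 32, 32, 3) -> (N, 3, 32, 32).
# transpose reorders the axes; reshape alone would keep the memory order and mix up the pixels.
testdata = np.array(testdata_orgin).transpose(0, 3, 1, 2).astype(np.float32)
traindata = np.array(traindata_orgin).transpose(0, 3, 1, 2).astype(np.float32)
testlabel = np.array(testlabel_orgin).reshape(2000, 1).astype(np.float32)
trainlabel_orgin = np.array(trainlabel_orgin).reshape(5897, 1).astype(np.float32)

The model expects each image in (channels, height, width) order, so the loaded (height, width, channels) arrays have to be converted. The channel axis must be moved with transpose: reshape(N, 3, 32, 32) would give the right shape but would not place each colour channel in its own plane.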
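To make the difference between reshape and transpose concrete, here is a small NumPy check. It uses a made-up 2×2 "image" with 3 channels rather than the real dataset, so treat it only as an illustration: after transpose every output plane holds exactly one channel, while reshape to the same shape does not.

```python
import numpy as np

# A made-up batch with one 2x2 image and 3 channels, in (N, H, W, C) order.
# Channel c of every pixel is filled with the value c, so planes are easy to check.
hwc = np.zeros((1, 2, 2, 3), dtype=np.float32)
for c in range(3):
    hwc[..., c] = c

chw_transpose = hwc.transpose(0, 3, 1, 2)  # reorder the axes to (N, C, H, W)
chw_reshape = hwc.reshape(1, 3, 2, 2)      # same shape, but memory order is unchanged

print(chw_transpose[0, 1])  # plane of channel 1: every entry equals 1.0
print(chw_reshape[0, 1])    # values from several channels mixed together
```

Both results have shape (1, 3, 2, 2), which is why a shape check alone does not catch the mistake.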

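3. Normalizing the data

testdata = 2*testdata/255.0-1.0
traindata = 2*traindata/255.0-1.0

Normalization maps every pixel value into the interval [-1, 1]. This is a standard preprocessing step for image classification; it usually helps training converge faster and can improve model accuracy.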
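As a quick sanity check of the formula, the sketch below applies it to a few single values. It assumes the raw pixels lie in [0, 255]; normalize_pixel is a name introduced here only for illustration.

```python
def normalize_pixel(value):
    """Map a pixel value from [0, 255] to [-1, 1] using 2 * value / 255 - 1."""
    return 2 * value / 255.0 - 1.0

# The two ends of the input range land on the two ends of the output range,
# and the midpoint lands on zero.
for v in (0, 127.5, 255):
    print(v, "->", normalize_pixel(v))
```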

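4. Building the data generator

def dataset(data, label, buf_size):  # buf_size: number of samples the reader yields
    def reader():
        for i in range(buf_size):
            # label has shape (N, 1), so take the scalar explicitly
            yield data[i, :], int(label[i, 0])
    return reader

Because reader contains a yield statement, calling it returns a generator. dataset is therefore a reader creator: it returns a function that produces samples one at a time instead of building the whole list first, which saves memory.

For more detail, see this article:
I Finally Understand What Python Iterators Are (with an Analysis of Part of the PaddlePaddle Source Code)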
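The reader-creator pattern can be tried without Paddle. In the sketch below, dataset_sketch and the toy lists are stand-ins made up for illustration, not the real image arrays.

```python
def dataset_sketch(data, label, count):
    """Return a reader: a function that yields (sample, label) pairs lazily."""
    def reader():
        for i in range(count):
            yield data[i], int(label[i])
    return reader

# Toy stand-ins for the image and label arrays (plain lists, not the real data).
toy_data = ["img0", "img1", "img2"]
toy_label = [0, 1, 1]

reader = dataset_sketch(toy_data, toy_label, count=2)
gen = reader()        # nothing is produced yet; gen is a generator object
first = next(gen)     # samples are produced one at a time, on demand
rest = list(gen)      # drain whatever is left in this pass
print(first, rest)

# Calling reader() again starts a fresh pass over the data.
second_pass = list(reader())
print(second_pass)
```

Returning a function rather than a generator object is what lets the same reader be iterated once per epoch.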

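5. Building the convolutional network

def CNN():
    img = fluid.layers.data(
        name='img', shape=[3, 32, 32], dtype='float32')
    hidden = fluid.nets.simple_img_conv_pool(
        input=img,
        num_filters=6,   # number of convolution kernels
        filter_size=5,   # kernel size
        pool_size=2,     # pooling window size
        pool_stride=2,   # pooling stride
        pool_padding=0,
        act='relu'
    )
    hidden1 = fluid.nets.simple_img_conv_pool(
        input=hidden,
        num_filters=16,
        filter_size=5,
        pool_size=2,
        pool_stride=2,
        pool_padding=0,
        act='relu'
    )
    # Hidden fully connected layer: use relu here and keep softmax for the output layer only
    flatten = fluid.layers.fc(input=hidden1, size=120, act='relu')
    # Output layer: probabilities of the two classes
    y_prediction = fluid.layers.fc(input=flatten, size=2, act='softmax')
    return y_prediction

The network structure diagram appeared at the start of the article, and this code builds the network according to it. The return value is a tensor of shape (batch_size, 2) that holds the predicted probability of each of the two classes.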
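The feature-map sizes inside this network can be worked out by hand. The sketch below does the arithmetic, assuming the convolutions use stride 1 and no padding (which I believe are the defaults of simple_img_conv_pool) and that pooling rounds down. conv_out and pool_out are helper names introduced only for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel, stride, padding=0):
    """Output width/height of a pooling layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32                                   # input images are 32x32
size = conv_out(size, kernel=5)             # first 5x5 convolution
after_conv1 = size
size = pool_out(size, kernel=2, stride=2)   # first 2x2 pooling
after_pool1 = size
size = conv_out(size, kernel=5)             # second 5x5 convolution
after_conv2 = size
size = pool_out(size, kernel=2, stride=2)   # second 2x2 pooling
after_pool2 = size

# 16 feature maps of after_pool2 x after_pool2 feed the first fc layer.
fc_inputs = 16 * after_pool2 * after_pool2
print(after_conv1, after_pool1, after_conv2, after_pool2, fc_inputs)
```

Under these assumptions the spatial size goes 32, 28, 14, 10, 5, so the first fully connected layer receives 400 inputs per image.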

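6. Building the full training network

def train_func():
    y_label = fluid.layers.data(name='label', shape=[1], dtype='int64')
    # Forward pass: get the predicted probabilities
    prediction = CNN()
    # Cross-entropy loss for each sample
    cost = fluid.layers.cross_entropy(input=prediction, label=y_label)
    # Average the loss over the batch
    avg_cost = fluid.layers.mean(cost)
    return avg_cost

In this function we call CNN() to get the predictions, compute the loss of each sample with cross entropy, and finally use mean() to get the average loss over the batch.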
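To show what these two steps compute, here is a plain-Python sketch of cross entropy with integer labels followed by a mean. It assumes the usual definition, loss = -log(probability assigned to the true class); the probabilities and labels are made up.

```python
import math

def cross_entropy(probs, label):
    """Loss of one sample: -log of the probability given to the true class."""
    return -math.log(probs[label])

def mean(values):
    return sum(values) / len(values)

# Made-up softmax outputs for a batch of three samples and their true labels.
batch_probs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
batch_labels = [0, 1, 0]

losses = [cross_entropy(p, y) for p, y in zip(batch_probs, batch_labels)]
avg_cost = mean(losses)
print(losses, avg_cost)
```

A confident correct prediction (0.9) contributes a small loss, while an undecided one (0.5) contributes a larger loss.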

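7. Building the optimizer

def optimizer_func():
    # Create the optimizer
    optimizer = fluid.optimizer.Momentum(learning_rate=0.001, momentum=0.5)
    return optimizer

This uses the Momentum optimizer. The first argument is the learning rate and the second is the momentum factor. Other optimizers can be used instead, such as Adam or SGD (stochastic gradient descent).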
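For intuition about the two parameters, the sketch below applies the standard (non-Nesterov) momentum update to a one-dimensional toy problem. It is an illustration of the update rule, not Paddle's implementation; momentum_step and the toy function f(x) = x² are assumptions made for the example.

```python
def momentum_step(param, grad, velocity, learning_rate=0.001, momentum=0.5):
    """One momentum update: accumulate the gradient, then step along the velocity."""
    velocity = momentum * velocity + grad
    param = param - learning_rate * velocity
    return param, velocity

# Minimize f(x) = x**2, whose gradient is 2 * x, starting from x = 1.
# A larger learning rate than in the article is used so the change is visible.
x, v = 1.0, 0.0
history = [x]
for _ in range(3):
    x, v = momentum_step(x, 2 * x, v, learning_rate=0.1, momentum=0.5)
    history.append(x)
print(history)
```

The velocity carries part of the previous update into the next one, so successive steps in the same direction grow larger than plain gradient descent would take.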

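8. Visualizing the training loss and saving the parameters

params_dirname = "model/plane-model"
train_title = "Train cost"
test_title = "Test cost"
feed_order = ['img', 'label']  # names of the input variables, in the order the reader yields them
# Register two titles; they act as keys when data points are appended later.
plot_cost = Ploter(train_title, test_title)
step = 0
def event_handler_plot(event):
    global step  # use the module-level step counter
    if isinstance(event, EndStepEvent):  # only react when a step has finished
        if event.step % 2 == 0:  # record the training cost every 2 batches
            if event.metrics[0] < 10:  # only record losses below 10
                # Arguments: series title, x value (step), y value (loss)
                plot_cost.append(train_title, step, event.metrics[0])

        if event.step % 20 == 0:  # record the test cost every 20 batches
            # trainer.test returns the loss on the test reader
            test_metrics = trainer.test(
                reader=test_reader, feed_order=feed_order)
            if test_metrics[0] < 10:
                plot_cost.append(test_title, step, test_metrics[0])
                plot_cost.plot()

        # Save the parameters so they can be used for inference
        if params_dirname is not None:
            trainer.save_params(params_dirname)
    step += 1

In a notebook this handler shows a plot that updates as training runs, which makes the training process much easier to follow.

It also calls trainer.save_params(params_dirname) to save the parameters under that path.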

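9. Creating the data readers

# Data actually used for training and testing
BATCH_SIZE = 16
# Note: dataset(..., buf_size=209) yields only the first 209 training samples;
# pass traindata.shape[0] instead to iterate over all of them.
train_reader = paddle.batch(
    paddle.reader.shuffle(dataset(traindata, trainlabel_orgin, buf_size=209), buf_size=50),
    batch_size=BATCH_SIZE
)
# Likewise, only the first 50 test samples are used here.
test_reader = paddle.batch(
    paddle.reader.shuffle(dataset(testdata, testlabel_orgin, buf_size=50), buf_size=20),
    batch_size=BATCH_SIZE
)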
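The two wrappers used above can be pictured with a plain-Python sketch. This is my understanding of their behaviour (shuffle works on a buffer of buf_size samples at a time, batch groups samples into lists and keeps a final short batch), not Paddle's source; shuffle_sketch, batch_sketch and toy_reader are names made up for the example.

```python
import random

def shuffle_sketch(reader, buf_size):
    """Shuffle samples within consecutive buffers of buf_size items."""
    def shuffled_reader():
        buf = []
        for sample in reader():
            buf.append(sample)
            if len(buf) >= buf_size:
                random.shuffle(buf)
                for item in buf:
                    yield item
                buf = []
        if buf:  # flush the last, possibly shorter, buffer
            random.shuffle(buf)
            for item in buf:
                yield item
    return shuffled_reader

def batch_sketch(reader, batch_size):
    """Group samples into lists of batch_size; the last batch may be shorter."""
    def batch_reader():
        batch = []
        for sample in reader():
            batch.append(sample)
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:
            yield batch
    return batch_reader

def toy_reader():
    for i in range(10):
        yield i

batches = list(batch_sketch(shuffle_sketch(toy_reader, buf_size=4), batch_size=3)())
print(batches)
```

With a buffer smaller than the dataset, samples are only shuffled locally: the first four values produced are always some ordering of 0 to 3.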

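10. Creating the trainer and starting training

# Execution environment
place = fluid.CPUPlace()
# Create the trainer
trainer = Trainer(
    train_func=train_func,  # must be a function that returns the loss
    place=place,  # execution environment
    optimizer_func=optimizer_func  # function that returns an optimizer
)
# Start training
trainer.train(
    reader=train_reader,  # training data
    num_epochs=30,  # number of epochs; each epoch goes through all the data in the reader
    event_handler=event_handler_plot,  # event handler function
    feed_order=feed_order
)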

11. Creating the inferencer

inferencer = Inferencer(
    infer_func=CNN,  # function that returns the prediction
    param_path=params_dirname,  # path of the parameters saved during training
    place=place  # execution environment
)

12. Testing the model

def right_ratio(right_counter, total):
    # Fraction of correct predictions
    return float(right_counter) / total

def evl(data_set):
    total = 0
    right_counter = 0
    for mini_batch in data_set():
        test_x = np.array([data[0] for data in mini_batch]).astype("float32")
        test_y = np.array([data[1] for data in mini_batch]).astype("int64")
        mini_batch_result = inferencer.infer({'img': test_x})
        # Predict 1 if the probability of the last class exceeds 0.5, otherwise 0
        mini_batch_result = (mini_batch_result[0][:, -1] > 0.5) + 0
        total += len(test_y)
        right_counter += int((mini_batch_result == test_y).sum())
    return right_ratio(right_counter, total)

ratio = evl(train_reader)
print("Accuracy on the training data: %0.2f%%" % (ratio * 100))
ratio = evl(test_reader)
print("Accuracy on the test data: %0.2f%%" % (ratio * 100))
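The thresholding step is easy to check in isolation. The sketch below uses made-up probabilities in place of the inferencer output; it shows how the second column of a (batch_size, 2) array is turned into 0/1 predictions and compared with the labels.

```python
import numpy as np

# Made-up softmax outputs for four samples (columns: class 0, class 1) and their labels.
probs = np.array([[0.9, 0.1],
                  [0.3, 0.7],
                  [0.6, 0.4],
                  [0.2, 0.8]], dtype=np.float32)
labels = np.array([0, 1, 1, 1], dtype=np.int64)

# Comparing gives booleans; adding 0 converts them to integers 0 and 1.
predicted = (probs[:, -1] > 0.5) + 0
accuracy = float((predicted == labels).mean())
print(predicted, accuracy)

# For these two-class probabilities, taking the argmax of each row gives the same labels.
same_as_argmax = bool((predicted == probs.argmax(axis=1)).all())
print(same_as_argmax)
```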

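The high-level API explained

Importing the high-level API

from paddle.utils.plot import Ploter
from paddle.fluid.contrib.trainer import EndStepEvent
from paddle.fluid.contrib.trainer import Trainer
from paddle.fluid.contrib.inferencer import Inferencer

1. Ploter(): plotting
Ploter(train_title, test_title): the arguments are the titles of the data series. They are used as keys to tell the series apart when data is appended later, and they also appear as the labels in the plot legend.

.append(train_title, step, event.metrics[0]): the first argument is the title of the series to add to, the second is the x value, and the third is the y value. The example uses the step count step as x and the loss event.metrics[0] as y.
.plot(): displays the figure. Note that nothing is shown in a local PyCharm environment; the figure only appears in notebook-style environments. The method takes a path argument; if a path is given, the figure is saved to that path.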

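2. Trainer(): the trainer

Trainer(
train_func=train_func,  # must be a function that returns the loss
place=place,  # execution environment
optimizer_func=optimizer_func  # function that returns an optimizer
)

Argument 1 is the function that builds our network. It must return the loss.

Argument 2 is the execution environment place, which decides whether training runs on the GPU or the CPU.

Argument 3 is the function that returns the optimizer.

.train(
reader=train_reader,  # training data
num_epochs=30,  # number of epochs; each epoch goes through all the data in the reader
event_handler=event_handler_plot,  # event handler function
feed_order=feed_order
)

This is the training method of Trainer.

Argument 1 is the training data, given as a reader; we built it earlier.

Argument 2 is the number of epochs, that is, the number of iterations of the outer loop.

Argument 3 is the event handler. Here it is event_handler_plot, the function that plots the training process. The handler must accept one parameter, which I call event. By checking the type of event we can tell how far training has progressed. For example, there are BeginStepEvent (a step starts) and EndStepEvent (a step ends), and each event type carries different attributes.

Argument 4 is a list of the names of the variables to feed. Here feed_order=['img', 'label']: the first is the name of the image input variable and the second is the name of the label variable.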
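The event-handler mechanism can be imitated in a few lines of plain Python. The classes and the train_sketch loop below are hypothetical stand-ins written for this illustration (they are not Paddle's Trainer); they only mirror the idea that the training loop calls the handler with different event objects and the handler decides what to do by checking the type.

```python
class BeginStepEventSketch:
    """Hypothetical event sent before a step; carries only its position."""
    def __init__(self, epoch, step):
        self.epoch = epoch
        self.step = step

class EndStepEventSketch:
    """Hypothetical event sent after a step; also carries the metrics."""
    def __init__(self, epoch, step, metrics):
        self.epoch = epoch
        self.step = step
        self.metrics = metrics

def train_sketch(num_epochs, steps_per_epoch, event_handler):
    """Stand-in training loop that reports a fake, decreasing loss."""
    loss = 1.0
    for epoch in range(num_epochs):
        for step in range(steps_per_epoch):
            event_handler(BeginStepEventSketch(epoch, step))
            loss *= 0.9  # placeholder for a real optimization step
            event_handler(EndStepEventSketch(epoch, step, [loss]))

recorded = []

def handler(event):
    # React only to finished steps, and only to every second one.
    if isinstance(event, EndStepEventSketch) and event.step % 2 == 0:
        recorded.append((event.epoch, event.step, event.metrics[0]))

train_sketch(num_epochs=2, steps_per_epoch=3, event_handler=handler)
print(recorded)
```

Because event.step restarts in every epoch, a handler that wants a continuous x axis needs its own counter, which is what the global step variable in event_handler_plot provides.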

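3. Inferencer(): the inferencer

Put simply, it takes the trained model and the test data and runs the model on that data. Its return value is the prediction, that is, the return value of CNN().

Inferencer(
infer_func=CNN,  # function that returns the prediction
param_path=params_dirname,  # path of the parameters saved during training
place=place  # execution environment
)

Argument 1: a function that returns the prediction.
Argument 2: the path where the parameters were saved (reusing what training already produced 😆).
Argument 3: the execution environment.

inferencer.infer({'img': test_x})
The argument must be a dictionary: each key is the name of an input variable and each value is the data to feed in.
The return value holds the predictions, with shape (batch_size, 2).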

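Summary

My earlier projects all used fairly low-level parts of the PaddlePaddle framework. Through self-study this time I learned how to use many of the high-level APIs, which lays a foundation for more complex projects later. I (金鉤) hope to keep sharing what I learn, and I would appreciate some encouragement from you (hint, hint 👍).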
