PaddlePaddle | CV Epidemic Special (2): Gesture Recognition


This post is based on the Baidu AIStudio course; it is written up here as a record.

This section builds a classification network for hand gestures. Before classifying, let's look at the data:
[Sample images: ten hand-gesture photos, one per class]
Read left to right, top to bottom, they represent the digits 0-9. Looking closely, the dataset is actually somewhat challenging: lighting and viewing angle vary from image to image. Amusingly, they all seem to be right hands.
The program was debugged locally, but the walkthrough follows the online steps as closely as possible.

1. Import the packages

# ResNet model code
import os
import time
import random
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import paddle
import paddle.fluid as fluid
import paddle.fluid.layers as layers
from paddle.fluid.layer_helper import LayerHelper
from paddle.fluid.dygraph.nn import Conv2D, Pool2D, BatchNorm, Linear
from paddle.fluid.dygraph.base import to_variable
from multiprocessing import cpu_count

2. Import the ResNet network

ResNet itself needs no lengthy introduction; the network code below is the official implementation.
However, the official code only covers ResNet-50 and larger; there is no ResNet-18. ResNet-18 has fewer layers and is handier for quick experiments, and the modification is simple. Referring to the ResNet architecture table:
[ResNet architecture table]
all that needs to be added is one branch: depth = [2, 2, 2, 2].
Since no pretrained model is used, the weights are initialized with Xavier initialization: param_attr = fluid.initializer.Xavier(uniform=False).
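The depth table above can be checked with a small stand-alone sketch (resnet_depth is a hypothetical helper mirroring the branch added to the ResNet class, not part of the original code). With three convolutions per bottleneck block plus the stem conv and the final FC layer, the standard depths add up; note that the [2, 2, 2, 2] variant built from bottleneck blocks actually has 26 weight layers, since a true ResNet-18 uses two-conv basic blocks:

```python
# Hypothetical helper mirroring the depth-selection branch in the ResNet class
def resnet_depth(layers):
    depths = {
        18:  [2, 2, 2, 2],   # ResNet-18-style variant (built from bottleneck blocks here)
        50:  [3, 4, 6, 3],
        101: [3, 4, 23, 3],
        152: [3, 8, 36, 3],
    }
    assert layers in depths, "supported layers are %s" % sorted(depths)
    return depths[layers]

def weight_layers(depth):
    # 3 convs per bottleneck block + the 7x7 stem conv + the final FC layer
    return 3 * sum(depth) + 2
```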

3. Generate the training/test lists

# Generate the image lists
data_path = '/home/aistudio/data/data23668/Dataset'
character_folders = os.listdir(data_path)
# print(character_folders)
if os.path.exists('./train_data.list'):
    os.remove('./train_data.list')
if os.path.exists('./test_data.list'):
    os.remove('./test_data.list')

for character_folder in character_folders:

    with open('./train_data.list', 'a') as f_train:
        with open('./test_data.list', 'a') as f_test:
            if character_folder == '.DS_Store':
                continue
            character_imgs = os.listdir(os.path.join(data_path, character_folder))
            count = 0
            for img in character_imgs:
                if img == '.DS_Store':
                    continue
                if count % 10 == 0:
                    f_test.write(os.path.join(data_path, character_folder, img) + '\t' + character_folder + '\n')
                else:
                    f_train.write(os.path.join(data_path, character_folder, img) + '\t' + character_folder + '\n')
                count += 1
print('Lists generated')
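The count % 10 rule above sends every tenth image to the test list, giving roughly a 90/10 train/test split. The same logic in isolation:

```python
def split_by_tenth(items):
    # every 10th item (count % 10 == 0) goes to the test set, the rest to training
    train, test = [], []
    for count, item in enumerate(items):
        (test if count % 10 == 0 else train).append(item)
    return train, test
```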

This block generates the train/test file paths; inspecting train_data.list confirms it:
[train_data.list excerpt]
However, the samples are not randomly ordered, which is bad for training, so shuffle the list: shuffle_list('./train_data.list', './shuffle_train_data.list')

def shuffle_list(readFile, writeFile):
    from random import shuffle
    with open(readFile, 'r') as f:
        lines = f.readlines()
    shuffle(lines)  # shuffle the list in place
    with open(writeFile, 'w') as f:
        for data in lines:
            f.write(data)


4. Define the training and test readers

Unlike the official example, I added data augmentation: 1. random rotation; 2. random horizontal flip. Augmentation effectively enlarges the training set and reduces overfitting. Each image is also divided by 255 to normalize pixel values into [0, 1]. (Although it seems I forgot to actually enable the augmentation...)

# Define the training and test readers
def data_mapper_train(sample, enhance=True):
    img, label = sample
    img = Image.open(img)
    img = img.resize((100, 100), Image.ANTIALIAS)
    if enhance == True:
        # Data augmentation
        # random rotation angle (counter-clockwise or clockwise)
        angle = random.randint(0, 15)
        f = random.randint(0, 1)
        if f > 0:
            img = img.rotate(angle)
        else:
            img = img.rotate(-angle)
        # random horizontal flip
        flag = random.randint(0, 1)
        if flag > 0:
            img = img.transpose(Image.FLIP_LEFT_RIGHT)
    img = np.array(img).astype('float32')
    img = img.transpose((2, 0, 1))
    img = img / 255.0
    return img, label
def data_mapper_test(sample):
    img, label = sample
    img = Image.open(img)
    img = img.resize((100, 100), Image.ANTIALIAS)
    img = np.array(img).astype('float32')
    img = img.transpose((2, 0, 1))
    img = img / 255.0
    return img, label
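The preprocessing in both mappers is resize, then HWC-to-CHW transpose, then division by 255. The transpose-and-normalize step on its own, sketched with plain nested lists instead of numpy:

```python
def hwc_to_chw_normalized(img):
    # img: H x W x C nested lists of pixel values in [0, 255]
    # returns: C x H x W nested lists of floats in [0, 1]
    H, W, C = len(img), len(img[0]), len(img[0][0])
    return [[[img[h][w][c] / 255.0 for w in range(W)]
             for h in range(H)]
            for c in range(C)]
```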

5. Visualize the training/testing process

Four lists are defined to store the key metrics:

    testAccList = list()
    testLossList = list()
    trainAccList = list()
    trainLossList = list()

The official example was also modified. With the plain cross-entropy loss, loss = fluid.layers.cross_entropy(predict, label), the loss easily becomes nan when the values get extremely large or small, so softmax is used to squash the values into [0, 1] by fusing it into the loss: loss = fluid.layers.softmax_with_cross_entropy(predict, label). The learning-rate schedule uses cosine annealing: fluid.layers.cosine_decay.
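The nan issue can be reproduced without Paddle: applying softmax and then the log naively overflows for large logits, while the fused form uses the log-sum-exp trick and stays finite. A minimal sketch in plain Python (the function names are illustrative, not Paddle APIs):

```python
import math

def xent_naive(logits, label):
    # softmax then -log(p): math.exp overflows for large logits
    exps = [math.exp(z) for z in logits]
    p = exps[label] / sum(exps)
    return -math.log(p)

def xent_fused(logits, label):
    # log-sum-exp trick: subtract the max logit before exponentiating
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return lse - logits[label]
```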

The complete code:

import os
import time
import random
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import paddle
import paddle.fluid as fluid
import paddle.fluid.layers as layers
from multiprocessing import cpu_count
from paddle.fluid.layer_helper import LayerHelper
from paddle.fluid.dygraph.nn import Conv2D, Pool2D, BatchNorm, Linear
from paddle.fluid.dygraph.base import to_variable

# ResNet uses BatchNorm layers; adding BatchNorm after each convolution improves numerical stability
# Define the conv + batch-norm block
class ConvBNLayer(fluid.dygraph.Layer):
    def __init__(self,
                 num_channels,
                 num_filters,
                 filter_size,
                 stride=1,
                 groups=1,
                 act=None,
                 param_attr = fluid.initializer.Xavier(uniform=False)):
        """
        name_scope, 模塊的名字
        num_channels, 卷積層的輸入通道數
        num_filters, 卷積層的輸出通道數
        stride, 卷積層的步幅
        groups, 分組卷積的組數,默認groups=1不使用分組卷積
        act, 激活函數類型,默認act=None不使用激活函數
        """
        super(ConvBNLayer, self).__init__()

        # Create the conv layer
        self._conv = Conv2D(
            num_channels=num_channels,
            num_filters=num_filters,
            filter_size=filter_size,
            stride=stride,
            padding=(filter_size - 1) // 2,
            groups=groups,
            act=None,
            bias_attr=False,
            param_attr=param_attr)

        # Create the BatchNorm layer
        self._batch_norm = BatchNorm(num_filters, act=act)

    def forward(self, inputs):
        y = self._conv(inputs)
        y = self._batch_norm(y)
        return y

# Define the residual block
# Each residual block applies three convolutions to its input, then adds a shortcut connection
# If the third conv's output shape differs from the input, a 1x1 conv adjusts the input to match
class BottleneckBlock(fluid.dygraph.Layer):
    def __init__(self,
                 name_scope,
                 num_channels,
                 num_filters,
                 stride,
                 shortcut=True):
        super(BottleneckBlock, self).__init__(name_scope)
        # First conv layer, 1x1
        self.conv0 = ConvBNLayer(
            num_channels=num_channels,
            num_filters=num_filters,
            filter_size=1,
            act='leaky_relu')
        # Second conv layer, 3x3
        self.conv1 = ConvBNLayer(
            num_channels=num_filters,
            num_filters=num_filters,
            filter_size=3,
            stride=stride,
            act='leaky_relu')
        # Third conv layer, 1x1, with the output channel count multiplied by 4
        self.conv2 = ConvBNLayer(
            num_channels=num_filters,
            num_filters=num_filters * 4,
            filter_size=1,
            act=None)

        # shortcut=True when conv2's output shape matches this block's input
        # otherwise shortcut=False and a 1x1 conv reshapes the input to match conv2's output
        if not shortcut:
            self.short = ConvBNLayer(
                num_channels=num_channels,
                num_filters=num_filters * 4,
                filter_size=1,
                stride=stride)

        self.shortcut = shortcut

        self._num_channels_out = num_filters * 4

    def forward(self, inputs):
        y = self.conv0(inputs)
        conv1 = self.conv1(y)
        conv2 = self.conv2(conv1)

        # If shortcut=True, add the inputs directly to conv2's output
        # otherwise pass the inputs through a conv first so the shapes match
        if self.shortcut:
            short = inputs
        else:
            short = self.short(inputs)

        y = fluid.layers.elementwise_add(x=short, y=conv2)
        layer_helper = LayerHelper(self.full_name(), act='relu')
        return layer_helper.append_activation(y)

# Define the ResNet model
class ResNet(fluid.dygraph.Layer):
    def __init__(self, name_scope, layers=50, class_dim=1):
        """
        name_scope, module name
        layers, network depth: 18, 50, 101 or 152
        class_dim, number of classes
        """
        super(ResNet, self).__init__(name_scope)
        self.layers = layers
        supported_layers = [18, 50, 101, 152]
        assert layers in supported_layers, \
            "supported layers are {} but input layer is {}".format(supported_layers, layers)

        if layers == 50:
            # ResNet-50: stages 2 to 5 contain 3, 4, 6, 3 residual blocks
            depth = [3, 4, 6, 3]
        elif layers == 101:
            # ResNet-101: stages 2 to 5 contain 3, 4, 23, 3 residual blocks
            depth = [3, 4, 23, 3]
        elif layers == 152:
            # ResNet-152: stages 2 to 5 contain 3, 8, 36, 3 residual blocks
            depth = [3, 8, 36, 3]
        elif layers == 18:
            # ResNet-18-style variant: stages 2 to 5 contain 2, 2, 2, 2 residual blocks
            depth = [2, 2, 2, 2]
        # Output channel counts of the convolutions in each stage
        num_filters = [64, 128, 256, 512]

        # First ResNet stage: one 7x7 conv followed by a max-pool layer
        self.conv = ConvBNLayer(
            num_channels=3,
            num_filters=64,
            filter_size=7,
            stride=2,
            act='relu')
        self.pool2d_max = Pool2D(
            pool_size=3,
            pool_stride=2,
            pool_padding=1,
            pool_type='max')

        # ResNet stages two to five: c2, c3, c4, c5
        self.bottleneck_block_list = []
        num_channels = 64
        for block in range(len(depth)):
            shortcut = False
            for i in range(depth[block]):
                bottleneck_block = self.add_sublayer(
                    'bb_%d_%d' % (block, i),
                    BottleneckBlock(
                        self.full_name(),
                        num_channels=num_channels,
                        num_filters=num_filters[block],
                        stride=2 if i == 0 and block != 0 else 1, # c3, c4, c5 use stride=2 in their first residual block; every other block uses stride=1
                        shortcut=shortcut))
                num_channels = bottleneck_block._num_channels_out
                self.bottleneck_block_list.append(bottleneck_block)
                shortcut = True

        # Global average pooling over the c5 feature map
        self.pool2d_avg = Pool2D(pool_size=7, pool_type='avg', global_pooling=True)

        # stdv bounds the uniform random initialization of the fully connected layer
        import math
        stdv = 1.0 / math.sqrt(2048 * 1.0)

        # Fully connected output layer, one output per class
        self.out = Linear(input_dim=2048, output_dim=class_dim,
                      param_attr=fluid.param_attr.ParamAttr(
                          initializer=fluid.initializer.Uniform(-stdv, stdv)))

    def forward(self, inputs):
        y = self.conv(inputs)
        y = self.pool2d_max(y)
        for bottleneck_block in self.bottleneck_block_list:
            y = bottleneck_block(y)
        y = self.pool2d_avg(y)
        y = fluid.layers.reshape(y, [y.shape[0], -1])
        y = self.out(y)
        return y
    

def data_mapper(sample):
    img, label = sample
    img = Image.open(img)
    img = img.resize((100, 100), Image.ANTIALIAS)
    img = np.array(img).astype('float32')
    img = img.transpose((2, 0, 1))
    img = img/255.0
    return img, label
# Define the training and test readers
def data_mapper_train(sample, enhance=False):
    img, label = sample
    img = Image.open(img)
    img = img.resize((100, 100), Image.ANTIALIAS)
    if enhance == True:
        # Data augmentation
        # random rotation angle (counter-clockwise or clockwise)
        angle = random.randint(0, 8)
        f = random.randint(0, 1)
        if f > 0:
            img = img.rotate(angle)
        else:
            img = img.rotate(-angle)
        # random horizontal flip
        flag = random.randint(0, 1)
        if flag > 0:
            img = img.transpose(Image.FLIP_LEFT_RIGHT)
    img = np.array(img).astype('float32')
    img = img.transpose((2, 0, 1))
    img = img / 255.0
    return img, label
def data_mapper_test(sample):
    img, label = sample
    img = Image.open(img)
    img = img.resize((100, 100), Image.ANTIALIAS)
    img = np.array(img).astype('float32')
    img = img.transpose((2, 0, 1))
    img = img/255.0
    return img, label
def data_reader(data_list_path, model):
    def reader():
        with open(data_list_path, 'r') as f:
            lines = f.readlines()
            for line in lines:
                img, label = line.split('\t')
                yield img, int(label)
    if model == "train":
        return paddle.reader.xmap_readers(data_mapper_train, reader, cpu_count(), 512)
    elif model == "test":
        return paddle.reader.xmap_readers(data_mapper_test, reader, cpu_count(), 512)
def shuffle_list(readFile, writeFile):
    from random import shuffle
    with open(readFile, 'r') as f:
        lines = f.readlines()
    shuffle(lines)  # shuffle the list in place
    with open(writeFile, 'w') as f:
        for data in lines:
            f.write(data)
# # Generate the image lists
# data_path = 'data/data23668/Dataset'
# character_folders = os.listdir(data_path)
# # print(character_folders)
# if (os.path.exists('./train_data.list')):
#     os.remove('./train_data.list')
# if (os.path.exists('./test_data.list')):
#     os.remove('./test_data.list')

# for character_folder in character_folders:

#     with open('./train_data.list', 'a') as f_train:
#         with open('./test_data.list', 'a') as f_test:
#             if character_folder == '.DS_Store':
#                 continue
#             character_imgs = os.listdir(os.path.join(data_path, character_folder))
#             count = 0
#             for img in character_imgs:
#                 if img == '.DS_Store':
#                     continue
#                 if count % 10 == 0:
#                     f_test.write(os.path.join(data_path, character_folder, img) + '\t' + character_folder + '\n')
#                 else:
#                     f_train.write(os.path.join(data_path, character_folder, img) + '\t' + character_folder + '\n')
#                 count += 1
# print('Lists generated')
if __name__ == "__main__":
    
    # 打亂訓練集
    # shuffle_list('./train_data.list', './shuffle_train_data.list')
    # 用於訓練的數據提供器
    train_reader = paddle.batch(reader=paddle.reader.shuffle(reader=data_reader('./shuffle_train_data.list', model="train"), buf_size=256), batch_size=32)
    # 用於測試的數據提供器
    test_reader = paddle.batch(reader=data_reader('./test_data.list', model="test"), batch_size=32)
    testAccList = list()
    testLossList =list()
    trainAccList = list()
    trainLossList =list()
    with fluid.dygraph.guard():
        # model = DensenNet(True)  # 模型實例化
        model = ResNet("ResNet", layers = 18, class_dim = 10)
        model.train()  # 訓練模式
        # opt = fluid.optimizer.SGDOptimizer(learning_rate=0.01,
        #                                    parameter_list=model.parameters())  # 優化器選用SGD隨機梯度下降,學習率爲0.001.
        # opt = fluid.optimizer.Momentum(learning_rate=0.001, momentum=0.9, parameter_list=model.parameters())
        opt=fluid.optimizer.AdamOptimizer(learning_rate=fluid.layers.cosine_decay( learning_rate = 1e-3, step_each_epoch=1000, epochs=60), parameter_list=model.parameters())
        epochs_num = 100  # 迭代次數
    
        for pass_num in range(epochs_num):
            trainACC = 0
            trainLoss = 0
            count = 0
            for batch_id, data in enumerate(train_reader()):
    
                images = np.array([x[0].reshape(3, 100, 100) for x in data], np.float32)
    
                labels = np.array([x[1] for x in data]).astype('int64')
                labels = labels[:, np.newaxis]
                # print(images.shape)
                image = fluid.dygraph.to_variable(images)
                label = fluid.dygraph.to_variable(labels)
                predict = model(image)  # forward pass
                # print(predict)
                sf_predict = fluid.layers.softmax(predict)
                # loss = fluid.layers.cross_entropy(predict, label)
                loss = fluid.layers.softmax_with_cross_entropy(predict, label)
                avg_loss = fluid.layers.mean(loss)  # mean loss over the batch

                acc = fluid.layers.accuracy(sf_predict, label)  # compute accuracy
                trainACC += acc.numpy()
                trainLoss += avg_loss.numpy()
                if batch_id != 0 and batch_id % 50 == 0:
                    print(
                        "train_pass:{},batch_id:{},train_loss:{},train_acc:{}".format(pass_num, batch_id, avg_loss.numpy(),
                                                                                      acc.numpy()))
    
                avg_loss.backward()
                opt.minimize(avg_loss)
                model.clear_gradients()
                count = batch_id
            trainAccList.append(trainACC/(count + 1))
            trainLossList.append(trainLoss/(count + 1))


        # Plot the curves
        plt.figure(dpi = 120)    
        train_x = range(len(trainAccList))
        train_y = trainAccList   
        plt.plot(train_x, train_y, label='Train')
     
        plt.legend(loc='upper right')
        plt.ylabel('ACC')
        plt.xlabel('Epoch')
        plt.savefig("ACC.png")
        plt.show()  
            
        plt.figure(dpi = 120)    
        train_x = range(len(trainLossList))
        train_y = trainLossList   
        plt.plot(train_x, train_y, label='Train')
    
        plt.legend(loc='upper right')
        plt.ylabel('Loss')
        plt.xlabel('Epoch')
        plt.savefig("Loss.png")
        plt.show()  
                    
            
        fluid.save_dygraph(model.state_dict(), 'MyDNN')  # save the model
    # Model validation
    with fluid.dygraph.guard():
        accs = []
        model_dict, _ = fluid.load_dygraph('MyDNN')
        # model = DensenNet()
        model = ResNet("ResNet", layers=18, class_dim=10)
        model.load_dict(model_dict)  # load model parameters
        model.eval()  # evaluation mode
        for batch_id, data in enumerate(test_reader()):  # test set
            images = np.array([x[0].reshape(3, 100, 100) for x in data], np.float32)
            labels = np.array([x[1] for x in data]).astype('int64')
            labels = labels[:, np.newaxis]
            image = fluid.dygraph.to_variable(images)
            label = fluid.dygraph.to_variable(labels)
            predict = model(image)
            acc = fluid.layers.accuracy(predict, label)
            accs.append(acc.numpy()[0])
            avg_acc = np.mean(accs)
        print(avg_acc)
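The fluid.layers.cosine_decay schedule used for the optimizer above anneals the learning rate per epoch from its initial value toward zero. Assuming the standard cosine-annealing formula (a sketch, not the Paddle implementation itself):

```python
import math

def cosine_decay_lr(base_lr, global_step, step_each_epoch, epochs):
    # standard per-epoch cosine annealing from base_lr down to 0
    epoch = global_step // step_each_epoch
    return base_lr * 0.5 * (math.cos(epoch * math.pi / epochs) + 1)
```

With the parameters used above (base_lr=1e-3, step_each_epoch=1000, epochs=60), the rate starts at 1e-3, halves by epoch 30, and reaches zero at epoch 60.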

Output:

......
train_pass:91,batch_id:50,train_loss:[7.75933e-06],train_acc:[1.]
train_pass:92,batch_id:50,train_loss:[9.98374e-07],train_acc:[1.]
train_pass:93,batch_id:50,train_loss:[4.098817e-05],train_acc:[1.]
train_pass:94,batch_id:50,train_loss:[1.8738103e-06],train_acc:[1.]
train_pass:95,batch_id:50,train_loss:[5.029129e-07],train_acc:[1.]
train_pass:96,batch_id:50,train_loss:[1.7801412e-05],train_acc:[1.]
train_pass:97,batch_id:50,train_loss:[7.979372e-06],train_acc:[1.]
train_pass:98,batch_id:50,train_loss:[1.0058262e-06],train_acc:[1.]
train_pass:99,batch_id:50,train_loss:[1.906916e-05],train_acc:[1.]
1.0

[Training accuracy curve (ACC.png)]
[Training loss curve (Loss.png)]
Trained on the cloud, test-set accuracy reached 99%, which is a respectable result.

6. Model testing

# Load an image for prediction
def load_image(path):
    img = Image.open(path)
    img = img.resize((100, 100), Image.ANTIALIAS)
    img = np.array(img).astype('float32')
    img = img.transpose((2, 0, 1))
    img = img / 255.0
    print(img.shape)
    return img

# Run inference in dygraph mode
with fluid.dygraph.guard():
    infer_path = '手勢.JPG'
    model = ResNet("ResNet", layers=18, class_dim=10)
    model_dict, _ = fluid.load_dygraph('MyDNN')
    model.load_dict(model_dict)  # load model parameters
    model.eval()  # evaluation mode
    infer_img = load_image(infer_path)
    infer_img = np.array(infer_img).astype('float32')
    infer_img = infer_img[np.newaxis, :, :, :]
    infer_img = fluid.dygraph.to_variable(infer_img)
    result = model(infer_img)
    display(Image.open('手勢.JPG'))
    print(np.argmax(result.numpy()))

Output:

(3, 100, 100)
5
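The predicted class printed above is simply the argmax over the ten output logits. The same selection in isolation:

```python
def argmax(scores):
    # index of the largest score = predicted class
    best = 0
    for i, v in enumerate(scores):
        if v > scores[best]:
            best = i
    return best
```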

[The input gesture image]
