Notes:
This is a machine learning assignment project from Udacity's nanodegree program.
I find it fairly representative: it combines several common deep learning techniques and is worth sharing. The implementation covers data augmentation, transfer learning, network model construction, training, and evaluation methods.
Only a transfer-learning implementation is done here; the focus is on the practical process, and the underlying theory is not analyzed.
Caveat: because the training dataset is small, the training results may not accurately reflect model performance. For example, ResNet50 with pre-trained weights actually reaches slightly lower accuracy after training than the same network with random initialization, which contradicts theory.
Project 3: Face Recognition
Welcome to the third project of the Machine Learning Engineer Nanodegree! In this file, some template code has already been provided for you, but you will need to implement additional functionality to make the project run successfully. You will not need to modify the provided code unless explicitly asked to. Each section comes with detailed instructions, and the parts you need to implement are marked with 'TODO' in the comments. Please read all the hints carefully!
In addition to implementing the code, you must also answer some questions related to the project and your implementation. Each question you need to answer is headed by 'Question X'. Read each question carefully and write a complete answer in the 'Answer' text box following it. Your submission will be graded on both your answers to the questions and the functionality implemented in your code.
Task Introduction
Face recognition is a computer vision task: given an image containing a face, identify who the person in the image is. Any discussion of face recognition is bound to mention FaceNet, developed by Google in 2015; the model is widely used thanks to its excellent performance, and the trained model has been open-sourced.
In this project we will study the face recognition task. We will first hand-build a convolutional neural network using what we learned in the course, then tackle the task again with an advanced architecture such as ResNet50, and finally use the pre-trained FaceNet model. Along the way, we will also apply data augmentation and face extraction techniques to improve recognition accuracy.
We will use an open dataset, the Five Celebrity Faces Dataset, which also appears in a Kaggle competition. It has already been downloaded to ./5-celebrity-faces-dataset. The dataset contains photos of five celebrities — Ben Affleck, Elton John, Jerry Seinfeld, Madonna, and Mindy Kaling — split into train and val folders.
Data Preparation
We will first take a quick look at the data, then process the images with data augmentation and face extraction. After completing these steps, you need to think about and answer the related questions.
Display an image
Store the file paths of all images under train in the images list, and the corresponding person names, in the same order, in images_name.
import cv2
import matplotlib.pyplot as plt
import os
import random
import pandas as pd
%matplotlib inline
data_root = "./5-celebrity-faces-dataset/train/"
import csv
def read_file_log(pathName, cvsName):
    # walk each person's sub-directory under pathName and write (file_path, label) rows to a CSV
    with open(cvsName, 'w', encoding='utf-8') as f_cvs:
        csv_writer = csv.writer(f_cvs)
        all_dirs = os.listdir(pathName)
        for dir_name in all_dirs:
            all_files = os.listdir(pathName + dir_name)
            for file_name in all_files:
                child = os.path.join('%s/%s/%s' % (pathName, dir_name, file_name))
                label = dir_name  # the folder name is the person's name, used as the label
                csv_writer.writerow([child, label])
train_log_file = './5-celebrity-faces-dataset/train_log.csv'
read_file_log(data_root, train_log_file)
def read_csv(file):
    with open(file) as csvfile:
        reader = csv.reader(csvfile)
        images = []
        images_name = []
        for line in reader:
            images.append(line[0])
            images_name.append(line[1])
    return images, images_name
# TODO: store all image file paths under train in the images list, and the corresponding person names in images_name
images, images_name = read_csv(train_log_file)
print(images[0])
print(images_name[0])
./5-celebrity-faces-dataset/train//elton_john/httpssmediacacheakpinimgcomxfefdacfbfdeadajpg.jpg
elton_john
Randomly pick one image from images, read it with cv2.imread, then display it with pyplot.imshow. Note: you must also display the person's name for the image and print the image's shape.
# TODO: randomly pick one image from images and get the person's name in it
from random import randrange
def random_sample(images=images, images_name=images_name):
    print("Randomly selected photo:")
    random_index = randrange(0, len(images))
    # TODO: pick a random image file path and the corresponding person name from images and images_name
    im_file, im_name = images[random_index], images_name[random_index]
    # TODO: read the image file with cv2.imread
    img = cv2.imread(im_file)
    img2 = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # cv2 loads images in BGR order by default
    # TODO: display the image with plt.imshow and plt.show()
    plt.imshow(img2)
    plt.show()
    # print the person's name
    print(im_name)
    # print the image's shape
    print(img.shape)
    return im_file, im_name
random_sample(images, images_name)
Randomly selected photo:
elton_john
(353, 236, 3)
('./5-celebrity-faces-dataset/train//elton_john/httpssmediacacheakpinimgcomxfefdacfbfdeadajpg.jpg',
'elton_john')
You can run the code above several times to look at more images and get a general feel for the data.
Read all the data with cv2.imread and store it in train_x; then label Ben Affleck, Elton John, Jerry Seinfeld, Madonna, and Mindy Kaling as 0, 1, 2, 3, 4 respectively, and store all images_name data in train_y.
train_x = []
train_y = []
dict_name = {'ben_afflek': 0,
             'elton_john': 1,
             'jerry_seinfeld': 2,
             'madonna': 3,
             'mindy_kaling': 4}
for file, name in zip(images, images_name):
    train_x.append(cv2.imread(file))
    train_y.append(dict_name[name])
print(train_y)
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
Data Augmentation
First, let's print the size of the training set.
print(len(images))
93
As we can see, the training set is quite small, which makes it harder for a model to learn the mapping from image data to person labels, so data augmentation is needed here. Here is a good resource to help you understand data augmentation — Data Augmentation.
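Before using Keras's built-in augmenter below, the idea can be sketched in plain numpy: each augmentation is a label-preserving transform of the pixel array, so one stored image yields many distinct training samples. The 4×4 "image" here is a toy stand-in, not project data:

```python
import numpy as np

rng = np.random.default_rng(0)
image = np.arange(16).reshape(4, 4)  # toy 4x4 grayscale image

def augment(img, rng):
    """Return a randomly flipped and shifted copy of img (the label is unchanged)."""
    out = img
    if rng.random() < 0.5:       # random horizontal flip
        out = out[:, ::-1]
    shift = rng.integers(-1, 2)  # random horizontal shift by -1, 0 or +1 pixels
    out = np.roll(out, shift, axis=1)
    return out

# one original image yields several distinct training samples
samples = [augment(image, rng) for _ in range(3)]
print(len(samples), samples[0].shape)
```

Both transforms only rearrange pixels, so every augmented sample keeps the same content (and label) as the original.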
from keras.preprocessing.image import ImageDataGenerator
# TODO: build the image data augmenter
data_gen = ImageDataGenerator(
    rescale=.1,              # fixed multiplier applied to every pixel value (not random)
    rotation_range=0.15,     # random rotation range, in degrees
    zoom_range=0.1,          # random zoom range
    width_shift_range=0.2,   # random horizontal shift (fraction of total width)
    height_shift_range=0.2,  # random vertical shift (fraction of total height)
    horizontal_flip=True,
)
Using TensorFlow backend.
Use flow_from_directory to iterate over the dataset ./5-celebrity-faces-dataset/data and observe the augmentation in action. First obtain an image iterator that reads images from the path one batch at a time and edits each image according to the augmenter's rules.
# Run directly: returns an image iterator that reads images from the path and edits them according to the augmenter's rules
dataflow_generator = data_gen.flow_from_directory(
    "./5-celebrity-faces-dataset/data",
    target_size=(160, 160),
    batch_size=3,
    color_mode='rgb',
    class_mode='categorical')
print(dataflow_generator.filenames)
Found 5 images belonging to 5 classes.
['ben_afflek/httpcsvkmeuaeccjpg.jpg', 'elton_john/httpftqncomymusicLxZeltonjohnjpg.jpg', 'jerry_seinfeld/httpgraphicsnytimescomimagessectionmoviesfilmographyWireImagejpg.jpg', 'madonna/httpiamediaimdbcomimagesMMVBMTANDQNTAxNDVeQTJeQWpwZBbWUMDIMjQOTYVUXCRALjpg.jpg', 'mindy_kaling/httpgonetworthcomwpcontentuploadsthumbsjpg.jpg']
# TODO: read 10 batches of images from the iterator and display them
from keras.preprocessing.image import array_to_img
sample_count = 10
i = 0
filenames = dataflow_generator.filenames
labels = dataflow_generator.class_indices
print(filenames)
print(len(filenames))
print(labels)
# image_data is a pair:
#   image_data[0] holds batch_size images
#   image_data[1] holds the corresponding batch_size labels
for image_data in dataflow_generator:
    # TODO: display the images with plt.imshow and plt.show()
    print(len(image_data[1]))
    for j in range(0, len(image_data[1])):
        if i >= 12:
            break
        plt.subplot(3, 5, 1 + i)
        image = image_data[0][j].astype('uint8')
        plt.imshow(array_to_img(image))
        i += 1
        print(image_data[1][j])        # label
        print(image_data[0][0].shape)  # image
    sample_count -= 1
    if sample_count <= 0:
        break
['ben_afflek/httpcsvkmeuaeccjpg.jpg', 'elton_john/httpftqncomymusicLxZeltonjohnjpg.jpg', 'jerry_seinfeld/httpgraphicsnytimescomimagessectionmoviesfilmographyWireImagejpg.jpg', 'madonna/httpiamediaimdbcomimagesMMVBMTANDQNTAxNDVeQTJeQWpwZBbWUMDIMjQOTYVUXCRALjpg.jpg', 'mindy_kaling/httpgonetworthcomwpcontentuploadsthumbsjpg.jpg']
5
{'ben_afflek': 0, 'elton_john': 1, 'jerry_seinfeld': 2, 'madonna': 3, 'mindy_kaling': 4}
3
[0. 0. 0. 0. 1.]
(160, 160, 3)
[0. 1. 0. 0. 0.]
(160, 160, 3)
[0. 0. 1. 0. 0.]
(160, 160, 3)
2
[0. 0. 0. 1. 0.]
(160, 160, 3)
[1. 0. 0. 0. 0.]
(160, 160, 3)
3
[0. 0. 0. 1. 0.]
(160, 160, 3)
[0. 0. 0. 0. 1.]
(160, 160, 3)
[0. 1. 0. 0. 0.]
(160, 160, 3)
2
[0. 0. 1. 0. 0.]
(160, 160, 3)
[1. 0. 0. 0. 0.]
(160, 160, 3)
3
[0. 0. 1. 0. 0.]
(160, 160, 3)
[0. 0. 0. 1. 0.]
(160, 160, 3)
2
3
2
3
2
Question 1: Looking at the face images above, briefly describe which augmentations appear in the generated images, then elaborate on your thoughts about data augmentation, including why data augmentation helps face recognition. You should consult some papers and list your references.
Answer:
Horizontal shifts, vertical shifts, and rotations.
CNNs process image features with (approximate) translation invariance and rotation invariance.
Since the same person can be photographed from different angles and compositions, data augmentation helps the model generalize better.
Data augmentation increases the amount of data, prevents overfitting, and strengthens the model's generalization. It produces more data to train the network, helping it generalize to images not present in the training set. As in this project, when the training set is small, modeling and learning is difficult; augmentation lets us generate enough data from very little to build an image classifier.
Common means of improving generalization are data augmentation and regularization. Typical augmentation methods include random cropping, random horizontal flips, shifts, rotations, added noise, and generative-network approaches (the first two are the most used and most effective).
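Random cropping, listed above among the most effective augmentations, can be sketched in a few lines of numpy (a random toy image stands in for real data):

```python
import numpy as np

rng = np.random.default_rng(42)

def random_crop(img, crop_h, crop_w, rng):
    """Cut a random crop_h x crop_w window out of an (H, W, C) image."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    return img[top:top + crop_h, left:left + crop_w]

img = rng.integers(0, 256, size=(160, 160, 3), dtype=np.uint8)
crop = random_crop(img, 140, 140, rng)
print(crop.shape)  # (140, 140, 3)
```

Each call samples a different window, so repeated crops of the same photo give the classifier slightly different views of the same face.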
Face Extraction
A commonly used image-processing technique in face recognition tasks is face detection. Face detection automatically locates the face region in an input image; concretely, it predicts a bounding box that localizes the face within the whole image, where the box is defined by the coordinates of its top-left corner plus the rectangle's width and height. Face detection is a fairly mature task; in this project we will use the Multi-Task Cascaded Convolutional Neural Network (MTCNN). You can also read the paper Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks to learn about the face detection task.
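A detector like MTCNN reports each face as a dict whose 'box' entry is [x, y, width, height]; turning that into an array slice looks like the sketch below. Note it clamps occasional negative coordinates with max(0, ·) (the notebook code below uses abs(), which behaves the same for small offsets) — this is an illustrative helper, not the project's function:

```python
import numpy as np

def crop_box(image_data, box):
    """Crop an (H, W, C) array using an [x, y, width, height] bounding box."""
    x, y, w, h = box
    x, y = max(0, x), max(0, y)  # detectors can return slightly negative coords
    return image_data[y:y + h, x:x + w]

img = np.zeros((100, 100, 3), dtype=np.uint8)
face = crop_box(img, [-3, 10, 40, 50])
print(face.shape)  # (50, 40, 3): rows = height, columns = width
```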
# Run the code below to install mtcnn
!pip install mtcnn
Requirement already satisfied: mtcnn in /home/leon/anaconda3/lib/python3.7/site-packages (0.1.0)
Requirement already satisfied: keras>=2.0.0 in /home/leon/anaconda3/lib/python3.7/site-packages (from mtcnn) (2.3.1)
Requirement already satisfied: opencv-python>=4.1.0 in /home/leon/anaconda3/lib/python3.7/site-packages (from mtcnn) (4.2.0.32)
Requirement already satisfied: keras-applications>=1.0.6 in /home/leon/anaconda3/lib/python3.7/site-packages (from keras>=2.0.0->mtcnn) (1.0.8)
Requirement already satisfied: keras-preprocessing>=1.0.5 in /home/leon/anaconda3/lib/python3.7/site-packages (from keras>=2.0.0->mtcnn) (1.1.0)
Requirement already satisfied: six>=1.9.0 in /home/leon/anaconda3/lib/python3.7/site-packages (from keras>=2.0.0->mtcnn) (1.13.0)
Requirement already satisfied: pyyaml in /home/leon/anaconda3/lib/python3.7/site-packages (from keras>=2.0.0->mtcnn) (5.2)
Requirement already satisfied: h5py in /home/leon/anaconda3/lib/python3.7/site-packages (from keras>=2.0.0->mtcnn) (2.8.0)
Requirement already satisfied: scipy>=0.14 in /home/leon/anaconda3/lib/python3.7/site-packages (from keras>=2.0.0->mtcnn) (1.3.2)
Requirement already satisfied: numpy>=1.9.1 in /home/leon/anaconda3/lib/python3.7/site-packages (from keras>=2.0.0->mtcnn) (1.17.4)
# define the face extraction function
from PIL import Image
from mtcnn.mtcnn import MTCNN
import numpy as np
def extract_face(filename, image_size=(160, 160)):
    # load the image
    image = Image.open(filename)
    # convert to RGB
    image = image.convert('RGB')
    # convert to a numpy.array
    image_data = np.asarray(image)
    # create a face detector
    detector = MTCNN()
    # detect faces in the image
    results = detector.detect_faces(image_data)
    # the result contains the bounding boxes of all faces in the image;
    # since our images contain only one face, we take the first result
    box_x, box_y, width, height = results[0]['box']
    # handle negative coordinates
    box_x, box_y = abs(box_x), abs(box_y)
    box_x_up, box_y_up = box_x + width, box_y + height
    # crop the face region
    face = image_data[box_y:box_y_up, box_x:box_x_up]
    print("face.shape", face.shape)
    # TODO: resize the extracted face image to the required size and return it as a numpy array
    image = Image.fromarray(face)
    image = image.resize(image_size)
    face_array = np.asarray(image)
    return face_array
ran_img_file, ran_img_name = random_sample()
img = extract_face(ran_img_file)
plt.imshow(img)
print(img.shape)
Randomly selected photo:
madonna
(315, 214, 3)
WARNING:tensorflow:From /home/leon/anaconda3/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:4070: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.
WARNING:tensorflow:From /home/leon/anaconda3/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:422: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
face.shape (44, 33, 3)
(160, 160, 3)
Question 2: By running the code above several times and observing the extracted face images, do you think face detection helps face recognition? Why? You should consult some papers and list your references.
Answer:
It certainly helps: only by accurately detecting the face region can we go on to recognize the face.
From a data-curation point of view, detecting whether an image contains a face at all helps clean the training data.
Face recognition needs both features and structure; the bounding box produced by face detection is the most basic piece of structure.
Face recognition pipelines often perform face alignment (Face Alignment), and face detection is the foundation of that step. Aligned face images make recognition easier for the model.
Real-world images may contain a lot of background rather than the face we care about. Face detection helps separate the face from the background, letting the model focus on facial features.
Bruce Cheen. Face detection and face recognition.
https://blog.csdn.net/czp_374/article/details/81162923
How to use the MTCNN and FaceNet models for face detection and recognition.
http://www.uml.org.cn/ai/201806124.asp
Constructing the Data
Now we can apply the data augmentation and face detection techniques above to construct the full dataset.
You can directly reuse the image augmenter data_gen defined earlier, and use ImageDataGenerator's random_transform to randomly augment individual images.
Also, before constructing the data, you need to build a mapping from person names to class indices, so that the string names can be converted to int classes when building the labels.
Coding exercise:
- Build a name dictionary mapping ben_afflek, elton_john, jerry_seinfeld, madonna, mindy_kaling to 0-1-2-3-4.
- Define a load_dataset function that walks the train or val folder, reads the 5 person sub-folders, and maps each folder name to a label from 0 to 4; then it reads every image file in each person's folder. Images under train must be augmented with data_gen.random_transform, augment_times times each; images under val need no augmentation.
def extract_face2(detector, image, image_size=(160, 160)):
    # image is already a numpy.array
    image_data = image
    # detect faces in the image using the shared detector
    results = detector.detect_faces(image_data)
    flag = True
    if len(results) > 0:
        # the result contains the bounding boxes of all faces; since our images
        # contain only one face, take the first result
        box_x, box_y, width, height = results[0]['box']
        # handle negative coordinates
        box_x, box_y = abs(box_x), abs(box_y)
        box_x_up, box_y_up = box_x + width, box_y + height
        # crop the face region
        face = image_data[box_y:box_y_up, box_x:box_x_up]
        # TODO: resize the extracted face image to the required size and return it as a numpy array
        flag = True
        face_array = cv2.resize(face, image_size, interpolation=cv2.INTER_CUBIC)
    else:
        flag = False
        face_array = None
    return flag, face_array
# TODO: build the name dictionary mapping ben_afflek, elton_john, jerry_seinfeld, madonna, mindy_kaling to 0-1-2-3-4
name_dict = {'ben_afflek': 0,
             'elton_john': 1,
             'jerry_seinfeld': 2,
             'madonna': 3,
             'mindy_kaling': 4}
# TODO: define the data loading function; data_dir is the file path, augment_times the number of augmentations,
# is_train distinguishes the training set from the test set (the test set needs no augmentation)
def load_dataset2(data_dir="./5-celebrity-faces-dataset/train/", augment_times=2, is_train=True):
    data_x = []
    data_y = []
    images = []
    labels = []
    if is_train:
        data_gen2 = ImageDataGenerator(rescale=.3, rotation_range=0.2, zoom_range=0.2,
                                       width_shift_range=0.2, height_shift_range=0.2)
        dataflow_generator2 = data_gen2.flow_from_directory(data_dir, target_size=(160, 160),
                                                            batch_size=1, color_mode='rgb',
                                                            class_mode='categorical')
        labels_dict = dataflow_generator2.class_indices
        print("labels_dict", labels_dict)
        sample_count = len(dataflow_generator2.filenames) * augment_times
        print('sample_count:', sample_count)
        for image_data in dataflow_generator2:  # iterate the generator defined above, not the earlier one
            for j in range(0, len(image_data[1])):
                image = image_data[0][j].astype('uint8')
                images.append(image)
                labels.append(image_data[1][j])
            sample_count -= 1
            if sample_count <= 0:
                images = np.array(images)
                labels = np.array(labels)
                break
    else:
        images = []
        labels = []
        all_dirs = os.listdir(data_dir)
        print("all_dirs:", all_dirs)
        for dir_name in all_dirs:
            all_files = os.listdir(data_dir + dir_name)
            for file_name in all_files:
                im_file = os.path.join('%s/%s/%s' % (data_dir, dir_name, file_name))
                img = Image.open(im_file)  # open by the full path
                img = img.convert('RGB')
                images.append(np.asarray(img))
                labels.append(name_dict[dir_name])
        images = np.array(images)
        labels = np.eye(5)[np.array(labels)]
    detector = MTCNN()
    for i in range(0, len(labels)):
        flag, face = extract_face2(detector, images[i])
        if flag:
            data_x.append(face)
            data_y.append(labels[i])
    data_x = np.array(data_x)
    data_y = np.array(data_y)
    return data_x, data_y
def extract_face3(detector, filename, image_size=(160, 160)):
    # load the image
    image = Image.open(filename)
    # convert to RGB
    image = image.convert('RGB')
    # convert to a numpy.array
    image_data = np.asarray(image)
    # detect faces in the image
    results = detector.detect_faces(image_data)
    # the result contains the bounding boxes of all faces in the image;
    # since our images contain only one face, we take the first result
    box_x, box_y, width, height = results[0]['box']
    # handle negative coordinates
    box_x, box_y = abs(box_x), abs(box_y)
    box_x_up, box_y_up = box_x + width, box_y + height
    # crop the face region
    face = image_data[box_y:box_y_up, box_x:box_x_up]
    # TODO: resize the extracted face image to the required size and return it as a numpy array
    image = Image.fromarray(face)
    image = image.resize(image_size)
    face_array = np.asarray(image)
    return face_array
# TODO: build the name dictionary mapping ben_afflek, elton_john, jerry_seinfeld, madonna, mindy_kaling to 0-1-2-3-4
name_dict = {'ben_afflek': 0,
             'elton_john': 1,
             'jerry_seinfeld': 2,
             'madonna': 3,
             'mindy_kaling': 4}
# TODO: define the data loading function; data_dir is the file path, augment_times the number of augmentations,
# is_train distinguishes the training set from the test set (the test set needs no augmentation)
def load_dataset(data_dir="./5-celebrity-faces-dataset/train/", augment_times=2, is_train=True):
    data_x = []
    data_y = []
    detector = MTCNN()
    # TODO:
    for subdir in os.listdir(data_dir):
        path = os.path.join(data_dir, subdir)
        for filename in os.listdir(path):
            face = extract_face3(detector, os.path.join(path, filename))
            data_x.append(face)
            data_y.append(name_dict[subdir])
            # test data needs no augmentation
            if is_train:
                for _ in range(augment_times):
                    face_aug = data_gen.random_transform(face)
                    data_x.append(face_aug)
                    data_y.append(name_dict[subdir])
    return data_x, data_y
train_x, train_y = load_dataset("./5-celebrity-faces-dataset/train/", augment_times=2, is_train=True)
test_x, test_y = load_dataset("./5-celebrity-faces-dataset/val/", is_train=False)
# finally construct the training and test data
train_X = np.asarray(train_x)
train_Y = np.eye(5)[np.array(train_y)]
test_X = np.asarray(test_x)
test_Y = np.eye(5)[np.array(test_y)]
print(train_X.shape)
print(train_Y.shape,train_Y)
(279, 160, 160, 3)
(279, 5) [[0. 1. 0. 0. 0.]
[0. 1. 0. 0. 0.]
[0. 1. 0. 0. 0.]
...
[0. 0. 0. 0. 1.]
[0. 0. 0. 0. 1.]
[0. 0. 0. 0. 1.]]
index = [i for i in range(len(train_y))]
random.shuffle(index)
train_X = train_X[index]
train_Y = train_Y[index]
print(len(train_X))
print(train_X[0].shape)
print(len(train_Y))
print(train_Y)
plt.imshow(train_X[10])
279
(160, 160, 3)
279
[[1. 0. 0. 0. 0.]
[0. 0. 0. 0. 1.]
[0. 0. 1. 0. 0.]
...
[0. 0. 1. 0. 0.]
[0. 0. 1. 0. 0.]
[0. 0. 0. 0. 1.]]
<matplotlib.image.AxesImage at 0x7f59b0724be0>
Build a Convolutional Neural Network
Create a convolutional neural network to classify the faces. At the end of your code block, run model.summary() to print a summary of your model.
Question 3: Try building a convolutional network architecture with Keras in the code block below, and answer the related questions.
- You may build your own convolutional network model; in that case, describe the concrete steps of your build (which layers you used) and why you built it this way.
- Or you may build the network following the steps suggested in the figure above; in that case, explain what performance such an architecture can achieve on this problem.
Answer:
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D
from keras.layers import Dropout, Flatten, Dense
from keras.models import Sequential
model = Sequential()
### TODO: define your network architecture
model.add(Conv2D(filters=32, kernel_size=3, padding='valid', activation='relu', input_shape=(160, 160, 3)))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.5))
model.add(Conv2D(filters=64, kernel_size=3, padding='valid', activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.5))
model.add(Conv2D(filters=128, kernel_size=3, padding='valid', activation='relu'))
model.add(GlobalAveragePooling2D())
model.add(Dropout(0.5))
model.add(Dense(5, activation='softmax'))
model.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_40 (Conv2D) (None, 158, 158, 32) 896
_________________________________________________________________
max_pooling2d_21 (MaxPooling (None, 79, 79, 32) 0
_________________________________________________________________
dropout_4 (Dropout) (None, 79, 79, 32) 0
_________________________________________________________________
conv2d_41 (Conv2D) (None, 77, 77, 64) 18496
_________________________________________________________________
max_pooling2d_22 (MaxPooling (None, 38, 38, 64) 0
_________________________________________________________________
dropout_5 (Dropout) (None, 38, 38, 64) 0
_________________________________________________________________
conv2d_42 (Conv2D) (None, 36, 36, 128) 73856
_________________________________________________________________
global_average_pooling2d_2 ( (None, 128) 0
_________________________________________________________________
dropout_6 (Dropout) (None, 128) 0
_________________________________________________________________
dense_23 (Dense) (None, 5) 645
=================================================================
Total params: 93,893
Trainable params: 93,893
Non-trainable params: 0
_________________________________________________________________
keras.callbacks.ModelCheckpoint(filepath, monitor='val_loss', verbose=0, save_best_only=False, save_weights_only=False, mode='auto', period=1)
filepath can contain named formatting options, which are filled by the epoch value and the keys of logs (passed via on_epoch_end).
For example: if filepath is weights.{epoch:02d}-{val_loss:.2f}.hdf5, the saved model files will be named with the epoch number and the validation loss.
Arguments
filepath: string, path to save the model.
monitor: quantity to monitor.
verbose: verbosity mode, 0 or 1.
save_best_only: if save_best_only=True, the best model according to the monitored quantity will not be overwritten.
mode: one of {auto, min, max}. If save_best_only=True, the decision to overwrite the saved file depends on the maximization or minimization of the monitored quantity. For val_acc the mode should be max, for val_loss it should be min, and so on. In auto mode, the direction is inferred automatically from the name of the monitored quantity.
save_weights_only: if True, only the model's weights are saved (model.save_weights(filepath)); otherwise the full model is saved (model.save(filepath)).
period: interval (number of epochs) between checkpoints.
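Since filepath is filled with ordinary Python str.format, you can preview what file names a pattern will produce without running any training:

```python
# the epoch number and the logs keys (e.g. val_loss) are substituted via str.format
pattern = 'weights.{epoch:02d}-{val_loss:.2f}.hdf5'
filename = pattern.format(epoch=7, val_loss=0.4567)
print(filename)  # weights.07-0.46.hdf5
```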
from keras.callbacks import ModelCheckpoint
# overwrite the checkpoint file whenever the monitored quantity improves;
# note: the default monitor 'val_loss' requires validation data in fit, otherwise saving is skipped (see the warning below)
checkpointer = ModelCheckpoint(filepath='face.weights.best.hdf5', verbose=1, save_best_only=True)
callbacks_list = [checkpointer]
# Run directly to compile and train the model
# compile the model
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
# train the model
history1 = model.fit(train_X, train_Y, batch_size=16, epochs=50, callbacks=callbacks_list)
Epoch 1/50
279/279 [==============================] - 7s 24ms/step - loss: 39.8889 - accuracy: 0.1828
Epoch 2/50
/home/leon/anaconda3/lib/python3.7/site-packages/keras/callbacks/callbacks.py:707: RuntimeWarning: Can save best model only with val_acc available, skipping.
'skipping.' % (self.monitor), RuntimeWarning)
...
Epoch 49/50
279/279 [==============================] - 6s 23ms/step - loss: 1.0212 - accuracy: 0.5986
Epoch 50/50
279/279 [==============================] - 6s 22ms/step - loss: 0.9619 - accuracy: 0.6057
Model Testing
You need to write a function that automatically measures the model's accuracy.
from sklearn.metrics import fbeta_score, accuracy_score
from sklearn.metrics import classification_report
def metric_accuracy(model, test_X, test_Y, model_name):
    preds_Y = model.predict(test_X)
    correct = 0.
    for pr, y in zip(preds_Y, test_Y):
        pr_cls = np.argmax(pr)  # predicted class = index of the highest probability
        if y[pr_cls] == 1:      # correct if the one-hot label has a 1 at that index
            correct += 1
    accuracy = correct / len(preds_Y)
    print()
    print("%s Accuracy: %.3f" % (model_name, accuracy))
def metric_accuracy2(model, test_X, test_Y, model_name):
    preds_Y = model.predict(test_X)
    # TODO: compute the accuracy from the predictions preds_Y and the ground truth test_Y
    max_index = np.argmax(preds_Y, axis=1)  # argmax along each row
    preds_onehot = np.zeros(preds_Y.shape)
    for i in range(0, len(max_index)):
        preds_onehot[i][max_index[i]] = 1
    accuracy = accuracy_score(test_Y, preds_onehot)  # sklearn expects (y_true, y_pred)
    print()
    print("%s Accuracy: %.3f" % (model_name, accuracy))
metric_accuracy(model, test_X, test_Y, "Simple CNN")
Simple CNN Accuracy: 0.600
metric_accuracy(model, train_X, train_Y, "Simple CNN")
Simple CNN Accuracy: 0.835
Advanced CNN Architecture: ResNet50
In computer vision there are sophisticated, advanced CNN architectures such as ResNet, VGG, and Inception, which can represent images very well. Moreover, these models have already been trained on very large image datasets, so the pre-trained large models provide excellent feature representations of images. Image features learned on large-scale image data can transfer to feature representations of face images.
In this section we use a pre-trained ResNet50 to extract image features and then perform face recognition. Although ResNet50 was pre-trained on all kinds of images, the structural image features it has learned can still help prediction in the face recognition task.
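The feature-extraction idea can be sketched independently of Keras: a frozen "backbone" maps images to feature vectors, and only a small classifier head on top would be trained. Here the backbone is a random stand-in function on tiny 8×8 fake images, not ResNet50:

```python
import numpy as np

rng = np.random.default_rng(0)

# frozen, pretend-pretrained backbone weights (a stand-in for ResNet50)
W_backbone = rng.normal(size=(8 * 8 * 3, 32))

def backbone(x):
    """Frozen feature extractor: flatten each image and project to 32-d features."""
    return np.maximum(x.reshape(len(x), -1) @ W_backbone, 0)  # ReLU

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

W_head = rng.normal(size=(32, 5)) * 0.01  # only the head would be trained
x = rng.random((4, 8, 8, 3))              # a tiny fake batch of 8x8 RGB "images"
probs = softmax(backbone(x) @ W_head)     # forward pass: frozen backbone + trainable head
print(probs.shape)                        # (4, 5): one 5-class distribution per image
```

The Keras code below does the same thing at full scale: ResNet50 plays the role of `backbone`, and the small Sequential model stacked on its output plays the role of the head.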
import keras
from keras.models import Model, Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
# the bottom of the model uses ResNet50 to model the raw images and extract features
resnet50_weights = "./models/resnet50_weights.h5"
resnet = keras.applications.resnet50.ResNet50(weights=None, include_top=False, input_shape=(160, 160, 3))
resnet.load_weights(resnet50_weights)
# TODO: define your own top of the model, using the extracted features for face recognition
resnet_face = Sequential()
resnet_face.add(Flatten(input_shape=resnet.output_shape[1:]))
resnet_face.add(Dense(1024, activation="relu"))
resnet_face.add(Dropout(0.5))
resnet_face.add(Dense(5, activation='softmax'))
resnet_face_model = Model(inputs=resnet.input, outputs=resnet_face(resnet.output))
resnet_face_model.summary()
/home/leon/anaconda3/lib/python3.7/site-packages/keras_applications/resnet50.py:265: UserWarning: The output shape of `ResNet50(include_top=False)` has been changed since Keras 2.2.0.
warnings.warn('The output shape of `ResNet50(include_top=False)` '
Model: "model_71"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_61 (InputLayer) (None, 160, 160, 3) 0
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D) (None, 166, 166, 3) 0 input_61[0][0]
__________________________________________________________________________________________________
conv1 (Conv2D) (None, 80, 80, 64) 9472 conv1_pad[0][0]
_______________________________________________________________________________________________
bn_conv1 (BatchNormalization) (None, 80, 80, 64) 256 conv1[0][0]
__________________________________________________________________________________________________
activation_295 (Activation) (None, 80, 80, 64) 0 bn_conv1[0][0] _________________________________________________________________________________
...
________________________________________________________________________________________________
bn5c_branch2c (BatchNormalizati (None, 5, 5, 2048) 8192 res5c_branch2c[0][0]
__________________________________________________________________________________________________
add_112 (Add) (None, 5, 5, 2048) 0 bn5c_branch2c[0][0]
activation_340[0][0]
__________________________________________________________________________________________________
activation_343 (Activation) (None, 5, 5, 2048) 0 add_112[0][0]
__________________________________________________________________________________________________
sequential_25 (Sequential) (None, 5) 52434949 activation_343[0][0]
==================================================================================================
Total params: 76,022,661
Trainable params: 75,969,541
Non-trainable params: 53,120
__________________________________________________________________________________________________
print(len(resnet_face_model.layers))
print(resnet_face_model.layers[0])
176
<keras.engine.input_layer.InputLayer object at 0x7f1023f25e80>
for layer in resnet_face_model.layers[:10]:
    layer.trainable = False
# use the same training settings; run directly
## compile the model
resnet_face_model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
# train the model
resnet_face_model.fit(train_X, train_Y, batch_size=16, epochs=50)
Epoch 1/50
279/279 [==============================] - 38s 137ms/step - loss: 19.1508 - accuracy: 0.3907
...
Epoch 49/50
279/279 [==============================] - 34s 121ms/step - loss: 0.5168 - accuracy: 0.9749
Epoch 50/50
279/279 [==============================] - 34s 121ms/step - loss: 1.2976 - accuracy: 0.9570
Model Testing
# Run directly to test the accuracy of resnet_face_model
metric_accuracy(resnet_face_model, test_X, test_Y, "ResNet50")
ResNet50 Accuracy: 0.560
metric_accuracy(resnet_face_model, train_X, train_Y, "ResNet50")
ResNet50 Accuracy: 0.556
Question 5: Comparing the results of the ResNet50 model and the plain CNN model, analyze why the ResNet50 model can achieve better results.
Answer:
1. ResNet50 is deeper than the plain CNN and has more convolutional filters, so it can capture more features.
2. ResNet50 comes with parameters already trained on a large amount of data.
Question 6: Above we used a pre-trained ResNet50, i.e. resnet.load_weights(resnet50_weights). Does loading the pre-trained parameters help on this task? You need to run a comparison experiment: re-run the ResNet50 model without loading the pre-trained parameters in the code box below, and use the result to discuss whether loading the pre-trained weights helps.
Answer:
1. With pre-loaded parameters, the model converges faster.
2. Judging from the experimental results, the pre-loaded parameters generalize worse than training all parameters from scratch? (This is the opposite of what I expected; I would appreciate the instructor's guidance.)
# Re-run the ResNet50 model without loading pre-trained parameters; write the complete code here
import keras
from keras.models import Model, Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
# the bottom of the model uses ResNet50 to model the raw images and extract features
resnet50_weights = "./models/resnet50_weights.h5"
resnet2 = keras.applications.resnet50.ResNet50(weights=None, include_top=False, input_shape=(160, 160, 3))
# resnet2.load_weights(resnet50_weights)  # deliberately not loaded for this comparison
# TODO: define your own top of the model, using the extracted features for face recognition
resnet_face2 = Sequential()
resnet_face2.add(Flatten(input_shape=resnet2.output_shape[1:]))
resnet_face2.add(Dense(1024, activation="relu"))
resnet_face2.add(Dropout(0.5))
resnet_face2.add(Dense(5, activation="softmax"))
resnet_face_model2 = Model(inputs=resnet2.input, outputs=resnet_face2(resnet2.output))
resnet_face_model2.summary()
Model: "model_33"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_33 (InputLayer) (None, 160, 160, 3) 0
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D) (None, 166, 166, 3) 0 input_33[0][0]
__________________________________________________________________________________________________
conv1 (Conv2D) (None, 80, 80, 64) 9472 conv1_pad[0][0]
...
__________________________________________________________________________________________________
add_96 (Add) (None, 5, 5, 2048) 0 bn5c_branch2c[0][0]
activation_291[0][0]
__________________________________________________________________________________________________
activation_294 (Activation) (None, 5, 5, 2048) 0 add_96[0][0]
__________________________________________________________________________________________________
sequential_9 (Sequential) (None, 5) 52434949 activation_294[0][0]
==================================================================================================
Total params: 76,022,661
Trainable params: 75,969,541
Non-trainable params: 53,120
__________________________________________________________________________________________________
/home/leon/anaconda3/lib/python3.7/site-packages/keras_applications/resnet50.py:265: UserWarning: The output shape of `ResNet50(include_top=False)` has been changed since Keras 2.2.0.
warnings.warn('The output shape of `ResNet50(include_top=False)` '
## compile the model
resnet_face_model2.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
# train the model
resnet_face_model2.fit(train_X, train_Y, batch_size=16, epochs=50)
Epoch 1/50
279/279 [==============================] - 39s 138ms/step - loss: 128.2206 - accuracy: 0.2151
...
Epoch 49/50
279/279 [==============================] - 34s 122ms/step - loss: 0.3852 - accuracy: 0.9211
Epoch 50/50
279/279 [==============================] - 34s 121ms/step - loss: 0.0479 - accuracy: 0.9892
metric_accuracy(resnet_face_model2, test_X, test_Y, "ResNet50")
ResNet50 Accuracy: 0.600
metric_accuracy(resnet_face_model2, train_X, train_Y, "ResNet50")
ResNet50 Accuracy: 0.753
FaceNet
In the previous section we used a pre-trained ResNet50 to extract image features; in this section we will use the pre-trained FaceNet to extract face features. We already know that ResNet50 learns image features from large-scale data of all kinds, not limited to face images, whereas FaceNet is a tool specialized in extracting features from faces.
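FaceNet maps each face to a 128-d embedding in which distance encodes identity, so recognition reduces to comparing embedding distances. A hedged numpy sketch, with random unit vectors standing in for real FaceNet outputs:

```python
import numpy as np

rng = np.random.default_rng(1)

def l2_normalize(v):
    return v / np.linalg.norm(v)

# stand-ins for FaceNet's 128-d embeddings
anchor = l2_normalize(rng.normal(size=128))                # a known face
same = l2_normalize(anchor + 0.02 * rng.normal(size=128))  # same person: a small perturbation
other = l2_normalize(rng.normal(size=128))                 # a different person

d_same = np.linalg.norm(anchor - same)
d_other = np.linalg.norm(anchor - other)
print(d_same < d_other)  # smaller distance means same identity
```

In practice one either thresholds this distance for verification or, as below, trains a small classifier head on top of the embeddings.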
from keras.models import load_model
# load the model
model = load_model('./models/facenet_keras.h5')
# summarize input and output shape
print(model.inputs)
print(model.outputs)
model.summary()
[<tf.Tensor 'input_1_1:0' shape=(?, 160, 160, 3) dtype=float32>]
[<tf.Tensor 'Bottleneck_BatchNorm/cond/Merge:0' shape=(?, 128) dtype=float32>]
Model: "inception_resnet_v1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 160, 160, 3) 0
__________________________________________________________________________________________________
Conv2d_1a_3x3 (Conv2D) (None, 79, 79, 32) 864 input_1[0][0]
__________________________________________________________________________________________________
Conv2d_1a_3x3_BatchNorm (BatchN (None, 79, 79, 32) 96 Conv2d_1a_3x3[0][0]
__________________________________________________________________________________________________
...
__________________________________________________________________________________________________
Block8_6_Branch_0_Conv2d_1x1_Ba (None, 3, 3, 192) 576 Block8_6_Branch_0_Conv2d_1x1[0][0
__________________________________________________________________________________________________
Block8_6_Branch_1_Conv2d_0c_3x1 (None, 3, 3, 192) 576 Block8_6_Branch_1_Conv2d_0c_3x1[0
__________________________________________________________________________________________________
Block8_6_Branch_0_Conv2d_1x1_Ac (None, 3, 3, 192) 0 Block8_6_Branch_0_Conv2d_1x1_Batc
__________________________________________________________________________________________________
Block8_6_Branch_1_Conv2d_0c_3x1 (None, 3, 3, 192) 0 Block8_6_Branch_1_Conv2d_0c_3x1_B
__________________________________________________________________________________________________
Block8_6_Concatenate (Concatena (None, 3, 3, 384) 0 Block8_6_Branch_0_Conv2d_1x1_Acti
Block8_6_Branch_1_Conv2d_0c_3x1_A
__________________________________________________________________________________________________
Block8_6_Conv2d_1x1 (Conv2D) (None, 3, 3, 1792) 689920 Block8_6_Concatenate[0][0]
__________________________________________________________________________________________________
Block8_6_ScaleSum (Lambda) (None, 3, 3, 1792) 0 Block8_5_Activation[0][0]
Block8_6_Conv2d_1x1[0][0]
__________________________________________________________________________________________________
AvgPool (GlobalAveragePooling2D (None, 1792) 0 Block8_6_ScaleSum[0][0]
__________________________________________________________________________________________________
Dropout (Dropout) (None, 1792) 0 AvgPool[0][0]
__________________________________________________________________________________________________
Bottleneck (Dense) (None, 128) 229376 Dropout[0][0]
__________________________________________________________________________________________________
Bottleneck_BatchNorm (BatchNorm (None, 128) 384 Bottleneck[0][0]
==================================================================================================
Total params: 22,808,144
Trainable params: 22,779,312
Non-trainable params: 28,832
__________________________________________________________________________________________________
/home/leon/anaconda3/lib/python3.7/site-packages/keras/engine/saving.py:341: UserWarning: No training configuration found in save file: the model was *not* compiled. Compile it manually.
warnings.warn('No training configuration found in save file: '
# TODO: use load_model to load the model from `./models/facenet_keras.h5`
from keras.models import load_model
from keras.models import Model, Sequential
from keras.layers import Dense, Dropout
# The bottom of the model uses FaceNet to encode the raw image (feature extraction).
# Load the pretrained FaceNet model.
facenet_model = load_model('./models/facenet_keras.h5')
# TODO: define our own top of the model, using the extracted features for face recognition
facenet_face = Sequential()
#facenet_face.add(Dense(1024, input_shape=facenet_model.output_shape, activation="relu"))
#facenet_face.add(Dropout(0.5))
facenet_face.add(Dense(5, activation='softmax'))
facenet_face_model = Model(inputs=facenet_model.input, outputs=facenet_face(facenet_model.output))
facenet_face_model.summary()
Model: "model_42"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 160, 160, 3) 0
__________________________________________________________________________________________________
Conv2d_1a_3x3 (Conv2D) (None, 79, 79, 32) 864 input_1[0][0]
__________________________________________________________________________________________________
Conv2d_1a_3x3_BatchNorm (BatchN (None, 79, 79, 32) 96 Conv2d_1a_3x3[0][0]
__________________________________________________________________________________________________
...
__________________________________________________________________________________________________
Block8_6_ScaleSum (Lambda) (None, 3, 3, 1792) 0 Block8_5_Activation[0][0]
Block8_6_Conv2d_1x1[0][0]
__________________________________________________________________________________________________
AvgPool (GlobalAveragePooling2D (None, 1792) 0 Block8_6_ScaleSum[0][0]
__________________________________________________________________________________________________
Dropout (Dropout) (None, 1792) 0 AvgPool[0][0]
__________________________________________________________________________________________________
Bottleneck (Dense) (None, 128) 229376 Dropout[0][0]
__________________________________________________________________________________________________
Bottleneck_BatchNorm (BatchNorm (None, 128) 384 Bottleneck[0][0]
__________________________________________________________________________________________________
sequential_21 (Sequential) (None, 5) 645 Bottleneck_BatchNorm[0][0]
==================================================================================================
Total params: 22,808,789
Trainable params: 22,779,957
Non-trainable params: 28,832
__________________________________________________________________________________________________
/home/leon/anaconda3/lib/python3.7/site-packages/keras/engine/saving.py:341: UserWarning: No training configuration found in save file: the model was *not* compiled. Compile it manually.
warnings.warn('No training configuration found in save file: '
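Note that the summary above reports almost all 22.8M parameters as trainable, i.e. the pretrained FaceNet base is fine-tuned together with the new head. On a dataset this small, a common alternative is to freeze the base and train only the new softmax layer. A minimal sketch (the `freeze_base` helper is an assumption, not part of the original code):

```python
# Freeze every layer of the pretrained base so that only the newly
# added softmax head is updated during training -- a common choice
# when the fine-tuning dataset is as small as this one.
def freeze_base(model):
    for layer in model.layers:
        layer.trainable = False
    return model

# Hypothetical usage with the models defined above (freeze before compiling):
# freeze_base(facenet_model)
# facenet_face_model.compile(optimizer='rmsprop',
#                            loss='categorical_crossentropy',
#                            metrics=['accuracy'])
```

Freezing trades some capacity for much lower risk of destroying the pretrained face features, which may also be relevant to the pretrained-ResNet50 result discussed at the top of this document.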
# Resize the training images to FaceNet's expected 160x160 input
train_XX = []
for x in train_X:
    x_image = Image.fromarray(x)
    x_image = x_image.resize((160, 160))
    train_XX.append(np.asarray(x_image))
train_X2 = np.array(train_XX)
train_X2[0].shape
(160, 160, 3)
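The resize loop above is repeated verbatim for the test set further down; a small helper (an assumption, built from the same PIL/NumPy calls as the original loop) removes the duplication:

```python
import numpy as np
from PIL import Image

def resize_faces(faces, size=(160, 160)):
    """Resize an iterable of HxWx3 uint8 face arrays to FaceNet's
    expected input size and stack them into a single batch array."""
    resized = [np.asarray(Image.fromarray(f).resize(size)) for f in faces]
    return np.array(resized)
```

With this helper, both preparation steps reduce to `train_X2 = resize_faces(train_X)` and `test_X2 = resize_faces(test_X)`.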
## Compile the model
facenet_face_model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
# Train the model
facenet_face_model.fit(train_X2, train_Y, batch_size=8, epochs=50)
Epoch 1/50
279/279 [==============================] - 29s 106ms/step - loss: 0.7062 - accuracy: 0.7491
Epoch 2/50
279/279 [==============================] - 18s 64ms/step - loss: 0.3518 - accuracy: 0.8781
...
Epoch 49/50
279/279 [==============================] - 18s 64ms/step - loss: 0.0768 - accuracy: 0.9749
Epoch 50/50
279/279 [==============================] - 18s 64ms/step - loss: 0.0727 - accuracy: 0.9821
<keras.callbacks.callbacks.History at 0x7f1188d7f160>
# Resize the test images the same way as the training images
test_XX = []
for x in test_X:
    x_image = Image.fromarray(x)
    x_image = x_image.resize((160, 160))
    test_XX.append(np.asarray(x_image))
test_X2 = np.array(test_XX)
test_X2[0].shape
preds_Y = facenet_face_model.predict(test_X2)
# Count predictions whose argmax matches the one-hot test label
correct = 0.
for pr, y in zip(preds_Y, test_Y):
    pr_cls = np.argmax(pr)
    if y[pr_cls] == 1:
        correct += 1
accuracy = correct / len(preds_Y)
print("FaceNet Accuracy: %.3f" % accuracy)
FaceNet Accuracy: 0.960
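The accuracy loop above can be collapsed into a single vectorized comparison (assuming `test_Y` is one-hot encoded, as the `y[pr_cls] == 1` indexing in the loop implies):

```python
import numpy as np

def onehot_accuracy(preds, labels):
    """Fraction of rows where the predicted class (argmax of the
    softmax output) matches the class of the one-hot label."""
    return float(np.mean(np.argmax(preds, axis=1) == np.argmax(labels, axis=1)))

# Hypothetical usage: onehot_accuracy(preds_Y, test_Y)
```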
Question 7: Evaluate the performance of the FaceNet model, and explain why FaceNet outperforms the ResNet from the previous section.
Answer:
FaceNet converges faster and reaches far higher accuracy (0.96 vs. 0.56–0.60 for ResNet50), because its parameters were pretrained specifically on face data, so its learned features transfer directly to this face recognition task.
Question 8:
- First, draw a table below listing all of the experimental results above.
- Then summarize the project: which techniques do you think play the most important role in recognition accuracy in this face recognition project? Analyze this in light of the experimental results above.
- Finally, briefly describe other techniques that would significantly help a face recognition task, and list your references.
Answer:
Summary of experimental results

Models | Accuracy
---|---
CNN | 0.44
ResNet50 (no pretrained weights) | 0.60
ResNet50 (pretrained weights) | 0.56
FaceNet | 0.96
The techniques that matter most for recognition accuracy in this face recognition project:
1. Face detection: accurately detecting and cropping the face;
2. Network architecture: deeper, more complex networks extract features more effectively;
3. Training data: a dataset enlarged by data augmentation trains better models.
references: