Dog Breed Identification in Practice (tf2.0)

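In the earlier post "Deep Learning: Cats vs. Dogs" we covered a classifier that separates cats from dogs well. But what if we want to identify the breed of a cat or a dog? Given a picture of a dog, for example, is it a bulldog, a corgi, or a Chinese rural dog?
The cat-vs-dog network is clearly too simple for this task. Cats and dogs look very different, so a fairly shallow network is enough to tell them apart. Two dog breeds, however, may differ only slightly, so we need a deeper network for feature extraction. My feeling was that resnet50 would work well, so this post combines resnet50 with vgg16 to identify dog breeds. The model is powerful: after only 7 epochs of training, accuracy on the training set reached 99.27% and accuracy on the validation set reached 66%.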

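Note: the framework used in this post is tensorflow2.0. It is simple and well suited to beginners, so I recommend searching for an installation guide and installing tensorflow2.0 first.

First, here are the predictions the model gave for a few pictures I found on Baidu:

The test pictures came from Baidu Images. Only after testing did I learn that the "erha" (husky) is a sled dog, hahaha. Four of the five test pictures were predicted correctly; the Shiba Inu was predicted wrong.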

The post is organized as follows:

Data acquisition
Data processing
Model import
Training and prediction
Full code

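1. Data acquisition

This post uses a dataset from kaggle (official download link; Baidu cloud drive download link, extraction code: xj81). The dataset contains 120 dog breeds.
It consists of a training set, a test set, and a csv file for the training set that maps each image name to its breed.

To keep things simple, this post uses only the training set and splits it into a training set and a validation set, so you only need to download train.zip and labels.csv.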

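2. Data processing

Data processing is the most tedious part and the least friendly to beginners, but read on patiently and it will pay off.

Building the data lists

In tensorflow2.0 I like to wrap the data in a dataset: it is easy to feed into the model, and it needs only small changes when switching to another framework (pytorch).
Here is the code. After reading the csv, the first 90% of the training set is used for training and the rest for validation. The labels have to be converted from breed names to integer indices (breeds_map and label_list in the code).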

import os
import pandas as pd

df_train = pd.read_csv("/content/gdrive/My Drive/dogs_breeds/labels.csv")
# List every breed once. set() removes duplicates; sorted() keeps the order
# identical across runs, so saved weights still match the label indices.
breeds_map = sorted(set(df_train['breed']))
# breeds_map contains names such as 'cardigan', 'norfolk_terrier', 'old_english_sheepdog', ...

# Build the list of training image names
filename_list = df_train['id']
# Convert each label from a breed name to its index in breeds_map
label_list = [breeds_map.index(label_name) for label_name in df_train['breed']]
filename_list = ['/content/gdrive/My Drive/dogs_breeds/train/train/{}.jpg'.format(x) for x in filename_list]

# Where the model weights are saved
weight_path = os.path.join("gdrive", "My Drive", "dogs_breeds", "weights", "mymodel.ckpt")
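The name-to-index step is easy to get subtly wrong, so here is a minimal, framework-free sketch of it. The breed names in `breeds` are a made-up sample standing in for the `breed` column of the csv, and `breed_to_index` is a helper name I introduce only for illustration. It shows why a sorted list gives stable indices and how a dict lookup can replace repeated `list.index` calls.

```python
# Hypothetical sample of the `breed` column; the real values come from the csv.
breeds = ["pug", "beagle", "pug", "cardigan", "beagle"]

# Sorting the unique names gives the same order on every run,
# whereas list(set(...)) may change order between Python processes.
breeds_map = sorted(set(breeds))

# A dict lookup avoids rescanning the list for every row, as list.index would.
breed_to_index = {name: i for i, name in enumerate(breeds_map)}
label_list = [breed_to_index[name] for name in breeds]

print(breeds_map)   # ['beagle', 'cardigan', 'pug']
print(label_list)   # [2, 0, 2, 1, 0]
```

A stable order matters as soon as weights are reloaded in a new session: the index predicted by the model is only meaningful if it points at the same breed name as during training.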

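Building the dataset

First define the data-reading functions preprocess_for_train and preprocess_for_val.
Then use tensorflow's tf.data.Dataset.from_tensor_slices to build a dataset of image file names. The dataset holds file names rather than decoded images in order to save memory. The map method applies the reading function, and the batch method sets the batch size.
Here is the code. tf.io.read_file reads the raw file, decode_jpeg decodes it into an image tensor, and the final / 255.0 scales pixel values to [0, 1]. This is the most basic processing; you can also augment the images to improve accuracy.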

def preprocess_for_train(image_path, label):
  # Read the image with tf, resize it, and scale pixel values to [0, 1]
  image_string = tf.io.read_file(image_path)
  # channels=3 forces RGB so every image in a batch has the same shape
  image_decoded = tf.image.decode_jpeg(image_string, channels=3)
  image_resized = tf.image.resize(image_decoded, [224, 224]) / 255.0

  return image_resized, label

def preprocess_for_val(image_path, label):
  # Read the image with tf, resize it, and scale pixel values to [0, 1]
  image_string = tf.io.read_file(image_path)
  image_decoded = tf.image.decode_jpeg(image_string, channels=3)
  image_resized = tf.image.resize(image_decoded, [224, 224]) / 255.0

  return image_resized, label

# batch_size is set in the training configuration (40 in the full code)
train_num = int(len(filename_list)*0.9)
train_dataset = tf.data.Dataset.from_tensor_slices((filename_list[:train_num], label_list[:train_num]))
tf.random.set_seed(1) # Set the random seed so the data order is reproducible across runs
train_dataset = train_dataset.shuffle(len(filename_list[:train_num])).map(preprocess_for_train).batch(batch_size)

val_dataset = tf.data.Dataset.from_tensor_slices((filename_list[train_num:], label_list[train_num:]))
tf.random.set_seed(2)
# The validation set goes through preprocess_for_val
val_dataset = val_dataset.shuffle(len(filename_list[train_num:])).map(preprocess_for_val).batch(batch_size)
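To get a feel for what the 0.9 split and the batch method produce, the arithmetic can be sketched without tensorflow. `split_counts` is a hypothetical helper, and the total of 1000 images is an invented figure chosen for easy numbers; the batch size of 40 is the value used in the full code. The sketch assumes the final partial batch is kept, which is the default behaviour of `batch`.

```python
import math

def split_counts(total, train_fraction=0.9, batch_size=40):
    """Return (train_size, val_size, train_batches, val_batches)."""
    train_size = int(total * train_fraction)
    val_size = total - train_size
    # The last, smaller batch is kept, hence the ceiling.
    train_batches = math.ceil(train_size / batch_size)
    val_batches = math.ceil(val_size / batch_size)
    return train_size, val_size, train_batches, val_batches

# Hypothetical dataset of 1000 images.
print(split_counts(1000))  # (900, 100, 23, 3)
```

The number of training batches is also the number of steps Keras reports per epoch, which is handy when checking that the whole dataset is actually being read.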

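Inspecting the dataset

You can pick 25 samples to take a look at the dataset.

# Show 25 dogs with their breeds. The larger the batch size, the slower this reads, so this part can be commented out.
for image, label in train_dataset.take(1):
  print(image.shape)
  # If the batch size is at least 25, take the first 25 items of the batch. If it is smaller, lower n so that n*n fits.
  n=5
  image, label = image[:n*n], label[:n*n]  
  plt.figure(figsize=(10,10))
  for i in range(n*n):
      plt.subplot(n,n,i+1)
      plt.xticks([])
      plt.yticks([])
      plt.grid(False)
      plt.imshow(image[i], cmap=plt.cm.binary)
      # Convert the label tensor to a Python int before indexing the list
      plt.xlabel(breeds_map[int(label[i])])
  plt.show()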

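3. Model import

Importing resnet50 and vgg16

As mentioned above, the models used in this post are resnet50 and vgg16. Both can be imported directly from tf, so there is no need to write them yourself.
Model import is covered in another blog post, so I won't repeat it here and will just call the library.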

import tensorflow as tf
from tensorflow.keras.layers import Dropout, Input, concatenate, GlobalAveragePooling2D, Conv2D, BatchNormalization,MaxPooling2D, Activation, Flatten, Dense

# image_shape is defined in the training configuration, (224, 224) in the full code.
# include_top=False drops the ImageNet classifier head, so no classes argument is needed.
vgg16 = tf.keras.applications.vgg16.VGG16(weights='imagenet', include_top=False, input_tensor=Input(
    shape=(image_shape[0], image_shape[1], 3)))
res50 = tf.keras.applications.resnet50.ResNet50(weights='imagenet', include_top=False, input_tensor=Input(
    shape=(image_shape[0], image_shape[1], 3)))

Combining the two models

class MyModel(tf.keras.Model):
    def __init__(self, n_class=2):
        super().__init__()
        self.vgg16_model = vgg16
        self.res50_model = res50
        self.global_pool = GlobalAveragePooling2D()
        # Integer division keeps the number of units an int
        self.conv_vgg = Dense(512 // 4, use_bias=False, kernel_initializer='uniform')
        self.conv_res = Dense(2048 // 4, use_bias=False, kernel_initializer='uniform')
        self.batch_normalize = BatchNormalization()
        self.batch_normalize_res = BatchNormalization()
        self.relu = Activation("relu")
        self.concat = concatenate
        self.dropout_1 = Dropout(0.3)
        self.conv_1 = Dense(640, use_bias=False, kernel_initializer='uniform')
        self.batch_normalize_1 = BatchNormalization()
        self.relu_1 = Activation("relu")
        self.dropout_2 = Dropout(0.5)
        self.classify = Dense(n_class, kernel_initializer='uniform', activation="softmax")

    def call(self, inputs):
        # VGG16 branch: backbone -> global pooling -> Dense -> BN -> ReLU
        x_vgg16 = self.vgg16_model(inputs)
        x_vgg16 = self.global_pool(x_vgg16)
        x_vgg16 = self.conv_vgg(x_vgg16)
        x_vgg16 = self.batch_normalize(x_vgg16)
        x_vgg16 = self.relu(x_vgg16)
        # ResNet50 branch: same structure
        x_res50 = self.res50_model(inputs)
        x_res50 = self.global_pool(x_res50)
        x_res50 = self.conv_res(x_res50)
        x_res50 = self.batch_normalize_res(x_res50)
        x_res50 = self.relu(x_res50)
        # Merge both feature vectors and classify
        x = self.concat([x_vgg16, x_res50])
        x = self.dropout_1(x)
        x = self.conv_1(x)
        x = self.batch_normalize_1(x)
        x = self.relu_1(x)
        x = self.dropout_2(x)
        x = self.classify(x)

        return x
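It can help to trace how wide the feature vector is at each stage of the combined head. The sketch below is only bookkeeping in plain Python, not part of the model; `head_widths` is a hypothetical helper, and it assumes the two backbones output 512 and 2048 channels respectively, as the 512/4 and 2048/4 layer sizes in the class suggest.

```python
def head_widths(n_class, vgg_channels=512, res_channels=2048):
    """Trace the feature-vector width through the combined head."""
    vgg_branch = vgg_channels // 4     # Dense layer after pooling the VGG16 features
    res_branch = res_channels // 4     # Dense layer after pooling the ResNet50 features
    merged = vgg_branch + res_branch   # concatenation along the feature axis
    return {
        "vgg_branch": vgg_branch,
        "res_branch": res_branch,
        "merged": merged,
        "hidden": 640,                 # width of the Dense layer after the merge
        "output": n_class,             # one softmax probability per breed
    }

print(head_widths(120))
```

With 120 breeds the merged vector has 128 + 512 = 640 features, which matches the 640 units of the Dense layer that follows the concatenation.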

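4. Model training and prediction

There is not much to say about training, so here is the code.
Pay special attention to the loss passed to compile: different loss functions expect labels in different forms. A loss function takes the predictions and the labels as input. The predictions always have the same form, a list of probabilities such as [0.1,0.4,0.1,0.1…] whose length equals the number of classes. Some losses need the label as an integer such as 5 (meaning the 6th class; the loss used in this post is of this kind). Others need the label as a one-hot list such as [0,1,0,0,0…] (meaning the 2nd class).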
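The two label forms describe the same quantity. As a rough illustration in plain Python (not the Keras implementation), the functions below compute cross-entropy for a single sample from an integer label and from a one-hot label. The four-class probability list is made up for the example.

```python
import math

def sparse_categorical_crossentropy(label_index, probs):
    """Loss when the label is a class index, e.g. 1."""
    return -math.log(probs[label_index])

def categorical_crossentropy(one_hot, probs):
    """Loss when the label is a one-hot list, e.g. [0, 1, 0, 0]."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

probs = [0.1, 0.4, 0.1, 0.4]   # hypothetical model output for 4 classes
sparse_loss = sparse_categorical_crossentropy(1, probs)
dense_loss = categorical_crossentropy([0, 1, 0, 0], probs)
print(sparse_loss, dense_loss)  # both are -log(0.4)
```

Since the integer form gives the same value without building a 120-element list per image, it is a natural fit for the integer labels produced during data processing.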

import tensorflow as tf
from tensorflow.keras.callbacks import ModelCheckpoint

model = MyModel(n_class)
# Loading previously trained weights speeds up convergence; comment this out for the first training run.
# model.load_weights(weight_path)

# decay shrinks the learning rate a little at every update step
optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate, decay=decay_rate)

# Create the checkpoint callback, which saves the model weights; it is passed to fit through callbacks
checkpoint_callback = ModelCheckpoint(
    weight_path, monitor='val_sparse_categorical_accuracy', verbose=1,
    save_best_only=False, save_weights_only=True,
    save_freq='epoch')

model.compile(
    optimizer=optimizer,
    loss=tf.keras.losses.sparse_categorical_crossentropy,
    metrics=[tf.metrics.SparseCategoricalAccuracy()]
)
model.fit(train_dataset, validation_data=val_dataset,epochs=num_epochs,callbacks=[checkpoint_callback])
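To see how small the effect of the decay setting is, here is a sketch of the schedule in plain Python. It assumes the `decay` argument of the tf2.0 Keras SGD optimizer follows the usual inverse-time rule, lr / (1 + decay * iterations), counted in update steps rather than epochs; `decayed_learning_rate` is a hypothetical helper, and the values 0.01 and 1e-6 are the ones from the full code.

```python
def decayed_learning_rate(initial_lr, decay, iterations):
    """Inverse-time decay: initial_lr / (1 + decay * iterations)."""
    return initial_lr / (1.0 + decay * iterations)

learning_rate = 0.01
decay_rate = 1e-6
for step in (0, 1000, 100000):
    print(step, decayed_learning_rate(learning_rate, decay_rate, step))
```

Under this assumption the learning rate only halves after a million update steps, so over the 10 epochs configured for this model it stays very close to 0.01.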

5. Full code

I debugged the code on colab, so I did not package it as a local project. If you are interested, you can restructure it yourself.

import os

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.layers import (Activation, BatchNormalization, Dense, Dropout,
                                     GlobalAveragePooling2D, Input, concatenate)

######## Training configuration (defined first because the datasets need batch_size)
weight_path = os.path.join("gdrive", "My Drive", "dogs_breeds", "weights", "mymodel.ckpt")

num_epochs = 10
learning_rate = 0.01
decay_rate = 1e-6 # Learning-rate decay, applied at every update step
image_shape = (224, 224)
batch_size=40  # Images per training step. Larger is faster but needs more memory; lower it on memory errors (minimum 1)
######## Training configuration done

######### Load the data
# Read the csv
df_train = pd.read_csv("/content/gdrive/My Drive/dogs_breeds/labels.csv")
# List every breed once, in a fixed (sorted) order
breeds_map = sorted(set(df_train['breed']))
n_class = len(breeds_map)
print("There are {} breeds".format(n_class))

##### Read the file names and labels
filename_list = df_train['id']
# Convert each label from a breed name to its index in breeds_map
label_list = [breeds_map.index(label_name) for label_name in df_train['breed']]
filename_list = ['/content/gdrive/My Drive/dogs_breeds/train/train/{}.jpg'.format(x) for x in filename_list]


##### Preprocessing functions
def preprocess_for_train(image_path, label):
  # Read the image with tf, resize it, and scale pixel values to [0, 1]
  image_string = tf.io.read_file(image_path)
  # channels=3 forces RGB so every image in a batch has the same shape
  image_decoded = tf.image.decode_jpeg(image_string, channels=3)
  image_resized = tf.image.resize(image_decoded, [224, 224]) / 255.0

  return image_resized, label

def preprocess_for_val(image_path, label):
  # Read the image with tf, resize it, and scale pixel values to [0, 1]
  image_string = tf.io.read_file(image_path)
  image_decoded = tf.image.decode_jpeg(image_string, channels=3)
  image_resized = tf.image.resize(image_decoded, [224, 224]) / 255.0

  return image_resized, label

# Build the datasets
train_num = int(len(filename_list)*0.9)
train_dataset = tf.data.Dataset.from_tensor_slices((filename_list[:train_num], label_list[:train_num]))
tf.random.set_seed(1) # Set the random seed so the data order is reproducible across runs
train_dataset = train_dataset.shuffle(len(filename_list[:train_num])).map(preprocess_for_train).batch(batch_size)

val_dataset = tf.data.Dataset.from_tensor_slices((filename_list[train_num:], label_list[train_num:]))
tf.random.set_seed(2)
val_dataset = val_dataset.shuffle(len(filename_list[train_num:])).map(preprocess_for_val).batch(batch_size)

######### Data loading done

########## Inspect the data
# Show 25 dogs with their breeds. The larger the batch size, the slower this reads, so this part can be commented out.
for image, label in train_dataset.take(1):
  print(image.shape)
  # If the batch size is at least 25, take the first 25 items of the batch. If it is smaller, lower n so that n*n fits.
  n=5
  image, label = image[:n*n], label[:n*n]  
  plt.figure(figsize=(10,10))
  for i in range(n*n):
      plt.subplot(n,n,i+1)
      plt.xticks([])
      plt.yticks([])
      plt.grid(False)
      plt.imshow(image[i], cmap=plt.cm.binary)
      # Convert the label tensor to a Python int before indexing the list
      plt.xlabel(breeds_map[int(label[i])])
  plt.show()
######### Data inspection done

######## Build the model
#### Import the backbones
# include_top=False drops the ImageNet classifier head, so no classes argument is needed.
vgg16 = tf.keras.applications.vgg16.VGG16(weights='imagenet', include_top=False, input_tensor=Input(
    shape=(image_shape[0], image_shape[1], 3)))
res50 = tf.keras.applications.resnet50.ResNet50(weights='imagenet', include_top=False, input_tensor=Input(
    shape=(image_shape[0], image_shape[1], 3)))

#### Combine the backbones
class MyModel(tf.keras.Model):
    def __init__(self, n_class=2):
        super().__init__()
        self.vgg16_model = vgg16
        self.res50_model = res50
        self.global_pool = GlobalAveragePooling2D()
        # Integer division keeps the number of units an int
        self.conv_vgg = Dense(512 // 4, use_bias=False, kernel_initializer='uniform')
        self.conv_res = Dense(2048 // 4, use_bias=False, kernel_initializer='uniform')
        self.batch_normalize = BatchNormalization()
        self.batch_normalize_res = BatchNormalization()
        self.relu = Activation("relu")
        self.concat = concatenate
        self.dropout_1 = Dropout(0.3)
        self.conv_1 = Dense(640, use_bias=False, kernel_initializer='uniform')
        self.batch_normalize_1 = BatchNormalization()
        self.relu_1 = Activation("relu")
        self.dropout_2 = Dropout(0.3)
        self.classify = Dense(n_class, kernel_initializer='uniform', activation="softmax")

    def call(self, inputs):
        # VGG16 branch
        x_vgg16 = self.vgg16_model(inputs)
        x_vgg16 = self.global_pool(x_vgg16)
        x_vgg16 = self.conv_vgg(x_vgg16)
        x_vgg16 = self.batch_normalize(x_vgg16)
        x_vgg16 = self.relu(x_vgg16)
        # ResNet50 branch
        x_res50 = self.res50_model(inputs)
        x_res50 = self.global_pool(x_res50)
        x_res50 = self.conv_res(x_res50)
        x_res50 = self.batch_normalize_res(x_res50)
        x_res50 = self.relu(x_res50)
        # Merge both feature vectors and classify
        x = self.concat([x_vgg16, x_res50])
        x = self.dropout_1(x)
        x = self.conv_1(x)
        x = self.batch_normalize_1(x)
        x = self.relu_1(x)
        x = self.dropout_2(x)
        x = self.classify(x)

        return x

######## Model building done

######### Train the model
model = MyModel(n_class)
# Loading previously trained weights speeds up convergence; comment this out for the first training run
# model.load_weights(weight_path)

optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate, decay=decay_rate)

# Save the weights at the end of every epoch
checkpoint_callback = ModelCheckpoint(
    weight_path, monitor='val_sparse_categorical_accuracy', verbose=1,
    save_best_only=False, save_weights_only=True,
    save_freq='epoch')

model.compile(
    optimizer=optimizer,
    loss=tf.keras.losses.sparse_categorical_crossentropy,
    metrics=[tf.metrics.SparseCategoricalAccuracy()]
)
model.fit(train_dataset, validation_data=val_dataset,epochs=num_epochs,callbacks=[checkpoint_callback])

######### Model training done

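6. Training results

The model is powerful: after only 7 epochs, training accuracy reached 99.27% and validation accuracy reached 66%. The large gap between the two shows that the model is overfitting; reducing the number of model parameters is worth trying.

Model prediction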

import os

# Load the dated checkpoint stored in the same folder as weight_path
model.load_weights(os.path.join(os.path.dirname(weight_path), 'mymodel_20191113.ckpt'))
image_path = "fadou.png" # Path of the input image


def preprocess_for_predict(image_path):
  # Read the image with tf, resize it, scale to [0, 1], and add a batch dimension
  image_string = tf.io.read_file(image_path)
  # decode_image detects the file format, so PNG inputs such as fadou.png also work
  image_decoded = tf.image.decode_image(image_string, channels=3)
  image_resized = tf.image.resize(image_decoded, [224, 224]) / 255.0
  image_resized = tf.expand_dims(image_resized, 0)

  return image_resized


image = preprocess_for_predict(image_path)
pred = model.predict(image)
pred = np.argmax(pred)
y_pred = breeds_map[pred]
print("Prediction: {}".format(y_pred))
plt.imshow(image[0], cmap=plt.cm.binary)
plt.xlabel(image_path.split('.')[0])
plt.show()
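When a prediction is wrong, looking at more than the single best class can show whether the right breed was at least a close second. The sketch below ranks a probability list in plain Python; `top_k` is a hypothetical helper, and the names and probabilities are invented stand-ins for breeds_map and for one row of the model's output.

```python
def top_k(probs, names, k=3):
    """Return the k most probable (name, probability) pairs, best first."""
    ranked = sorted(zip(names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

names = ["beagle", "cardigan", "pug", "shiba"]   # hypothetical breeds_map
probs = [0.05, 0.60, 0.10, 0.25]                 # hypothetical softmax output
for name, p in top_k(probs, names, k=2):
    print("{}: {:.2f}".format(name, p))
```

With the real model, the same function could be fed the first row of the predict output and breeds_map.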

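The prediction results are shown at the top of the post. The test pictures are:
Shiba Inu
Husky
French Bulldog
Chihuahua
Golden Retriever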

CreateBig
