Implementing VGG16 Image Classification with TensorFlow

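This VGG16 image classifier is implemented in TensorFlow and consists of four programs:

vgg16.py: loads the model parameters and builds the model
utils.py: reads in the image and displays the probabilities
nclasses.py: contains the labels dictionary
app.py: the application, which performs image recognition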

1. vgg16.py: building the model

The program is structured as follows:

(1) __init__

Load the network parameters into data_dict.

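def __init__(self, vgg16_path=None):
    # Default to vgg16.npy in the current working directory
    if vgg16_path is None:
        vgg16_path = os.path.join(os.getcwd(), "vgg16.npy")
    # Load outside the if-block so that an explicit path is honoured as well;
    # allow_pickle=True is required by recent NumPy versions to load a pickled dict
    self.data_dict = np.load(vgg16_path, encoding='latin1', allow_pickle=True).item()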

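The dictionary keys are listed below. They correspond to the weights $W$ and biases $b$ of the 13 convolutional layers and the 3 fully connected layers.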

['conv1_1_W', 'conv1_1_b',
 'conv1_2_W', 'conv1_2_b',
 'conv2_1_W', 'conv2_1_b',
 'conv2_2_W', 'conv2_2_b',
 'conv3_1_W', 'conv3_1_b',
 'conv3_2_W', 'conv3_2_b',
 'conv3_3_W', 'conv3_3_b',
 'conv4_1_W', 'conv4_1_b',
 'conv4_2_W', 'conv4_2_b',
 'conv4_3_W', 'conv4_3_b',
 'conv5_1_W', 'conv5_1_b',
 'conv5_2_W', 'conv5_2_b',
 'conv5_3_W', 'conv5_3_b',
 'fc6_W', 'fc6_b',
 'fc7_W', 'fc7_b',
 'fc8_W', 'fc8_b']
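
To illustrate how a parameter dictionary of this kind round-trips through a .npy file, here is a minimal sketch. The dictionary `toy_params`, its tiny shapes, and the temporary file `toy.npy` are stand-ins invented for illustration; they are not the real contents of vgg16.npy.

```python
import os
import tempfile

import numpy as np

# Stand-in parameter dictionary (hypothetical, tiny shapes), not the real VGG16 weights.
toy_params = {
    "conv1_1_W": np.zeros((3, 3, 3, 4), dtype=np.float32),
    "conv1_1_b": np.zeros(4, dtype=np.float32),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy.npy")
    # np.save wraps the dict in a 0-d object array and pickles it.
    np.save(path, toy_params)
    # allow_pickle=True is needed to read object arrays; .item() unwraps the dict.
    data_dict = np.load(path, allow_pickle=True).item()

# Map each key to the shape of its array, in sorted key order.
param_shapes = {key: value.shape for key, value in sorted(data_dict.items())}
print(param_shapes)
```

Printing the shapes this way is a quick check that every expected layer has both a weight and a bias entry before the network is built.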

(2) forward

Reproduce the network structure.

def forward(self, images):
        # Scale pixel values from [0, 1] up to [0, 255]
        rgb_scaled = images * 255.0 
        # Convert RGB to BGR and subtract the per-channel mean
        red, green, blue = tf.split(rgb_scaled,3,3) 
        bgr = tf.concat([     
            blue - VGG_MEAN[0],
            green - VGG_MEAN[1],
            red - VGG_MEAN[2]],3)
        
        self.conv1_1 = self.conv_layer(bgr, "conv1_1") 
        self.conv1_2 = self.conv_layer(self.conv1_1, "conv1_2")
        self.pool1 = self.max_pool_2x2(self.conv1_2, "pool1")
        
        self.conv2_1 = self.conv_layer(self.pool1, "conv2_1")
        self.conv2_2 = self.conv_layer(self.conv2_1, "conv2_2")
        self.pool2 = self.max_pool_2x2(self.conv2_2, "pool2")

        self.conv3_1 = self.conv_layer(self.pool2, "conv3_1")
        self.conv3_2 = self.conv_layer(self.conv3_1, "conv3_2")
        self.conv3_3 = self.conv_layer(self.conv3_2, "conv3_3")
        self.pool3 = self.max_pool_2x2(self.conv3_3, "pool3")
        
        self.conv4_1 = self.conv_layer(self.pool3, "conv4_1")
        self.conv4_2 = self.conv_layer(self.conv4_1, "conv4_2")
        self.conv4_3 = self.conv_layer(self.conv4_2, "conv4_3")
        self.pool4 = self.max_pool_2x2(self.conv4_3, "pool4")
        
        self.conv5_1 = self.conv_layer(self.pool4, "conv5_1")
        self.conv5_2 = self.conv_layer(self.conv5_1, "conv5_2")
        self.conv5_3 = self.conv_layer(self.conv5_2, "conv5_3")
        self.pool5 = self.max_pool_2x2(self.conv5_3, "pool5")
        
        self.fc6 = self.fc_layer(self.pool5, "fc6") 
        self.relu6 = tf.nn.relu(self.fc6) 
        
        self.fc7 = self.fc_layer(self.relu6, "fc7")
        self.relu7 = tf.nn.relu(self.fc7)
        
        self.fc8 = self.fc_layer(self.relu7, "fc8")
        self.prob = tf.nn.softmax(self.fc8, name="prob")

        self.data_dict = None

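Note: the image must be converted from RGB to BGR. The main reason is that OpenCV uses BGR as its default channel order, a legacy convention kept for compatibility with certain hardware, and the pretrained parameters loaded here expect input in that order.

RGB stands for red, green, and blue: R occupies the high-order position, G the middle, and B the low-order position.
BGR holds the same three channels in reverse order.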
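
The channel swap and mean subtraction at the top of forward can be checked outside TensorFlow. The following is a NumPy sketch of the same steps. The values in `VGG_MEAN` are the per-channel ImageNet means in BGR order that are commonly used with VGG16 weights; they are an assumption here, since the article does not show how `VGG_MEAN` is defined, and `rgb_to_centered_bgr` is a hypothetical helper name.

```python
import numpy as np

# Per-channel means in BGR order, as commonly used with VGG16 weights
# (assumed; the article does not show the definition of VGG_MEAN).
VGG_MEAN = [103.939, 116.779, 123.68]


def rgb_to_centered_bgr(images):
    """Convert an NHWC RGB batch in [0, 1] to mean-subtracted BGR in [0, 255]."""
    rgb_scaled = images * 255.0
    # Split along the channel axis (axis 3), mirroring tf.split(rgb_scaled, 3, 3).
    red, green, blue = np.split(rgb_scaled, 3, axis=3)
    return np.concatenate(
        [blue - VGG_MEAN[0], green - VGG_MEAN[1], red - VGG_MEAN[2]], axis=3
    )


# A single pure-red pixel: R=1, G=0, B=0.
pixel = np.array([[[[1.0, 0.0, 0.0]]]])
bgr = rgb_to_centered_bgr(pixel)
print(bgr.shape)
print(bgr[0, 0, 0])
```

For the pure-red pixel, the red value ends up in the last channel, which confirms that the output order is blue, green, red.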

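Convolutional layer

def conv_layer(self, x, name):
    with tf.variable_scope(name):
        w = self.get_conv_filter(name)  # filter weights for this layer
        # Stride 1 with SAME padding keeps the spatial size unchanged
        conv = tf.nn.conv2d(x, w, [1, 1, 1, 1], padding='SAME')
        conv_biases = self.get_bias(name)  # biases for this layer
        result = tf.nn.relu(tf.nn.bias_add(conv, conv_biases))
        return result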

Pooling layer

def max_pool_2x2(self, x, name):
    # A 2x2 window with stride 2 halves the height and width
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name=name)

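Fully connected layer

def fc_layer(self, x, name):
    with tf.variable_scope(name):
        # Flatten all non-batch dimensions into a single vector
        shape = x.get_shape().as_list()
        dim = 1
        for i in shape[1:]:
            dim *= i
        x = tf.reshape(x, [-1, dim])
        w = self.get_fc_weight(name)
        b = self.get_bias(name)

        # Affine transform x*w + b; the activation is applied by the caller
        result = tf.nn.bias_add(tf.matmul(x, w), b)
        return result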
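
The flattening step in fc_layer depends on the shape of pool5, which follows from the layer stack above: the convolutions preserve height and width, and each of the five pooling layers halves them. A small sketch of that arithmetic follows; `same_pool_output` is a hypothetical helper name, and the 512 output channels of conv5_3 come from the standard VGG16 architecture rather than from code shown in this article.

```python
import math


def same_pool_output(size, stride=2):
    """Output length of a pooling layer with padding='SAME': ceil(size / stride)."""
    return math.ceil(size / stride)


side = 224                 # input height and width
sizes = [side]
for _ in range(5):         # pool1 ... pool5
    side = same_pool_output(side)
    sizes.append(side)

channels = 512             # output channels of conv5_3 in standard VGG16
flat_dim = side * side * channels
print(sizes)      # [224, 112, 56, 28, 14, 7]
print(flat_dim)   # 25088
```

So the loop over shape[1:] in fc_layer yields dim = 7 * 7 * 512 = 25088 for fc6.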

2. utils.py: image processing

Process the image into a tensor of shape $1 \times 224 \times 224 \times 3$.
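
The article does not show the code of utils.py, so the following is only a sketch of one way to reach that shape: center-crop to a square, resize to 224 x 224, scale to [0, 1], and add a batch axis. The function name `prepare_image`, the nearest-neighbour resize, and the random test image are all assumptions made for illustration.

```python
import numpy as np


def prepare_image(img, size=224):
    """Center-crop an HxWx3 uint8 image to a square, resize it, and add a batch axis."""
    height, width = img.shape[:2]
    short_edge = min(height, width)
    top = (height - short_edge) // 2
    left = (width - short_edge) // 2
    crop = img[top:top + short_edge, left:left + short_edge]
    # Nearest-neighbour resize by index lookup (a real pipeline would typically
    # use an image library's interpolation instead).
    idx = np.arange(size) * short_edge // size
    resized = crop[idx][:, idx]
    # Scale to [0, 1]; forward() multiplies by 255 again.
    scaled = resized.astype(np.float32) / 255.0
    return scaled.reshape(1, size, size, 3)


# Random stand-in for a decoded 300x400 RGB photo.
dummy = np.random.randint(0, 256, size=(300, 400, 3), dtype=np.uint8)
img_ready = prepare_image(dummy)
print(img_ready.shape)
```

The result has the shape of the images placeholder used in app.py, so an array like this could be passed through feed_dict.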

3. nclasses.py: labels dictionary

The format is as follows:

 0: 'tench\n Tinca tinca',
 1: 'goldfish\n Carassius auratus',
 2: 'great white shark\n white shark\n man-eater\n man-eating shark\n Carcharodon carcharias',
 3: 'tiger shark\n Galeocerdo cuvieri',
 4: 'hammerhead\n hammerhead shark',
 5: 'electric ray\n crampfish\n numbfish\n torpedo',
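
Each value packs several synonyms into one string, separated by a newline and a space. A short sketch of how such an entry can be split into clean names is given below; the helper names `primary_name` and `all_names` are hypothetical, and the dictionary is an excerpt of the entries shown above.

```python
# Excerpt of the labels dictionary, copied from the format shown above.
labels = {
    0: 'tench\n Tinca tinca',
    1: 'goldfish\n Carassius auratus',
    2: 'great white shark\n white shark\n man-eater\n man-eating shark\n Carcharodon carcharias',
}


def primary_name(label):
    """Return the first synonym of a label entry, without surrounding whitespace."""
    return label.split('\n')[0].strip()


def all_names(label):
    """Return every synonym of a label entry as a list of clean strings."""
    return [name.strip() for name in label.split('\n')]


print(primary_name(labels[2]))
print(all_names(labels[0]))
```

Using only the primary name keeps chart labels short when the top-5 results are displayed.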

4. app.py: main application

The recognition program is as follows:

with tf.Session() as sess:
    images = tf.placeholder(tf.float32, [1, 224, 224, 3])
    # Instantiate Vgg16; its __init__ reads the model parameters stored in the npy file
    vgg = vgg16.Vgg16() 
    vgg.forward(images)  # rebuild the network structure
    # Compute the probability distribution over the 1000 classes
    probability = sess.run(vgg.prob, feed_dict={images: img_ready})
    # Store the indices of the 5 highest probabilities in top5
    top5 = np.argsort(probability[0])[-1:-6:-1]
    print("top5:", top5)
    values = []
    bar_label = []  # names of the 5 top classes, looked up in the labels dictionary
    for n, i in enumerate(top5): 
        print("n:", n)
        print("i:", i)
        values.append(probability[0][i]) 
        bar_label.append(labels[i]) 
        print(i, ":", labels[i], "----", utils.percent(probability[0][i]))
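
The slice `[-1:-6:-1]` applied to the argsort result can look cryptic. Here is a small standalone sketch of the same top-5 selection on made-up scores for 8 classes; the `logits` values are illustrative only, and the softmax is written out in NumPy rather than taken from the model.

```python
import numpy as np

# Made-up scores for 8 classes (illustrative values only).
logits = np.array([0.5, 2.0, -1.0, 3.0, 0.0, 1.0, 2.5, -0.5])

# Softmax, shifted by the maximum for numerical stability.
exp = np.exp(logits - logits.max())
probability = exp / exp.sum()

# argsort is ascending; the slice [-1:-6:-1] walks backwards over the last
# five entries, giving the indices of the five largest probabilities.
top5 = np.argsort(probability)[-1:-6:-1]
print(top5)   # [3 6 1 5 0]
```

The same indices can be obtained with `np.argsort(probability)[::-1][:5]`, which some readers may find easier to follow.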

To download the source code, test images, and the VGG16 model, follow the WeChat official account 機器學習Zero and reply with "模型".
