Implementing VGG16 Image Classification with TensorFlow

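This VGG16 image classifier is implemented in TensorFlow and consists of four programs:

vgg16.py: loads the model parameters and builds the model
utils.py: loads the image and displays the probabilities
nclasses.py: contains the labels dictionary
app.py: the application that performs image recognition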

1. vgg16.py: building the model

The program is structured as follows:

(1) __init__

Load the network parameters into data_dict.

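def __init__(self, vgg16_path=None):
        # Default to vgg16.npy in the current working directory
        if vgg16_path is None:
            vgg16_path = os.path.join(os.getcwd(), "vgg16.npy")
        # Load outside the if-branch so an explicitly given path is loaded too.
        # allow_pickle=True is required by NumPy >= 1.16.3 to load a pickled dict.
        self.data_dict = np.load(vgg16_path, encoding='latin1', allow_pickle=True).item()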
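
To make the loading step concrete, here is a minimal sketch that round-trips a small dictionary through a .npy file. The file name toy_params.npy and the two toy entries are assumptions for illustration only; the real vgg16.npy is far larger.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for the real parameter file: a tiny dict of arrays.
params = {"conv1_1_W": np.zeros((3, 3, 3, 64), dtype=np.float32),
          "conv1_1_b": np.zeros(64, dtype=np.float32)}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_params.npy")
    np.save(path, params)  # the dict is pickled inside a 0-d object array
    # allow_pickle=True is needed on NumPy >= 1.16.3; .item() unwraps the dict
    data_dict = np.load(path, allow_pickle=True).item()

print(sorted(data_dict))
print(data_dict["conv1_1_W"].shape)
```

The call to .item() matters: np.load returns a zero-dimensional object array, and .item() extracts the dictionary stored inside it.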

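The dictionary keys are listed below. They correspond to the weights $W$ and biases $b$ of the 13 convolutional layers and 3 fully connected layers.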

['conv1_1_W', 'conv1_1_b',
 'conv1_2_W', 'conv1_2_b',
 'conv2_1_W', 'conv2_1_b',
 'conv2_2_W', 'conv2_2_b',
 'conv3_1_W', 'conv3_1_b',
 'conv3_2_W', 'conv3_2_b',
 'conv3_3_W', 'conv3_3_b',
 'conv4_1_W', 'conv4_1_b',
 'conv4_2_W', 'conv4_2_b',
 'conv4_3_W', 'conv4_3_b',
 'conv5_1_W', 'conv5_1_b',
 'conv5_2_W', 'conv5_2_b',
 'conv5_3_W', 'conv5_3_b',
 'fc6_W', 'fc6_b',
 'fc7_W', 'fc7_b',
 'fc8_W', 'fc8_b']
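
The naming pattern in this list is regular enough to generate. The following sketch rebuilds the key list from the number of convolutional layers per block, which can be handy for checking that a loaded parameter file is complete. The variable names are illustrative and not taken from the original code.

```python
# Number of convolutional layers in each of the five VGG16 blocks
convs_per_block = [2, 2, 3, 3, 3]

layer_names = [f"conv{b}_{i}"
               for b, n in enumerate(convs_per_block, start=1)
               for i in range(1, n + 1)]
layer_names += ["fc6", "fc7", "fc8"]

# Each layer contributes a weight key (_W) and a bias key (_b)
keys = [f"{name}_{suffix}" for name in layer_names for suffix in ("W", "b")]

print(len(layer_names))  # 16 layers
print(len(keys))         # 32 keys
```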

(2) forward

Reproduce the network structure.

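def forward(self, images):
        # Rescale the input pixel values by 255
        rgb_scaled = images * 255.0
        # Convert RGB to BGR and subtract the per-channel mean from each channel
        red, green, blue = tf.split(rgb_scaled, 3, 3)
        bgr = tf.concat([
            blue - VGG_MEAN[0],
            green - VGG_MEAN[1],
            red - VGG_MEAN[2]], 3)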
        
        self.conv1_1 = self.conv_layer(bgr, "conv1_1") 
        self.conv1_2 = self.conv_layer(self.conv1_1, "conv1_2")
        self.pool1 = self.max_pool_2x2(self.conv1_2, "pool1")
        
        self.conv2_1 = self.conv_layer(self.pool1, "conv2_1")
        self.conv2_2 = self.conv_layer(self.conv2_1, "conv2_2")
        self.pool2 = self.max_pool_2x2(self.conv2_2, "pool2")

        self.conv3_1 = self.conv_layer(self.pool2, "conv3_1")
        self.conv3_2 = self.conv_layer(self.conv3_1, "conv3_2")
        self.conv3_3 = self.conv_layer(self.conv3_2, "conv3_3")
        self.pool3 = self.max_pool_2x2(self.conv3_3, "pool3")
        
        self.conv4_1 = self.conv_layer(self.pool3, "conv4_1")
        self.conv4_2 = self.conv_layer(self.conv4_1, "conv4_2")
        self.conv4_3 = self.conv_layer(self.conv4_2, "conv4_3")
        self.pool4 = self.max_pool_2x2(self.conv4_3, "pool4")
        
        self.conv5_1 = self.conv_layer(self.pool4, "conv5_1")
        self.conv5_2 = self.conv_layer(self.conv5_1, "conv5_2")
        self.conv5_3 = self.conv_layer(self.conv5_2, "conv5_3")
        self.pool5 = self.max_pool_2x2(self.conv5_3, "pool5")
        
        self.fc6 = self.fc_layer(self.pool5, "fc6") 
        self.relu6 = tf.nn.relu(self.fc6) 
        
        self.fc7 = self.fc_layer(self.relu6, "fc7")
        self.relu7 = tf.nn.relu(self.fc7)
        
        self.fc8 = self.fc_layer(self.relu7, "fc8")
        self.prob = tf.nn.softmax(self.fc8, name="prob")

        self.data_dict = None

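Note: the image must be converted from RGB to BGR before it enters the network, because the loaded parameters expect BGR input. This traces back mainly to OpenCV, whose default channel order is BGR, a legacy of compatibility with certain hardware.

RGB stands for red, green, and blue, with R in the high-order position, G in the middle, and B in the low-order position.
BGR is the same, except that the channel order is reversed.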
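
The channel swap and mean subtraction can be sketched in plain NumPy, mirroring the tf.split and tf.concat calls in forward. The VGG_MEAN values below are an assumption: the article uses VGG_MEAN without showing its definition, and these are the per-channel means commonly paired with VGG weights, in B, G, R order.

```python
import numpy as np

# Assumed per-channel means in B, G, R order; not shown in the original code.
VGG_MEAN = [103.939, 116.779, 123.68]

def rgb_to_centered_bgr(images):
    """images: float array of shape (N, H, W, 3), RGB, values in [0, 1]."""
    rgb_scaled = images * 255.0
    red, green, blue = np.split(rgb_scaled, 3, axis=3)
    return np.concatenate([blue - VGG_MEAN[0],
                           green - VGG_MEAN[1],
                           red - VGG_MEAN[2]], axis=3)

# A single pure-red pixel
pixel = np.array([[[[1.0, 0.0, 0.0]]]])
bgr = rgb_to_centered_bgr(pixel)
print(bgr.shape)
print(bgr[0, 0, 0])
```

For the pure-red pixel, the red value ends up in the last channel, which is where BGR stores it.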

Convolutional layer

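def conv_layer(self, x, name):
    with tf.variable_scope(name):
        w = self.get_conv_filter(name)  # convolution kernel for this layer
        # Stride 1 with SAME padding keeps the spatial size unchanged
        conv = tf.nn.conv2d(x, w, [1, 1, 1, 1], padding='SAME')
        conv_biases = self.get_bias(name)  # bias for this layer
        # Add the bias, then apply the ReLU activation
        result = tf.nn.relu(tf.nn.bias_add(conv, conv_biases))
        return result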
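
As a rough illustration of what conv_layer computes for a single image, the sketch below implements a stride-1, zero-padded convolution followed by a bias and ReLU in NumPy. It is a slow reference loop meant for reading, not a replacement for tf.nn.conv2d, and the function name conv2d_same_relu is hypothetical.

```python
import numpy as np

def conv2d_same_relu(x, w, b):
    """x: (H, W, Cin); w: (kh, kw, Cin, Cout) with odd kh, kw; b: (Cout,)."""
    kh, kw, _, cout = w.shape
    ph, pw = kh // 2, kw // 2
    # Zero-pad so that the output has the same height and width as the input
    xp = np.pad(x, ((ph, ph), (pw, pw), (0, 0)))
    h, wd, _ = x.shape
    out = np.zeros((h, wd, cout))
    for i in range(h):
        for j in range(wd):
            patch = xp[i:i + kh, j:j + kw, :]
            # Sum of elementwise products over the window and input channels
            out[i, j] = np.tensordot(patch, w, axes=([0, 1, 2], [0, 1, 2])) + b
    return np.maximum(out, 0.0)  # ReLU

x = np.ones((4, 4, 1))
w = np.ones((3, 3, 1, 1))
b = np.array([-5.0])
y = conv2d_same_relu(x, w, b)
print(y[:, :, 0])
```

With an all-ones input and kernel, interior pixels see nine ones while corners see only four, so after the bias of -5 the corners are clipped to zero by the ReLU.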

Pooling layer

def max_pool_2x2(self, x, name):
    # 2x2 window with stride 2 halves the height and width
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name=name)
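
The effect of the pooling layer can be sketched in NumPy as well. The example below pools a small 4x4 input and then traces how the spatial size shrinks across the five pooling layers, assuming a 224x224 input as used later in app.py. The helper names are illustrative.

```python
import math

import numpy as np

def same_output_size(size, stride):
    # With padding='SAME', the output size is ceil(input / stride)
    return math.ceil(size / stride)

def max_pool_2x2(x):
    """x: array of shape (N, H, W, C) with even H and W."""
    n, h, w, c = x.shape
    # Group pixels into non-overlapping 2x2 windows, then take each window's max
    return x.reshape(n, h // 2, 2, w // 2, 2, c).max(axis=(2, 4))

x = np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1)
pooled = max_pool_2x2(x)
print(pooled[0, :, :, 0])

# Spatial size after each of the five pooling layers, starting from 224
size = 224
sizes = []
for _ in range(5):
    size = same_output_size(size, 2)
    sizes.append(size)
print(sizes)
```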

Fully connected layer

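def fc_layer(self, x, name):
    with tf.variable_scope(name):
        # Multiply all dimensions except the batch dimension
        shape = x.get_shape().as_list()
        dim = 1
        for i in shape[1:]:
            dim *= i
        # Flatten each example into a vector of length dim
        x = tf.reshape(x, [-1, dim])
        w = self.get_fc_weight(name)
        b = self.get_bias(name)

        # Linear transform only; the caller applies ReLU or softmax
        result = tf.nn.bias_add(tf.matmul(x, w), b)
        return result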
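
The flattening step in fc_layer is easy to check with NumPy. The sketch below assumes a pool5 output of shape 1 x 7 x 7 x 512, which is what the standard VGG16 layout gives for a 224x224 input, and uses a toy weight matrix with only 4 output units to keep it small.

```python
import numpy as np

# Assumed pool5 output shape for a 224x224 input: 7 x 7 spatial, 512 channels
pool5 = np.zeros((1, 7, 7, 512), dtype=np.float32)

# Same loop as in fc_layer: multiply every dimension except the batch dimension
dim = 1
for i in pool5.shape[1:]:
    dim *= i
flat = pool5.reshape(-1, dim)

# Toy weights with 4 output units instead of the full fc6 width
w = np.zeros((dim, 4), dtype=np.float32)
b = np.ones(4, dtype=np.float32)
out = flat @ w + b
print(dim, flat.shape, out.shape)
```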

2. utils.py: processing images

Convert the image into a tensor of shape $1 \times 224 \times 224 \times 3$.
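
The article does not show utils.py, so the following is only a sketch of one way to reach that shape: center-crop to a square, resize with nearest-neighbor sampling, and add a batch dimension. The function name prepare_image and the choice of nearest-neighbor resizing are assumptions; the real utils.py may well use an image library instead.

```python
import numpy as np

def prepare_image(img, size=224):
    """img: (H, W, 3) array with values in [0, 1]. Returns (1, size, size, 3)."""
    h, w, _ = img.shape
    short = min(h, w)
    # Center-crop to a square with side equal to the shorter edge
    top, left = (h - short) // 2, (w - short) // 2
    crop = img[top:top + short, left:left + short]
    # Nearest-neighbor resize: pick a source row/column index for each output pixel
    idx = np.arange(size) * short // size
    resized = crop[idx][:, idx]
    # Add the leading batch dimension
    return resized[np.newaxis].astype(np.float32)

img = np.random.rand(300, 400, 3)  # stand-in for a decoded photo
img_ready = prepare_image(img)
print(img_ready.shape)
```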

3. nclasses.py: the label dictionary

The format is as follows:

 0: 'tench\n Tinca tinca',
 1: 'goldfish\n Carassius auratus',
 2: 'great white shark\n white shark\n man-eater\n man-eating shark\n Carcharodon carcharias',
 3: 'tiger shark\n Galeocerdo cuvieri',
 4: 'hammerhead\n hammerhead shark',
 5: 'electric ray\n crampfish\n numbfish\n torpedo',
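
Each value packs several names for the same class, separated by newlines. A small sketch of how such a label could be split for display follows; the helper names primary_name and synonyms are hypothetical and not part of nclasses.py.

```python
labels = {
    0: 'tench\n Tinca tinca',
    2: 'great white shark\n white shark\n man-eater\n man-eating shark\n Carcharodon carcharias',
}

def primary_name(label):
    # The first entry before any newline is the most common name
    return label.split('\n')[0].strip()

def synonyms(label):
    # All names for the class, with surrounding whitespace removed
    return [part.strip() for part in label.split('\n')]

print(primary_name(labels[2]))
print(synonyms(labels[0]))
```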

4. app.py: main application

The recognition code is as follows:

with tf.Session() as sess:
    images = tf.placeholder(tf.float32, [1, 224, 224, 3])
    # Instantiate vgg through its initializer, which reads the model parameters stored in the npy file
    vgg = vgg16.Vgg16()
    vgg.forward(images)  # reproduce the network structure
    # Compute the probability distribution over the 1000 classes
    probability = sess.run(vgg.prob, feed_dict={images: img_ready})
    # Store the indices of the five highest probabilities in top5
    top5 = np.argsort(probability[0])[-1:-6:-1]
    print("top5:", top5)
    values = []
    bar_label = []  # names of the five classes, looked up in the labels dictionary
    for n, i in enumerate(top5):
        print("n:", n)
        print("i:", i)
        values.append(probability[0][i])
        bar_label.append(labels[i])
        print(i, ":", labels[i], "----", utils.percent(probability[0][i]))
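
The slice [-1:-6:-1] applied to the argsort result walks backwards from the last index, which yields the five largest probabilities in descending order. The sketch below shows this on a made-up seven-class distribution; the percent function is a hypothetical stand-in for utils.percent, whose implementation is not shown in the article.

```python
import numpy as np

def percent(value):
    # Hypothetical stand-in for utils.percent: format a probability as a percentage
    return '%.2f%%' % (value * 100)

# Made-up probabilities for seven classes, shaped like the network output (1, num_classes)
probability = np.array([[0.05, 0.40, 0.02, 0.25, 0.10, 0.15, 0.03]])

# argsort is ascending, so read the last five entries in reverse
top5 = np.argsort(probability[0])[-1:-6:-1]
print(top5)
for i in top5:
    print(i, percent(probability[0][i]))
```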

Follow the WeChat official account 机器学习Zero and reply with 模型 to download the source code, the test images, and the VGG16 model.
