Inception v1: Paper and Source Code

Going deeper with convolutions
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed,
Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich.
http://arxiv.org/pdf/1409.4842v1.pdf.

Paper Notes

Network-in-network

R-CNN

R-CNN is currently the leading approach to object detection. It decomposes the overall detection problem into two subproblems. First, low-level cues such as color and superpixel consistency are used to generate potential object proposals. Then a CNN classifier identifies the object category at each proposed location.
These two stages combine accurate, low-level (color/pixel) segmentation of the image into regions with the power of a high-level classifier. The paper adopts a similar pipeline, but strengthens both stages, for example with multi-box prediction and ensembles for classifying the proposals.

The current leading approach for object detection is the Regions with Convolutional Neural Networks (R-CNN) proposed by Girshick et al. [6]. R-CNN decomposes the overall detection problem into two subproblems: to first utilize low-level cues such as color and superpixel consistency for potential object proposals in a category-agnostic fashion, and to then use CNN classifiers to identify object categories at those locations. Such a two stage approach leverages the accuracy of bounding box segmentation with low-level cues, as well as the highly powerful classification power of state-of-the-art CNNs. We adopted a similar pipeline in our detection submissions, but have explored enhancements in both stages, such as multi-box [5] prediction for higher object bounding box recall, and ensemble approaches for better categorization of bounding box proposals.

Motivation and High Level Considerations

Although the most straightforward way to improve network performance is to increase its depth and width, both have drawbacks: more parameters make the network prone to overfitting, and the larger network consumes more computational resources.

For example, in a deep vision network, if two convolutional layers are chained, any uniform increase in the number of their filters results in a quadratic increase of computation.
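
A rough back-of-the-envelope sketch of this quadratic blow-up (the sizes below are hypothetical, not taken from the paper): for two chained 3x3 convolutions that each have C filters, the second layer sees C input channels and produces C output channels, so its cost grows with C squared.

# Multiply-adds of the second of two chained 3x3 conv layers, each with C
# filters, applied to an H x W feature map.
def second_layer_madds(h, w, c, k=3):
    return h * w * k * k * c * c

print(second_layer_madds(28, 28, 64))    # 28,901,376
print(second_layer_madds(28, 28, 128))   # 115,605,504 -- 2x filters, ~4x compute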

A fundamental way of addressing both issues is to move from fully connected to sparsely connected architectures, even inside the convolutions.
The main idea is to wire together only those units whose outputs are highly correlated.

if the probability distribution of the data-set is representable by a large, very sparse deep neural network, then the optimal network topology can be constructed layer by layer by analyzing the correlation statistics of the activations of the last layer and clustering neurons with highly correlated outputs.

Although the strict mathematical proof requires very strong conditions, the fact that this statement resonates with the well known Hebbian principle – neurons that fire together, wire together – suggests that the underlying idea is applicable even under less strict conditions, in practice.

However, today's hardware is inefficient at sparse numerical computation; sparse structures are poorly suited to parallel computing.

The vast literature on sparse matrix computations (e.g. [3]) suggests that clustering sparse matrices into relatively dense submatrices tends to give state of the art practical performance for sparse matrix multiplication.

Architectural Details

The Inception architecture is based on the question of how well an optimal local sparse structure in a convolutional network can be approximated by readily available dense components.

The main idea of the Inception architecture is based on finding out how an optimal local sparse structure in a convolutional vision network can be approximated and covered by readily available dense components.

Assuming translation invariance means the network will be built from convolutional building blocks.
Note that assuming translation invariance means that our network will be built from convolutional building blocks.

What we need is to find this optimal local construction and then reuse it spatially.
All we need is to find the optimal local construction and to repeat it spatially.

Arora et al. [2] suggests a layer-by-layer construction in which one should analyze the correlation statistics of the last layer and cluster them into groups of units with high correlation.

We assume that each unit of an earlier layer corresponds to some region of the input image.
We assume that each unit from the earlier layer corresponds to some region of the input image and these units are grouped into filter banks.
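
A conceptual sketch of the resulting Inception module (written here with tf.keras for brevity, not taken from the repository; the branch widths are illustrative): the clusters of correlated units are covered by dense 1x1, 3x3 and 5x5 convolution branches plus a pooling branch, all applied to the same input and concatenated along the channel dimension. In the actual GoogLeNet modules, 1x1 convolutions are additionally inserted before the expensive 3x3 and 5x5 branches to reduce dimensionality, as the source code in the next section shows.

import tensorflow as tf  # conceptual sketch only, not the repo code

def naive_inception(x):
  # Four parallel dense branches over the same input, concatenated on channels.
  b0 = tf.keras.layers.Conv2D(64, 1, padding='same', activation='relu')(x)
  b1 = tf.keras.layers.Conv2D(128, 3, padding='same', activation='relu')(x)
  b2 = tf.keras.layers.Conv2D(32, 5, padding='same', activation='relu')(x)
  b3 = tf.keras.layers.MaxPool2D(3, strides=1, padding='same')(x)
  return tf.keras.layers.Concatenate(axis=-1)([b0, b1, b2, b3])

inputs = tf.keras.Input(shape=(28, 28, 192))
outputs = naive_inception(inputs)  # (28, 28, 64 + 128 + 32 + 192) = (28, 28, 416)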

Source Code Analysis

Functions

variable_scope

def variable_scope(name_or_scope,
                   default_name=None,
                   values=None,
                   initializer=None,
                   regularizer=None,
                   caching_device=None,
                   partitioner=None,
                   custom_getter=None,
                   reuse=None,
                   dtype=None,
                   use_resource=None,
                   constraint=None):

Args:
name_or_scope: string or VariableScope: the scope to open.
default_name: The default name to use if the name_or_scope argument is
None, this name will be uniquified. If name_or_scope is provided it
won’t be used and therefore it is not required and can be None.
values: The list of Tensor arguments that are passed to the op function.
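
A minimal usage sketch (assuming the public TF 1.x aliases tf.variable_scope / tf.get_variable; the names below are illustrative): variables created under the same scope name are shared when the scope is reopened with reuse=True, which is how the slim layers below avoid creating duplicate weights.

import tensorflow as tf  # TF 1.x style API

def linear(x):
  # get_variable creates 'w' on the first call and reuses it when the
  # enclosing variable_scope is opened with reuse=True.
  w = tf.get_variable('w', shape=[3, 2],
                      initializer=tf.truncated_normal_initializer(stddev=0.01))
  return tf.matmul(x, w)

x = tf.placeholder(tf.float32, [None, 3])
with tf.variable_scope('model'):
  y1 = linear(x)                      # creates variable 'model/w'
with tf.variable_scope('model', reuse=True):
  y2 = linear(x)                      # reuses 'model/w'; no new variable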

truncated_normal_initializer
This is simply the class TruncatedNormal(Initializer):
Initializer that generates a truncated normal distribution.
An initializer similar to random_normal_initializer:
These values are similar to values from a random_normal_initializer
except that values more than two standard deviations from the mean
are discarded and re-drawn. This is the recommended initializer for
neural network weights and filters.
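
In the code below, trunc_normal(0.01) appears to be just a shorthand for this initializer with the given standard deviation; a minimal sketch (TF 1.x API):

import tensorflow as tf  # TF 1.x style API

# Weights drawn from N(0, 0.01^2); samples more than two standard deviations
# from the mean are discarded and re-drawn.
init = tf.truncated_normal_initializer(mean=0.0, stddev=0.01)
w = tf.get_variable('weights', shape=[7, 7, 3, 64], initializer=init)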

Source Code

The analysis is added as comments in the code below.

def inception_v1_base(inputs, final_endpoint='Mixed_5c', scope='InceptionV1'):
  """Defines the Inception V1 base architecture.

  This architecture is defined in:
    Going deeper with convolutions
    Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed,
    Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich.
    http://arxiv.org/pdf/1409.4842v1.pdf.

  Args:
    inputs: a tensor of size [batch_size, height, width, channels].
    final_endpoint: specifies the endpoint to construct the network up to. It
      can be one of ['Conv2d_1a_7x7', 'MaxPool_2a_3x3', 'Conv2d_2b_1x1',
      'Conv2d_2c_3x3', 'MaxPool_3a_3x3', 'Mixed_3b', 'Mixed_3c',
      'MaxPool_4a_3x3', 'Mixed_4b', 'Mixed_4c', 'Mixed_4d', 'Mixed_4e',
      'Mixed_4f', 'MaxPool_5a_2x2', 'Mixed_5b', 'Mixed_5c']
    scope: Optional variable_scope.

  Returns:
    A dictionary from components of the network to the corresponding activation.

  Raises:
    ValueError: if final_endpoint is not set to one of the predefined values.
  """
  end_points = {}
  with variable_scope.variable_scope(scope, 'InceptionV1', [inputs]):
    with arg_scope(
        [layers.conv2d, layers_lib.fully_connected],
        weights_initializer=trunc_normal(0.01)):
      with arg_scope(
          [layers.conv2d, layers_lib.max_pool2d], stride=1, padding='SAME'):
        end_point = 'Conv2d_1a_7x7'

        # inputs -> conv2d with 64 [7, 7] kernels, stride 2
        # e.g. input [224, 224, 3] -> output [112, 112, 64]
        net = layers.conv2d(inputs, 64, [7, 7], stride=2, scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point:
          return net, end_points

        # -> maxpool([3, 3],stride=2)
        # [112, 112, 64] -> [56, 56, 64]
        end_point = 'MaxPool_2a_3x3'
        net = layers_lib.max_pool2d(net, [3, 3], stride=2, scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point:
          return net, end_points

        # -> conv2d(64, [1,1], stride=1)
        # [56, 56, 64] -> [56, 56, 64]
        end_point = 'Conv2d_2b_1x1'
        net = layers.conv2d(net, 64, [1, 1], scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point:
          return net, end_points

        # -> conv2d(192, [3,3], stride=1)
        # -> [56, 56, 192]
        end_point = 'Conv2d_2c_3x3'
        net = layers.conv2d(net, 192, [3, 3], scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point:
          return net, end_points

        # -> maxpool([3,3], stride=2)
        # [56, 56, 192] -> [28, 28, 192]
        end_point = 'MaxPool_3a_3x3'
        net = layers_lib.max_pool2d(net, [3, 3], stride=2, scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point:
          return net, end_points

        end_point = 'Mixed_3b'
        with variable_scope.variable_scope(end_point):
          with variable_scope.variable_scope('Branch_0'):
            # branch0 [28, 28, 64]
            branch_0 = layers.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
          with variable_scope.variable_scope('Branch_1'):
            # branch1 [28, 28, 96] -> [28, 28, 128]
            branch_1 = layers.conv2d(net, 96, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = layers.conv2d(
                branch_1, 128, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_2'):
            # branch2: 16 (1x1 reduce) -> 32 (3x3)
            branch_2 = layers.conv2d(net, 16, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = layers.conv2d(
                branch_2, 32, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_3'):
            # branch3: 3x3 max pool -> 32 (1x1)
            branch_3 = layers_lib.max_pool2d(
                net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = layers.conv2d(
                branch_3, 32, [1, 1], scope='Conv2d_0b_1x1')
          # net -> 64 + 128 + 32 + 32 = 256 channels
          net = array_ops.concat([branch_0, branch_1, branch_2, branch_3], 3)
        end_points[end_point] = net
        if final_endpoint == end_point:
          return net, end_points

        end_point = 'Mixed_3c'
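        # Mixed_3c: [28, 28, 256] -> [28, 28, 480] (128 + 192 + 96 + 64)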
        with variable_scope.variable_scope(end_point):
          with variable_scope.variable_scope('Branch_0'):
            branch_0 = layers.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
          with variable_scope.variable_scope('Branch_1'):
            branch_1 = layers.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = layers.conv2d(
                branch_1, 192, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_2'):
            branch_2 = layers.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = layers.conv2d(
                branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_3'):
            branch_3 = layers_lib.max_pool2d(
                net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = layers.conv2d(
                branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = array_ops.concat([branch_0, branch_1, branch_2, branch_3], 3)
        end_points[end_point] = net
        if final_endpoint == end_point:
          return net, end_points

        end_point = 'MaxPool_4a_3x3'
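        # MaxPool_4a: [28, 28, 480] -> [14, 14, 480]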
        net = layers_lib.max_pool2d(net, [3, 3], stride=2, scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point:
          return net, end_points

        end_point = 'Mixed_4b'
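        # Mixed_4b: [14, 14, 480] -> [14, 14, 512] (192 + 208 + 48 + 64)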
        with variable_scope.variable_scope(end_point):
          with variable_scope.variable_scope('Branch_0'):
            branch_0 = layers.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
          with variable_scope.variable_scope('Branch_1'):
            branch_1 = layers.conv2d(net, 96, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = layers.conv2d(
                branch_1, 208, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_2'):
            branch_2 = layers.conv2d(net, 16, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = layers.conv2d(
                branch_2, 48, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_3'):
            branch_3 = layers_lib.max_pool2d(
                net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = layers.conv2d(
                branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = array_ops.concat([branch_0, branch_1, branch_2, branch_3], 3)
        end_points[end_point] = net
        if final_endpoint == end_point:
          return net, end_points

        end_point = 'Mixed_4c'
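        # Mixed_4c: [14, 14, 512] -> [14, 14, 512] (160 + 224 + 64 + 64)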
        with variable_scope.variable_scope(end_point):
          with variable_scope.variable_scope('Branch_0'):
            branch_0 = layers.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
          with variable_scope.variable_scope('Branch_1'):
            branch_1 = layers.conv2d(net, 112, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = layers.conv2d(
                branch_1, 224, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_2'):
            branch_2 = layers.conv2d(net, 24, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = layers.conv2d(
                branch_2, 64, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_3'):
            branch_3 = layers_lib.max_pool2d(
                net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = layers.conv2d(
                branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = array_ops.concat([branch_0, branch_1, branch_2, branch_3], 3)
        end_points[end_point] = net
        if final_endpoint == end_point:
          return net, end_points

        end_point = 'Mixed_4d'
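        # Mixed_4d: [14, 14, 512] -> [14, 14, 512] (128 + 256 + 64 + 64)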
        with variable_scope.variable_scope(end_point):
          with variable_scope.variable_scope('Branch_0'):
            branch_0 = layers.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
          with variable_scope.variable_scope('Branch_1'):
            branch_1 = layers.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = layers.conv2d(
                branch_1, 256, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_2'):
            branch_2 = layers.conv2d(net, 24, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = layers.conv2d(
                branch_2, 64, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_3'):
            branch_3 = layers_lib.max_pool2d(
                net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = layers.conv2d(
                branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = array_ops.concat([branch_0, branch_1, branch_2, branch_3], 3)
        end_points[end_point] = net
        if final_endpoint == end_point:
          return net, end_points

        end_point = 'Mixed_4e'
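        # Mixed_4e: [14, 14, 512] -> [14, 14, 528] (112 + 288 + 64 + 64)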
        with variable_scope.variable_scope(end_point):
          with variable_scope.variable_scope('Branch_0'):
            branch_0 = layers.conv2d(net, 112, [1, 1], scope='Conv2d_0a_1x1')
          with variable_scope.variable_scope('Branch_1'):
            branch_1 = layers.conv2d(net, 144, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = layers.conv2d(
                branch_1, 288, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_2'):
            branch_2 = layers.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = layers.conv2d(
                branch_2, 64, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_3'):
            branch_3 = layers_lib.max_pool2d(
                net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = layers.conv2d(
                branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
          net = array_ops.concat([branch_0, branch_1, branch_2, branch_3], 3)
        end_points[end_point] = net
        if final_endpoint == end_point:
          return net, end_points

        end_point = 'Mixed_4f'
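        # Mixed_4f: [14, 14, 528] -> [14, 14, 832] (256 + 320 + 128 + 128)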
        with variable_scope.variable_scope(end_point):
          with variable_scope.variable_scope('Branch_0'):
            branch_0 = layers.conv2d(net, 256, [1, 1], scope='Conv2d_0a_1x1')
          with variable_scope.variable_scope('Branch_1'):
            branch_1 = layers.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = layers.conv2d(
                branch_1, 320, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_2'):
            branch_2 = layers.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = layers.conv2d(
                branch_2, 128, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_3'):
            branch_3 = layers_lib.max_pool2d(
                net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = layers.conv2d(
                branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')
          net = array_ops.concat([branch_0, branch_1, branch_2, branch_3], 3)
        end_points[end_point] = net
        if final_endpoint == end_point:
          return net, end_points

        end_point = 'MaxPool_5a_2x2'
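        # MaxPool_5a: [14, 14, 832] -> [7, 7, 832]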
        net = layers_lib.max_pool2d(net, [2, 2], stride=2, scope=end_point)
        end_points[end_point] = net
        if final_endpoint == end_point:
          return net, end_points

        end_point = 'Mixed_5b'
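        # Mixed_5b: [7, 7, 832] -> [7, 7, 832] (256 + 320 + 128 + 128)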
        with variable_scope.variable_scope(end_point):
          with variable_scope.variable_scope('Branch_0'):
            branch_0 = layers.conv2d(net, 256, [1, 1], scope='Conv2d_0a_1x1')
          with variable_scope.variable_scope('Branch_1'):
            branch_1 = layers.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = layers.conv2d(
                branch_1, 320, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_2'):
            branch_2 = layers.conv2d(net, 32, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = layers.conv2d(
                branch_2, 128, [3, 3], scope='Conv2d_0a_3x3')
          with variable_scope.variable_scope('Branch_3'):
            branch_3 = layers_lib.max_pool2d(
                net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = layers.conv2d(
                branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')
          net = array_ops.concat([branch_0, branch_1, branch_2, branch_3], 3)
        end_points[end_point] = net
        if final_endpoint == end_point:
          return net, end_points

        end_point = 'Mixed_5c'
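        # Mixed_5c: [7, 7, 832] -> [7, 7, 1024] (384 + 384 + 128 + 128)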
        with variable_scope.variable_scope(end_point):
          with variable_scope.variable_scope('Branch_0'):
            branch_0 = layers.conv2d(net, 384, [1, 1], scope='Conv2d_0a_1x1')
          with variable_scope.variable_scope('Branch_1'):
            branch_1 = layers.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
            branch_1 = layers.conv2d(
                branch_1, 384, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_2'):
            branch_2 = layers.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
            branch_2 = layers.conv2d(
                branch_2, 128, [3, 3], scope='Conv2d_0b_3x3')
          with variable_scope.variable_scope('Branch_3'):
            branch_3 = layers_lib.max_pool2d(
                net, [3, 3], scope='MaxPool_0a_3x3')
            branch_3 = layers.conv2d(
                branch_3, 128, [1, 1], scope='Conv2d_0b_1x1')
          net = array_ops.concat([branch_0, branch_1, branch_2, branch_3], 3)
        end_points[end_point] = net
        if final_endpoint == end_point:
          return net, end_points
    raise ValueError('Unknown final endpoint %s' % final_endpoint)