Inception v4 & Inception-ResNet: https://arxiv.org/abs/1602.07261
Keras code (unofficial-keras): https://github.com/titu1994/Inception-v4
Inception v4 & Inception-ResNet v1, v2
0. Preface
Inspired mainly by ResNet, the authors introduced residual connections on top of Inception v3, proposing Inception-ResNet-v1 and Inception-ResNet-v2, and also revised the Inception modules themselves to propose Inception v4. Experiments showed that Inception v4 can reach results similar to Inception-ResNet-v2 even without residual connections.
Google considered their earlier architectural choices relatively conservative: changes were confined to individual network components so that the rest of the model stayed stable. This time they dropped that design principle and adopted uniform Inception modules for every grid size. In the network diagrams below, every convolution without a "V" suffix is same-padded, i.e. the output grid size equals the input grid size (as in VGG); convolutions marked "V" are valid-padded, so the output grid shrinks step by step.
Inception-ResNet-v1 performs on par with Inception v3, and Inception-ResNet-v2 on par with Inception v4; in practice, however, Inception v4 is noticeably slower than Inception-ResNet-v2, perhaps because it has more layers. Also, in the Inception-ResNet variants, batch normalization is applied only on top of the traditional layers, not on top of the summations.
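The "V" naming convention above can be checked with a quick tf.keras sketch (a minimal illustration, not part of the paper's code):

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.ones((1, 35, 35, 192))

# A convolution without the "V" suffix is same-padded: output grid == input grid
same_conv = layers.Conv2D(192, kernel_size=(3, 3), padding='same')
# A "V" convolution is valid-padded: the grid shrinks (35x35 -> 33x33 for a 3x3 kernel)
valid_conv = layers.Conv2D(192, kernel_size=(3, 3), padding='valid')

print(same_conv(x).shape)   # (1, 35, 35, 192)
print(valid_conv(x).shape)  # (1, 33, 33, 192)
```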
1. Inception v4
Figure 9 shows the overall Inception v4 architecture; its modules, in order, correspond to Figures 3, 4, 7, 5, 8, and 6.
2. Inception-ResNet v1 & v2
2.1 Inception-ResNet v1 & v2 network structure
2.2 Inception-ResNet v1 modules
Following the order in Figure 15, the modules correspond to Figures 14, 10, 7, 11, 12, and 13.
2.3 Inception-ResNet v2 modules
Following the order in Figure 15, the modules correspond to Figures 3, 16, 7, 17, 18, and 19.
2.4 Other network details
- Each Inception-ResNet module contains a 1×1 convolution with linear activation that expands the channel count, compensating for the dimensionality reduction caused by the Inception module.
- In the Inception-ResNet variants, batch normalization is used only on top of the traditional layers, not on top of the summations.
- Inception v4, Inception-ResNet v1, and v2 share the same Reduction-A structure and differ only in its parameters, as shown in Table 1.
- Inception-ResNet v1 and v2 share the same overall structure and differ only in the number of filters.
- When the number of filters exceeds 1000, the residual variants become unstable and "die" early in training: after tens of thousands of iterations, the layer before the average pooling starts producing only zeros. The authors' fix is to scale down the residuals before merging them back in, which stabilizes training; the scaling factor is typically chosen from [0.1, 0.3], as in Figure 20.
- In ResNet v1, Kaiming He et al. also observed instability on CIFAR-10: training a very deep network there requires warming up with a learning rate of 0.01 before switching to 0.1. The authors here argue that with a very large number of filters, even an extremely low learning rate (0.00001) cannot make the model converge, and switching to a large learning rate afterwards easily destroys what was learned. Simply scaling down the residual output, by contrast, stabilizes learning, and this scaling costs almost nothing in final accuracy.
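The residual-scaling fix described above (Figure 20) amounts to multiplying the residual branch by a small constant before the summation. A minimal sketch, with `scaled_residual_add` as a hypothetical helper name:

```python
import tensorflow as tf

def scaled_residual_add(shortcut, residual, scale_rate=0.1):
    # Scale the residual branch before summing it with the shortcut,
    # as in Figure 20; scale_rate is typically picked from [0.1, 0.3].
    return shortcut + scale_rate * residual

shortcut = tf.ones((1, 35, 35, 384))
residual = tf.ones((1, 35, 35, 384))
out = scaled_residual_add(shortcut, residual, scale_rate=0.1)
print(out.shape)  # (1, 35, 35, 384)
```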
2.5 Experimental conclusions
- Comparing Inception-ResNet-v1 with Inception v3: Inception-ResNet-v1 trains faster, but its final result is slightly worse than Inception v3.
- Comparing Inception-ResNet-v2 with Inception v4: Inception-ResNet-v2 trains faster and its result is also slightly better than Inception v4. So Inception-ResNet-v2 comes out on top.
3. Inception-ResNet v2 code
Implemented with TensorFlow 2.0 and tf.keras.
"""
Implementation of Inception-Residual Network v1 [Inception Network v4 Paper](http://arxiv.org/pdf/1602.07261v1.pdf) in Keras.
Some additional details:
[1] Each of the A, B and C blocks have a 'scale_residual' parameter.
The scale residual parameter is according to the paper. It is however turned OFF by default.
Simply setting 'scale=True' in the create_inception_resnet_v2() method will add scaling.
[2] There were minor inconsistencies with filter size in both B and C blocks.
In the B blocks: 'ir_conv' nb of filters is given as 1154, however input size is 1152.
This causes inconsistencies in the merge-add mode, therefore the 'ir_conv' filter size
is reduced to 1152 to match input size.
In the C blocks: 'ir_conv' nb of filter is given as 2048, however input size is 2144.
This causes inconsistencies in the merge-add mode, therefore the 'ir_conv' filter size
is increased to 2144 to match input size.
Currently trying to find a proper solution with original nb of filters.
[3] In the stem function, the last Convolutional2D layer has 384 filters instead of the original 256.
This is to correctly match the nb of filters in 'ir_conv' of the next A blocks.
"""
import os

import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras import Sequential, layers

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'


class BasicCon2D(keras.Model):
    """Basic convolution block: conv + bn + relu."""
    def __init__(self, filter_nums, **kwargs):
        super(BasicCon2D, self).__init__()
        self.conv = layers.Conv2D(filter_nums, use_bias=False, **kwargs)
        self.bn = layers.BatchNormalization()
        self.relu = layers.Activation('relu')

    def call(self, inputs, training=None):
        out = self.conv(inputs)
        out = self.bn(out, training=training)
        out = self.relu(out)
        return out
class InceptionStem(keras.Model):
    """Stem network (input part) of Inception-ResNet v2."""
    def __init__(self):
        super(InceptionStem, self).__init__()
        self.conv = Sequential([
            BasicCon2D(32, kernel_size=(3, 3), strides=2),
            BasicCon2D(32, kernel_size=(3, 3)),
            BasicCon2D(64, kernel_size=(3, 3), padding='same')
        ])
        self.branch_pool1a = layers.MaxPool2D((3, 3), strides=2)
        self.branch_pool1b = BasicCon2D(96, kernel_size=(3, 3), strides=2)
        self.branch_conva = Sequential([
            BasicCon2D(64, kernel_size=(1, 1), padding='same'),
            BasicCon2D(96, kernel_size=(3, 3))
        ])
        self.branch_convb = Sequential([
            BasicCon2D(64, kernel_size=(1, 1), padding='same'),
            BasicCon2D(64, kernel_size=(7, 1), padding='same'),
            BasicCon2D(64, kernel_size=(1, 7), padding='same'),
            BasicCon2D(96, kernel_size=(3, 3))
        ])
        self.branch_pool2a = layers.MaxPool2D((3, 3), strides=2)
        self.branch_pool2b = BasicCon2D(192, kernel_size=(3, 3), strides=2)

    def call(self, inputs, training=None):
        out = self.conv(inputs)
        out = [
            self.branch_pool1a(out),
            self.branch_pool1b(out)
        ]
        out = layers.concatenate(out, axis=-1)
        out = [
            self.branch_conva(out),
            self.branch_convb(out)
        ]
        out = layers.concatenate(out, axis=-1)
        out = [
            self.branch_pool2a(out),
            self.branch_pool2b(out)
        ]
        out = layers.concatenate(out, axis=-1)
        return out
class InceptionResnetV2A(keras.Model):
    """Schema for the 35 × 35 grid modules of the Inception-ResNet-v2
    network (the Inception-ResNet-A block)."""
    def __init__(self, scale, scale_rate):
        super(InceptionResnetV2A, self).__init__()
        self.branch1x1 = BasicCon2D(32, kernel_size=(1, 1), padding='same')
        self.branch3x3 = Sequential([
            BasicCon2D(32, kernel_size=(1, 1), padding='same'),
            BasicCon2D(32, kernel_size=(3, 3), padding='same')
        ])
        self.branch3x3_stack = Sequential([
            BasicCon2D(32, kernel_size=(1, 1), padding='same'),
            BasicCon2D(48, kernel_size=(3, 3), padding='same'),
            BasicCon2D(64, kernel_size=(3, 3), padding='same')
        ])
        # Linear 1x1 convolution that restores the channel count before the sum
        self.reduction1x1 = layers.Conv2D(384, kernel_size=(1, 1), padding='same')
        if scale:
            self.scale_residual = layers.Lambda(lambda x: x * scale_rate)
        else:
            self.scale_residual = layers.Lambda(lambda x: x)

    def call(self, inputs, training=None):
        out = [
            self.branch1x1(inputs),
            self.branch3x3(inputs),
            self.branch3x3_stack(inputs)
        ]
        out = layers.concatenate(out, axis=-1)
        out = self.reduction1x1(out)
        out = self.scale_residual(out)
        out = layers.add([inputs, out])
        return out
class InceptionResnetV2B(keras.Model):
    """Schema for the 17 × 17 grid (Inception-ResNet-B) module of the
    Inception-ResNet-v2 network."""
    def __init__(self, scale, scale_rate):
        super(InceptionResnetV2B, self).__init__()
        self.branch1x1 = BasicCon2D(192, kernel_size=(1, 1), padding='same')
        self.branch7x7 = Sequential([
            BasicCon2D(128, kernel_size=(1, 1), padding='same'),
            BasicCon2D(160, kernel_size=(1, 7), padding='same'),
            BasicCon2D(192, kernel_size=(7, 1), padding='same')
        ])
        self.reduction1x1 = layers.Conv2D(1152, kernel_size=(1, 1), padding='same')
        if scale:
            self.scale_residual = layers.Lambda(lambda x: x * scale_rate)
        else:
            self.scale_residual = layers.Lambda(lambda x: x)

    def call(self, inputs, training=None):
        out = [
            self.branch1x1(inputs),
            self.branch7x7(inputs)
        ]
        out = layers.concatenate(out, axis=-1)
        out = self.reduction1x1(out)
        out = self.scale_residual(out)
        out = layers.add([inputs, out])
        return out
class InceptionResnetV2C(keras.Model):
    """Schema for the 8 × 8 grid (Inception-ResNet-C) module."""
    def __init__(self, scale, scale_rate):
        super(InceptionResnetV2C, self).__init__()
        self.branch1x1 = BasicCon2D(192, kernel_size=(1, 1), padding='same')
        self.branch3x3 = Sequential([
            BasicCon2D(192, kernel_size=(1, 1), padding='same'),
            BasicCon2D(224, kernel_size=(1, 3), padding='same'),
            BasicCon2D(256, kernel_size=(3, 1), padding='same')
        ])
        self.reduction1x1 = layers.Conv2D(2144, kernel_size=(1, 1), padding='same')
        if scale:
            self.scale_residual = layers.Lambda(lambda x: x * scale_rate)
        else:
            self.scale_residual = layers.Lambda(lambda x: x)

    def call(self, inputs, training=None):
        out = [
            self.branch1x1(inputs),
            self.branch3x3(inputs)
        ]
        out = layers.concatenate(out, axis=-1)
        out = self.reduction1x1(out)
        out = self.scale_residual(out)
        out = layers.add([inputs, out])
        return out
class ReductionA(keras.Model):
    """Schema for the 35 × 35 to 17 × 17 reduction module.
    The k, l, m, n numbers represent filter bank sizes,
    here 256, 256, 384, 384 for Inception-ResNet-v2 (Table 1)."""
    def __init__(self):
        super(ReductionA, self).__init__()
        self.branch_pool = layers.MaxPool2D((3, 3), strides=2)
        self.branch3x3 = BasicCon2D(384, kernel_size=(3, 3), strides=2)
        self.branch3x3_stack = Sequential([
            BasicCon2D(256, kernel_size=(1, 1), padding='same'),
            BasicCon2D(256, kernel_size=(3, 3), padding='same'),
            BasicCon2D(384, kernel_size=(3, 3), strides=2)
        ])

    def call(self, inputs, training=None):
        out = [
            self.branch_pool(inputs),
            self.branch3x3(inputs),
            self.branch3x3_stack(inputs)
        ]
        out = layers.concatenate(out, axis=-1)
        return out
class ReductionResnetV2B(keras.Model):
    """Schema for the 17 × 17 to 8 × 8 grid-reduction module."""
    def __init__(self):
        super(ReductionResnetV2B, self).__init__()
        self.branch_pool = layers.MaxPool2D((3, 3), strides=2)
        self.branch3x3a = Sequential([
            BasicCon2D(256, kernel_size=(1, 1), padding='same'),
            BasicCon2D(384, kernel_size=(3, 3), strides=2)
        ])
        self.branch3x3b = Sequential([
            BasicCon2D(256, kernel_size=(1, 1), padding='same'),
            BasicCon2D(288, kernel_size=(3, 3), strides=2)
        ])
        self.branch3x3_stack = Sequential([
            BasicCon2D(256, kernel_size=(1, 1), padding='same'),
            BasicCon2D(288, kernel_size=(3, 3), padding='same'),
            BasicCon2D(320, kernel_size=(3, 3), strides=2)
        ])

    def call(self, inputs, training=None):
        out = [
            self.branch_pool(inputs),
            self.branch3x3a(inputs),
            self.branch3x3b(inputs),
            self.branch3x3_stack(inputs)
        ]
        out = layers.concatenate(out, axis=-1)
        return out
class InceptionResnetV2(keras.Model):
    """The full Inception-ResNet-v2 network."""
    def __init__(self, num_inception_a, num_inception_b, num_inception_c,
                 scale=True, scale_rate=0.1, num_classes=10):
        super(InceptionResnetV2, self).__init__()
        self.stem = InceptionStem()
        self.inception_resnet_a = self._generate_inception_module(
            InceptionResnetV2A, num_inception_a, scale=scale, scale_rate=scale_rate)
        self.reduction_a = ReductionA()
        self.inception_resnet_b = self._generate_inception_module(
            InceptionResnetV2B, num_inception_b, scale=scale, scale_rate=scale_rate)
        self.reduction_b = ReductionResnetV2B()
        self.inception_resnet_c = self._generate_inception_module(
            InceptionResnetV2C, num_inception_c, scale=scale, scale_rate=scale_rate)
        self.avg_pool = layers.GlobalAveragePooling2D()
        self.drop_out = layers.Dropout(0.2)
        self.fc = layers.Dense(num_classes)

    def call(self, inputs, training=None):
        out = self.stem(inputs)
        out = self.inception_resnet_a(out)
        out = self.reduction_a(out)
        out = self.inception_resnet_b(out)
        out = self.reduction_b(out)
        out = self.inception_resnet_c(out)
        out = self.avg_pool(out)
        out = self.drop_out(out, training=training)
        out = self.fc(out)
        return out

    @staticmethod
    def _generate_inception_module(inception, inception_nums, scale, scale_rate):
        # Note: use a name other than `layers` to avoid shadowing the imported module
        module_layers = []
        for _ in range(inception_nums):
            module_layers.append(inception(scale=scale, scale_rate=scale_rate))
        return Sequential(module_layers)
if __name__ == '__main__':
    model = InceptionResnetV2(5, 10, 5)
    model.build(input_shape=(None, 299, 299, 3))
    model.summary()
    print(model.predict(tf.ones((10, 299, 299, 3))).shape)