An Introduction to TensorFlow's TF-Slim Module API

GitHub: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim
TensorFlow-Slim
TF-Slim is a lightweight library for defining, training and evaluating complex models in TensorFlow. Components of tf-slim can be freely mixed with native TensorFlow, as well as with other frameworks, such as tf.contrib.learn.
Usage
import tensorflow.contrib.slim as slim

Why TF-Slim?
TF-Slim is a library that makes building, training and evaluating neural networks simple:

Allows the user to define models much more compactly by eliminating boilerplate code. This is accomplished through the use of argument scoping and numerous high-level layers and variables. These tools increase readability and maintainability, reduce the likelihood of an error from copy-and-pasting hyperparameter values, and simplify hyperparameter tuning.
Makes developing models simple by providing commonly used regularizers.
Several widely used computer vision models (e.g., VGG, AlexNet) have been developed in slim, and are available to users. These can either be used as black boxes, or can be extended in various ways, e.g., by adding “multiple heads” to different internal layers.
Slim makes it easy to extend complex models, and to warm start training algorithms by using pieces of pre-existing model checkpoints.
What are the various components of TF-Slim?
TF-Slim is composed of several parts which were designed to exist independently. These include the following main pieces (explained in detail below).

arg_scope: provides a new scope named arg_scope that allows a user to define default arguments for specific operations within that scope.
data: contains TF-Slim's dataset definition, data providers, parallel_reader, and decoding utilities.
evaluation: contains routines for evaluating models.
layers: contains high level layers for building models using TensorFlow.
learning: contains routines for training models.
losses: contains commonly used loss functions.
metrics: contains popular evaluation metrics.
nets: contains popular network definitions such as VGG and AlexNet models.
queues: provides a context manager for easily and safely starting and closing QueueRunners.
regularizers: contains weight regularizers.
variables: provides convenience wrappers for variable creation and manipulation.
Defining Models
Models can be succinctly defined using TF-Slim by combining its variables, layers and scopes. Each of these elements is defined below.
Variables
Creating Variables in native TensorFlow requires either a predefined value or an initialization mechanism (e.g. randomly sampled from a Gaussian). Furthermore, if a variable needs to be created on a specific device, such as a GPU, the specification must be made explicit. To reduce the code required for variable creation, TF-Slim provides a set of thin wrapper functions in variables.py which allow callers to easily define variables.
For example, to create a weight variable, initialize it using a truncated normal distribution, regularize it with an l2_loss and place it on the CPU, one need only declare the following:

weights = slim.variable('weights',
                        shape=[10, 10, 3, 3],
                        initializer=tf.truncated_normal_initializer(stddev=0.1),
                        regularizer=slim.l2_regularizer(0.05),
                        device='/CPU:0')

Note that in native TensorFlow, there are two types of variables: regular variables and local (transient) variables. The vast majority of variables are regular variables: once created, they can be saved to disk using a saver. Local variables are those variables that only exist for the duration of a session and are not saved to disk.

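As a small illustrative sketch (not from the original text; the variable names below are made up), the distinction shows up in which collections a variable joins: a default tf.train.Saver() only writes the regular (global) variables to a checkpoint.

import tensorflow as tf

# Regular variable: added to tf.GraphKeys.GLOBAL_VARIABLES, so a default Saver saves it.
step_count = tf.Variable(0, name='step_count', trainable=False)

# Local (transient) variable: only in tf.GraphKeys.LOCAL_VARIABLES; it lives for the
# duration of the session and is not written to checkpoints by a default Saver.
batches_seen = tf.Variable(0, name='batches_seen', trainable=False,
                           collections=[tf.GraphKeys.LOCAL_VARIABLES])

saver = tf.train.Saver()  # saves step_count but not batches_seen
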
TF-Slim further differentiates variables by defining model variables, which are variables that represent parameters of a model. Model variables are trained or fine-tuned during learning and are loaded from a checkpoint during evaluation or inference. Examples include the variables created by a slim.fully_connected or slim.conv2d layer. Non-model variables are all other variables that are used during learning or evaluation but are not required for actually performing inference. For example, the global_step is a variable used during learning and evaluation but it is not actually part of the model. Similarly, moving average variables might mirror model variables, but the moving averages are not themselves model variables.
Both model variables and regular variables can be easily created and retrieved via TF-Slim:


# Model Variables  
weights = slim.model_variable('weights',  
                              shape=[10, 10, 3 , 3],  
                              initializer=tf.truncated_normal_initializer(stddev=0.1),  
                              regularizer=slim.l2_regularizer(0.05),  
                              device='/CPU:0')  
model_variables = slim.get_model_variables()  

# Regular variables  
my_var = slim.variable('my_var',  
                       shape=[20, 1],  
                       initializer=tf.zeros_initializer())  

regular_variables_and_model_variables = slim.get_variables()
How does this work? When you create a model variable via TF-Slim's layers or directly via the slim.model_variable function, TF-Slim adds the variable to the tf.GraphKeys.MODEL_VARIABLES collection. What if you have your own custom layers or variable creation routine but still want TF-Slim to manage or be aware of your model variables? TF-Slim provides a convenience function for adding the model variable to its collection:

my_model_variable = CreateViaCustomCode()  

# Letting TF-Slim know about the additional variable.  
slim.add_model_variable(my_model_variable)  

Layers
While the set of TensorFlow operations is quite extensive, developers of neural networks typically think of models in terms of higher level concepts like “layers”, “losses”, “metrics”, and “networks”. A layer, such as a Convolutional Layer, a Fully Connected Layer or a BatchNorm Layer, is more abstract than a single TensorFlow operation and typically involves several operations. Furthermore, a layer usually (but not always) has variables (tunable parameters) associated with it, unlike more primitive operations. For example, a Convolutional Layer in a neural network is composed of several low level operations:

Creating the weight and bias variables
Convolving the weights with the input from the previous layer
Adding the biases to the result of the convolution.
Applying an activation function.
Using only plain TensorFlow code, this can be rather laborious:


input = ...  
with tf.name_scope('conv1_1') as scope:  
  kernel = tf.Variable(tf.truncated_normal([3, 3, 64, 128], dtype=tf.float32,  
                                           stddev=1e-1), name='weights')  
  conv = tf.nn.conv2d(input, kernel, [1, 1, 1, 1], padding='SAME')  
  biases = tf.Variable(tf.constant(0.0, shape=[128], dtype=tf.float32),  
                       trainable=True, name='biases')  
  bias = tf.nn.bias_add(conv, biases)  
  conv1 = tf.nn.relu(bias, name=scope)  

To alleviate the need to duplicate this code repeatedly, TF-Slim provides a number of convenient operations defined at the more abstract level of neural network layers. For example, compare the code above to an invocation of the corresponding TF-Slim code:


input = ...  
net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')  

TF-Slim provides standard implementations for numerous components for building neural networks. These include:

Layer                      TF-Slim
BiasAdd                    slim.bias_add
BatchNorm                  slim.batch_norm
Conv2d                     slim.conv2d
Conv2dInPlane              slim.conv2d_in_plane
Conv2dTranspose (Deconv)   slim.conv2d_transpose
FullyConnected             slim.fully_connected
AvgPool2D                  slim.avg_pool2d
Dropout                    slim.dropout
Flatten                    slim.flatten
MaxPool2D                  slim.max_pool2d
OneHotEncoding             slim.one_hot_encoding
SeparableConv2d            slim.separable_conv2d
UnitNorm                   slim.unit_norm
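
As a quick sketch of how these fit together (the layer sizes and the `images` input below are our own assumptions, not part of the original text), each layer simply returns the tensor that feeds the next:

# images: a batch of inputs, e.g. shape [batch_size, 224, 224, 3]
net = slim.conv2d(images, 64, [3, 3], scope='conv1')
net = slim.max_pool2d(net, [2, 2], scope='pool1')
net = slim.flatten(net, scope='flatten')
net = slim.fully_connected(net, 256, scope='fc1')
net = slim.dropout(net, 0.5, scope='dropout1')
logits = slim.fully_connected(net, 10, activation_fn=None, scope='logits')
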
TF-Slim also provides two meta-operations called repeat and stack that allow users to repeatedly perform the same operation. For example, consider the following snippet from the VGG network whose layers perform several convolutions in a row between pooling layers:

net = ...  
net = slim.conv2d(net, 256, [3, 3], scope='conv3_1')  
net = slim.conv2d(net, 256, [3, 3], scope='conv3_2')  
net = slim.conv2d(net, 256, [3, 3], scope='conv3_3')  
net = slim.max_pool2d(net, [2, 2], scope='pool2')  

One way to reduce this code duplication would be via a for loop:


net = ...  
for i in range(3):  
  net = slim.conv2d(net, 256, [3, 3], scope='conv3_%d' % (i+1))
net = slim.max_pool2d(net, [2, 2], scope='pool2')  

This can be made even cleaner by using TF-Slim’s repeat operation:


net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')  
net = slim.max_pool2d(net, [2, 2], scope='pool2')  

Notice that the slim.repeat not only applies the same arguments in-line, it also is smart enough to unroll the scopes such that the scopes assigned to each subsequent call of slim.conv2d are appended with an underscore and the iteration number. More concretely, the scopes in the example above would be named 'conv3/conv3_1', 'conv3/conv3_2' and 'conv3/conv3_3'.
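
To see the resulting names for yourself, a small sketch (our own addition, not part of the original text) is to print the model variables that slim.repeat created under the 'conv3' scope:

net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
for var in slim.get_model_variables('conv3'):
    print(var.op.name)
# Expected names along the lines of:
#   conv3/conv3_1/weights, conv3/conv3_1/biases, conv3/conv3_2/weights, ...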

Furthermore, TF-Slim's slim.stack operator allows a caller to repeatedly apply the same operation with different arguments to create a stack or tower of layers. slim.stack also creates a new tf.variable_scope for each operation created. For example, a simple way to create a Multi-Layer Perceptron (MLP):

# Verbose way:  
x = slim.fully_connected(x, 32, scope='fc/fc_1')  
x = slim.fully_connected(x, 64, scope='fc/fc_2')  
x = slim.fully_connected(x, 128, scope='fc/fc_3')  

# Equivalent, TF-Slim way using slim.stack:  
x = slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')

In this example, slim.stack calls slim.fully_connected three times passing the output of one invocation of the function to the next. However, the number of hidden units in each invocation changes from 32 to 64 to 128. Similarly, one can use stack to simplify a tower of multiple convolutions:


# Verbose way:  
x = slim.conv2d(x, 32, [3, 3], scope='core/core_1')  
x = slim.conv2d(x, 32, [1, 1], scope='core/core_2')  
x = slim.conv2d(x, 64, [3, 3], scope='core/core_3')  
x = slim.conv2d(x, 64, [1, 1], scope='core/core_4')  

# Using stack:  
x = slim.stack(x, slim.conv2d, [(32, [3, 3]), (32, [1, 1]), (64, [3, 3]), (64, [1, 1])], scope='core')

Scopes
In addition to the types of scope mechanisms in TensorFlow (name_scope, variable_scope), TF-Slim adds a new scoping mechanism called arg_scope. This new scope allows a user to specify one or more operations and a set of arguments which will be passed to each of the operations defined in the arg_scope. This functionality is best illustrated by example. Consider the following code:

net = slim.conv2d(inputs, 64, [11, 11], 4, padding='SAME',  
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),  
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv1')  
net = slim.conv2d(net, 128, [11, 11], padding='VALID',  
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),  
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv2')  
net = slim.conv2d(net, 256, [11, 11], padding='SAME',  
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),  
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv3')  

It should be clear that these three convolution layers share many of the same hyperparameters. Two have the same padding, and all three have the same weights_initializer and weights_regularizer. This code is hard to read and contains a lot of repeated values that should be factored out. One solution would be to specify default values using variables:

padding = 'SAME'  
initializer = tf.truncated_normal_initializer(stddev=0.01)  
regularizer = slim.l2_regularizer(0.0005)  
net = slim.conv2d(inputs, 64, [11, 11], 4,  
                  padding=padding,  
                  weights_initializer=initializer,  
                  weights_regularizer=regularizer,  
                  scope='conv1')  
net = slim.conv2d(net, 128, [11, 11],  
                  padding='VALID',  
                  weights_initializer=initializer,  
                  weights_regularizer=regularizer,  
                  scope='conv2')  
net = slim.conv2d(net, 256, [11, 11],  
                  padding=padding,  
                  weights_initializer=initializer,  
                  weights_regularizer=regularizer,  
                  scope='conv3')  

This solution ensures that all three convolutions share the exact same parameter values but doesn't completely reduce the code clutter. By using an arg_scope, we can both ensure that each layer uses the same values and simplify the code:

with slim.arg_scope([slim.conv2d], padding='SAME',  
                      weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                      weights_regularizer=slim.l2_regularizer(0.0005)):  
    net = slim.conv2d(inputs, 64, [11, 11], scope='conv1')  
    net = slim.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')  
    net = slim.conv2d(net, 256, [11, 11], scope='conv3')  

As the example illustrates, the use of arg_scope makes the code cleaner, simpler and easier to maintain. Notice that while argument values are specified in the arg_scope, they can be overwritten locally. In particular, while the padding argument has been set to ‘SAME’, the second convolution overrides it with the value of ‘VALID’.
One can also nest arg_scopes and use multiple operations in the same scope. For example:

with slim.arg_scope([slim.conv2d, slim.fully_connected],  
                      activation_fn=tf.nn.relu,  
                      weights_initializer=tf.truncated_normal_initializer(stddev=0.01),  
                      weights_regularizer=slim.l2_regularizer(0.0005)):  
  with slim.arg_scope([slim.conv2d], stride=1, padding='SAME'):  
    net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')  
    net = slim.conv2d(net, 256, [5, 5],  
                      weights_initializer=tf.truncated_normal_initializer(stddev=0.03),  
                      scope='conv2')  
    net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc')  

In this example, the first arg_scope applies the same weights_initializer and weights_regularizer arguments to the conv2d and fully_connected layers in its scope. In the second arg_scope, additional default arguments for conv2d only are specified.
Working Example: Specifying the VGG16 Layers
By combining TF-Slim Variables, Operations and scopes, we can write a normally very complex network with very few lines of code. For example, the entire VGG architecture can be defined with just the following snippet:

def vgg16(inputs):  
  with slim.arg_scope([slim.conv2d, slim.fully_connected],  
                      activation_fn=tf.nn.relu,  
                      weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),  
                      weights_regularizer=slim.l2_regularizer(0.0005)):  
    net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')  
    net = slim.max_pool2d(net, [2, 2], scope='pool1')  
    net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')  
    net = slim.max_pool2d(net, [2, 2], scope='pool2')  
    net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')  
    net = slim.max_pool2d(net, [2, 2], scope='pool3')  
    net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')  
    net = slim.max_pool2d(net, [2, 2], scope='pool4')  
    net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')  
    net = slim.max_pool2d(net, [2, 2], scope='pool5')  
    net = slim.fully_connected(net, 4096, scope='fc6')  
    net = slim.dropout(net, 0.5, scope='dropout6')  
    net = slim.fully_connected(net, 4096, scope='fc7')  
    net = slim.dropout(net, 0.5, scope='dropout7')  
    net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')  
  return net  
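
As a rough usage sketch (the placeholder shape below is our own assumption, not part of the original snippet), the function is called like any other op-building function and creates all of its variables under the scopes shown above:

images = tf.placeholder(tf.float32, shape=[None, 224, 224, 3])
logits = vgg16(images)
model_vars = slim.get_model_variables()  # e.g. 'conv1/conv1_1/weights', 'fc8/weights', ...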

Training Models
Training TensorFlow models requires a model, a loss function, the gradient computation and a training routine that iteratively computes the gradients of the model weights relative to the loss and updates the weights accordingly. TF-Slim provides both common loss functions and a set of helper functions that run the training and evaluation routines.

Losses
The loss function defines a quantity that we want to minimize. For classification problems, this is typically the cross entropy between the true distribution and the predicted probability distribution across classes. For regression problems, this is often the sum-of-squares differences between the predicted and true values.


Certain models, such as multi-task learning models, require the use of multiple loss functions simultaneously. In other words, the loss function ultimately being minimized is the sum of various other loss functions. For example, consider a model that predicts both the type of scene in an image as well as the depth from the camera of each pixel. This model’s loss function would be the sum of the classification loss and depth prediction loss.


TF-Slim provides an easy-to-use mechanism for defining and keeping track of loss functions via the losses module. Consider the simple case where we want to train the VGG network:


import tensorflow as tf

slim = tf.contrib.slim
vgg = tf.contrib.slim.nets.vgg

# Load the images and labels.  
images, labels = ...  

# Create the model.  
predictions, _ = vgg.vgg_16(images)  

# Define the loss functions and get the total loss.  
loss = slim.losses.softmax_cross_entropy(predictions, labels)  

In this example, we start by creating the model (using TF-Slim's VGG implementation), and add the standard classification loss. Now, let's turn to the case where we have a multi-task model that produces multiple outputs:

# Load the images and labels.  
images, scene_labels, depth_labels = ...  

# Create the model.  
scene_predictions, depth_predictions = CreateMultiTaskModel(images)  

# Define the loss functions and get the total loss.  
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)  
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)  

# The following two lines have the same effect:  
total_loss = classification_loss + sum_of_squares_loss  
total_loss = slim.losses.get_total_loss(add_regularization_losses=False)  

In this example, we have two losses which we add by calling slim.losses.softmax_cross_entropy and slim.losses.sum_of_squares. We can obtain the total loss by adding them together (total_loss) or by calling slim.losses.get_total_loss(). How did this work? When you create a loss function via TF-Slim, TF-Slim adds the loss to a special TensorFlow collection of loss functions. This enables you to either manage the total loss manually, or allow TF-Slim to manage them for you.

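As a small illustrative sketch (not part of the original example), the collection can be inspected directly; both lookups below refer to the same loss tensors created above:

# Losses created through slim.losses.* end up in the standard losses collection.
losses_from_collection = tf.get_collection(tf.GraphKeys.LOSSES)
losses_from_slim = slim.losses.get_losses()

# Regularization losses are kept in their own collection.
regularization_losses = slim.losses.get_regularization_losses()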

What if you want to let TF-Slim manage the losses for you but have a custom loss function? loss_ops.py also has a function that adds this loss to TF-Slim's collection. For example:

# Load the images and labels.  
images, scene_labels, depth_labels, pose_labels = ...  

# Create the model.  
scene_predictions, depth_predictions, pose_predictions = CreateMultiTaskModel(images)  

# Define the loss functions and get the total loss.  
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)  
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)  
pose_loss = MyCustomLossFunction(pose_predictions, pose_labels)  
slim.losses.add_loss(pose_loss) # Letting TF-Slim know about the additional loss.  

# The following two ways to compute the total loss are equivalent:  
regularization_loss = tf.add_n(slim.losses.get_regularization_losses())  
total_loss1 = classification_loss + sum_of_squares_loss + pose_loss + regularization_loss  

# (Regularization Loss is included in the total loss by default).  
total_loss2 = slim.losses.get_total_loss()  

In this example, we can again either produce the total loss function manually or let TF-Slim know about the additional loss and let TF-Slim handle the losses.


Training Loop
TF-Slim provides a simple but powerful set of tools for training models, found in learning.py. These include a Train function that repeatedly measures the loss, computes gradients and saves the model to disk, as well as several convenience functions for manipulating gradients. For example, once we've specified the model, the loss function and the optimization scheme, we can call slim.learning.create_train_op and slim.learning.train to perform the optimization:

g = tf.Graph()  

# Create the model and specify the losses...  
...  

total_loss = slim.losses.get_total_loss()  
optimizer = tf.train.GradientDescentOptimizer(learning_rate)  

# create_train_op ensures that each time we ask for the loss, the update_ops  
# are run and the gradients being computed are applied too.  
train_op = slim.learning.create_train_op(total_loss, optimizer)  
logdir = ... # Where checkpoints are stored.  

slim.learning.train(  
    train_op,  
    logdir,  
    number_of_steps=1000,  
    save_summaries_secs=300,  
    save_interval_secs=600)

In this example, slim.learning.train is provided with the train_op, which is used to (a) compute the loss and (b) apply the gradient step. logdir specifies the directory where the checkpoints and event files are stored. We can limit the number of gradient steps taken to any number. In this case, we've asked for 1000 steps to be taken. Finally, save_summaries_secs=300 indicates that we'll compute summaries every 5 minutes and save_interval_secs=600 indicates that we'll save a model checkpoint every 10 minutes.
Working Example: Training the VGG16 Model
To illustrate this, let's examine the following sample of training the VGG network:

import tensorflow as tf  

slim = tf.contrib.slim  
vgg = tf.contrib.slim.nets.vgg  

...  

train_log_dir = ...  
if not tf.gfile.Exists(train_log_dir):  
  tf.gfile.MakeDirs(train_log_dir)  

with tf.Graph().as_default():  
  # Set up the data loading:  
  images, labels = ...  

  # Define the model:  
  predictions, _ = vgg.vgg_16(images, is_training=True)

  # Specify the loss function:  
  slim.losses.softmax_cross_entropy(predictions, labels)  

  total_loss = slim.losses.get_total_loss()  
  tf.summary.scalar('losses/total_loss', total_loss)  

  # Specify the optimization scheme:  
  optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001)  

  # create_train_op that ensures that when we evaluate it to get the loss,  
  # the update_ops are done and the gradient updates are computed.  
  train_tensor = slim.learning.create_train_op(total_loss, optimizer)  

  # Actually runs training.  
  slim.learning.train(train_tensor, train_log_dir)  

Fine-Tuning Existing Models
Brief Recap on Restoring Variables from a Checkpoint
After a model has been trained, it can be restored using tf.train.Saver(), which restores Variables from a given checkpoint. For many cases, tf.train.Saver() provides a simple mechanism to restore all or just a few variables.

# Create some variables.  
v1 = tf.Variable(..., name="v1")  
v2 = tf.Variable(..., name="v2")  
...  
# Add ops to restore all the variables.  
restorer = tf.train.Saver()  

# Add ops to restore some variables.  
restorer = tf.train.Saver([v1, v2])  

# Later, launch the model, use the saver to restore variables from disk, and  
# do some work with the model.  
with tf.Session() as sess:  
  # Restore variables from disk.  
  restorer.restore(sess, "/tmp/model.ckpt")  
  print("Model restored.")  
  # Do some work with the model  


See Restoring Variables and Choosing which Variables to Save and Restore sections of the Variables page for more details.

Partially Restoring Models
It is often desirable to fine-tune a pre-trained model on an entirely new dataset or even a new task. In these situations, one can use TF-Slim’s helper functions to select a subset of variables to restore:


# Create some variables.  
v1 = slim.variable(name="v1", ...)  
v2 = slim.variable(name="nested/v2", ...)  
...  

# Get list of variables to restore (which contains only 'v2'). These are all  
# equivalent methods:  
variables_to_restore = slim.get_variables_by_name("v2")  
# or  
variables_to_restore = slim.get_variables_by_suffix("2")  
# or  
variables_to_restore = slim.get_variables(scope="nested")  
# or  
variables_to_restore = slim.get_variables_to_restore(include=["nested"])  
# or  
variables_to_restore = slim.get_variables_to_restore(exclude=["v1"])  

# Create the saver which will be used to restore the variables.  
restorer = tf.train.Saver(variables_to_restore)  

with tf.Session() as sess:  
  # Restore variables from disk.  
  restorer.restore(sess, "/tmp/model.ckpt")  
  print("Model restored.")  
  # Do some work with the model  
  ...  

Restoring models with different variable names
When restoring variables from a checkpoint, the Saver locates the variable names in a checkpoint file and maps them to variables in the current graph. Above, we created a saver by passing to it a list of variables. In this case, the names of the variables to locate in the checkpoint file were implicitly obtained from each provided variable's var.op.name.
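
As a brief sketch (the variable names here are hypothetical), passing a list to tf.train.Saver is equivalent to passing a dictionary keyed by each variable's op name, which is exactly what gets looked up in the checkpoint:

v1 = tf.Variable(tf.zeros([10]), name="v1")
v2 = tf.Variable(tf.zeros([5]), name="nested/v2")

# These two savers behave identically: both look up 'v1' and 'nested/v2'
# in the checkpoint file.
saver_from_list = tf.train.Saver([v1, v2])
saver_from_dict = tf.train.Saver({v.op.name: v for v in [v1, v2]})
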
This works well when the variable names in the checkpoint file match those in the graph. However, sometimes we want to restore a model from a checkpoint whose variables have different names than those in the current graph. In this case, we must provide the Saver a dictionary that maps from each checkpoint variable name to each graph variable. Consider the following example where the checkpoint variable names are obtained via a simple function:

# Assuming that 'conv1/weights' should be restored from 'vgg16/conv1/weights'
def name_in_checkpoint(var):  
  return 'vgg16/' + var.op.name  

# Assuming that 'conv1/weights' and 'conv1/bias' should be restored from 'conv1/params1' and 'conv1/params2'
def name_in_checkpoint(var):  
  if "weights" in var.op.name:  
    return var.op.name.replace("weights", "params1")  
  if "bias" in var.op.name:  
    return var.op.name.replace("bias", "params2")  

variables_to_restore = slim.get_model_variables()  
variables_to_restore = {name_in_checkpoint(var):var for var in variables_to_restore}  
restorer = tf.train.Saver(variables_to_restore)  

with tf.Session() as sess:  
  # Restore variables from disk.  
  restorer.restore(sess, "/tmp/model.ckpt")  

Fine-Tuning a Model on a different task
Consider the case where we have a pre-trained VGG16 model. The model was trained on the ImageNet dataset, which has 1000 classes. However, we would like to apply it to the Pascal VOC dataset which has only 20 classes. To do so, we can initialize our new model using the values of the pre-trained model excluding the final layer:


# Load the Pascal VOC data  
image, label = MyPascalVocDataLoader(...)  
images, labels = tf.train.batch([image, label], batch_size=32)  

# Create the model  
predictions, _ = vgg.vgg_16(images)

train_op = slim.learning.create_train_op(...)  

# Specify where the Model, trained on ImageNet, was saved.  
model_path = '/path/to/pre_trained_on_imagenet.checkpoint'  
# Specify where the new model will live:
log_dir = '/path/to/my_pascal_model_dir/'

# Restore only the convolutional layers:  
variables_to_restore = slim.get_variables_to_restore(exclude=['fc6', 'fc7', 'fc8'])  
init_fn = slim.assign_from_checkpoint_fn(model_path, variables_to_restore)

# Start training.  
slim.learning.train(train_op, log_dir, init_fn=init_fn)  

Evaluating Models.
Once we've trained a model (or even while the model is busy training) we'd like to see how well the model performs in practice. This is accomplished by picking a set of evaluation metrics, which will grade the model's performance, and the evaluation code which actually loads the data, performs inference, compares the results to the ground truth and records the evaluation scores. This step may be performed once or repeated periodically.
Metrics
We define a metric to be a performance measure that is not a loss function (losses are directly optimized during training), but which we are still interested in for the purpose of evaluating our model. For example, we might want to minimize log loss, but our metrics of interest might be F1 score (test accuracy), or Intersection Over Union score (which are not differentiable, and therefore cannot be used as losses).
TF-Slim provides a set of metric operations that makes evaluating models easy. Abstractly, computing the value of a metric can be divided into three parts:

Initialization: initialize the variables used to compute the metrics.
Aggregation: perform operations (sums, etc) used to compute the metrics.
Finalization: (optionally) perform any final operation to compute metric values. For example, computing means, mins, maxes, etc.

For example, to compute mean_absolute_error, two variables, a count and a total, are initialized to zero. During aggregation, we observe some set of predictions and labels, compute their absolute differences and add the sum to total. Each time we observe another value, count is incremented. Finally, during finalization, total is divided by count to obtain the mean.
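
The following sketch (our own illustration with hypothetical placeholder inputs; it is not how metric_ops.py is literally written) shows those three phases with plain TensorFlow ops:

import tensorflow as tf

predictions = tf.placeholder(tf.float32, shape=[None])  # hypothetical inputs
labels = tf.placeholder(tf.float32, shape=[None])

# Initialization: the metric state lives in local (unsaved) variables set to zero.
total = tf.Variable(0.0, trainable=False, collections=[tf.GraphKeys.LOCAL_VARIABLES])
count = tf.Variable(0.0, trainable=False, collections=[tf.GraphKeys.LOCAL_VARIABLES])

# Aggregation: each run of update_op folds one batch into the running sums.
abs_errors = tf.abs(predictions - labels)
update_total = tf.assign_add(total, tf.reduce_sum(abs_errors))
update_count = tf.assign_add(count, tf.to_float(tf.size(abs_errors)))
update_op = tf.group(update_total, update_count)

# Finalization: the value is read out as total / count.
mean_absolute_error = total / count
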
The following example demonstrates the API for declaring metrics. Because metrics are often evaluated on a test set which is different from the training set (upon which the loss is computed), we’ll assume we’re using test data:


images, labels = LoadTestData(...)  
predictions = MyModel(images)  

mae_value_op, mae_update_op = slim.metrics.streaming_mean_absolute_error(predictions, labels)  
mre_value_op, mre_update_op = slim.metrics.streaming_mean_relative_error(predictions, labels)  
pl_value_op, pl_update_op = slim.metrics.percentage_less(mean_relative_errors, 0.3)  

As the example illustrates, the creation of a metric returns two values: a value_op and an update_op. The value_op is an idempotent operation that returns the current value of the metric. The update_op is an operation that performs the aggregation step mentioned above as well as returning the value of the metric.

Keeping track of each value_op and update_op can be laborious. To deal with this, TF-Slim provides two convenience functions:

# Aggregates the value and update ops in two lists:  
value_ops, update_ops = slim.metrics.aggregate_metrics(  
    slim.metrics.streaming_mean_absolute_error(predictions, labels),  
    slim.metrics.streaming_mean_squared_error(predictions, labels))  

# Aggregates the value and update ops in two dictionaries:  
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({  
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),  
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),  
})  

Working example: Tracking Multiple Metrics
Putting it all together:


import tensorflow as tf  

slim = tf.contrib.slim  
vgg = tf.contrib.slim.nets.vgg  


# Load the data  
images, labels = load_data(...)  

# Define the network  
predictions, _ = vgg.vgg_16(images)

# Choose the metrics to compute:  
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({  
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),  
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),  
})  

# Evaluate the model using 1000 batches of data:  
num_batches = 1000  

with tf.Session() as sess:  
  sess.run(tf.global_variables_initializer())  
  sess.run(tf.local_variables_initializer())  

  for batch_id in range(num_batches):  
    sess.run(names_to_updates.values())  

  metric_values = sess.run(names_to_values.values())  
  for metric, value in zip(names_to_values.keys(), metric_values):  
    print('Metric %s has value: %f' % (metric, value))  

Note that metric_ops.py can be used in isolation without using either layers.py or loss_ops.py.
Evaluation Loop
TF-Slim provides an evaluation module (evaluation.py), which contains helper functions for writing model evaluation scripts using metrics from the metric_ops.py module. These include a function for periodically running evaluations, evaluating metrics over batches of data, and printing and summarizing metric results. For example:

import math

import tensorflow as tf

slim = tf.contrib.slim

# Load the data
images, labels = load_data(...)

# Define the network
predictions = MyModel(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'accuracy': slim.metrics.accuracy(predictions, labels),
    'precision': slim.metrics.precision(predictions, labels),
    'recall': slim.metrics.recall(mean_relative_errors, 0.3),
})

# Create the summary ops such that they also print out to std output:
summary_ops = []
for metric_name, metric_value in names_to_values.items():
  op = tf.summary.scalar(metric_name, metric_value)
  op = tf.Print(op, [metric_value], metric_name)
  summary_ops.append(op)

num_examples = 10000
batch_size = 32
num_batches = math.ceil(num_examples / float(batch_size))

# Setup the global step.
slim.get_or_create_global_step()

checkpoint_dir = ... # Where the model checkpoints are stored.
log_dir = ... # Where the evaluation summaries are stored.
eval_interval_secs = ... # How often to run the evaluation.
slim.evaluation.evaluation_loop(
    'local',
    checkpoint_dir,
    log_dir,
    num_evals=num_batches,
    eval_op=names_to_updates.values(),
    summary_op=tf.summary.merge(summary_ops),
    eval_interval_secs=eval_interval_secs)