Introduction to Chainer, a Python Deep Learning Framework

Deep Learning with Python

Introduction to Chainer

Posted by 徐志平 on December 14, 2017

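This is the first part of the Chainer tutorial. In this part, you will learn:

  • The pros and cons of existing frameworks and why we are developing Chainer
  • Simple examples of forward and backward computation
  • How to use links and compute their gradients
  • How to construct chains (a.k.a. "models" in most frameworks)
  • Parameter optimization
  • Serialization of links and optimizers

After reading this part, you will be able to:

  • Compute the gradients of some arithmetic expressions
  • Write a multi-layer perceptron with Chainer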

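Core Concept

As mentioned earlier, Chainer is a flexible framework for neural networks. Flexibility is our main goal: it should let us write complex architectures simply and intuitively.

Most existing deep learning frameworks are based on the "Define-and-Run" scheme. That is, a network is first defined and fixed, and then the user repeatedly feeds it with mini-batches for training. Since the network is statically defined before any forward or backward computation, all the logic must be embedded into the network architecture as data. Consequently, in frameworks such as Caffe, the network architecture is defined declaratively. (Note: a static network can still be defined with imperative code, e.g. with torch.nn, Theano-based frameworks, and TensorFlow.)

Define-by-Run

In contrast, Chainer adopts a "Define-by-Run" scheme, i.e., the network is defined on the fly by the actual forward computation. More precisely, Chainer stores the history of computation instead of the programming logic. This strategy lets us fully leverage the power of programming logic in Python. For example, Chainer does not need any magic to introduce conditionals and loops into network definitions. Define-by-Run is the core concept of Chainer. We will show in this tutorial how to define networks dynamically.

This strategy also makes it easy to write multi-GPU parallelization, since the logic stays closer to network manipulation. We will review these facilities in later sections of this tutorial.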
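To make "storing the history of computation" concrete, here is a minimal sketch in plain Python. The `Var` class and the `forward` function are hypothetical illustrations and not Chainer's implementation; the point is only that the graph is whatever the Python loop actually executed, and that the gradient is obtained by walking that recorded history backwards.

```python
class Var:
    """Hypothetical scalar variable that records how it was computed."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        # Each entry is (parent_var, local_derivative_wrt_that_parent).
        self.parents = parents

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self):
        # Collect the recorded history in post-order, then apply the chain
        # rule from the output back towards the inputs.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node.parents:
                parent.grad += node.grad * local


def forward(x, n_steps):
    # Ordinary Python control flow decides the shape of the graph.
    h = x
    for _ in range(n_steps):
        h = h * x
    return h


x = Var(2.0)
y = forward(x, 2)      # two loop iterations, so y = x**3
y.backward()
print(y.value, x.grad)  # 8.0 12.0
```

Calling `forward(x, 1)` instead would record a different, shorter graph (y = x**2) without any change to the class itself.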

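Chainer represents a network as an execution path on a computational graph. A computational graph is a series of function applications, so it can be described with multiple Function objects. When such a function is a neural network layer, its parameters are updated through training. The function therefore needs to keep trainable parameters inside, so Chainer has the Link class, which can hold trainable parameters in its instances. The parameters of the function performed inside a Link object are represented as Variable objects. In short, the difference between Link and Function is whether the object contains trainable parameters. A neural network model is typically described as a series of Links and Functions.

You can build a computational graph by dynamically "chaining" various Links and Functions to define a Chain. In the framework, the network is defined by running the chained graph, hence the name Chainer.

In the example code of this tutorial, we assume for simplicity that the following statements have already been executed: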

import numpy as np
import chainer
from chainer import cuda, Function, gradient_check, report, training, utils, Variable
from chainer import datasets, iterators, optimizers, serializers
from chainer import Link, Chain, ChainList
import chainer.functions as F
import chainer.links as L
from chainer.training import extensions

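These imports appear widely in Chainer code and examples. For simplicity, we omit them in this tutorial.

Forward/Backward Computation

As described above, Chainer uses the "Define-by-Run" scheme, so the forward computation itself defines the network. To start a forward computation, we have to wrap the input array in a Variable object. Here we start with a simple ndarray that has only one element: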

x_data = np.array([5], dtype=np.float32)
x = Variable(x_data)

Variable objects support basic arithmetic operators. To compute $y = x^2 - 2x + 1$, just write:

y = x**2 - 2 * x + 1

The resulting y is also a Variable object, whose value can be extracted by accessing the data attribute:

y.data
array([ 16.], dtype=float32)

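y holds more than the resulting value. It also holds the history of computation (i.e., the computational graph), which makes it possible to compute its derivative. This is done by calling its backward() method: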

y.backward()

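This runs error backpropagation (a.k.a. backprop or reverse-mode automatic differentiation). The gradient is then computed and stored in the grad attribute of the input variable x: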

x.grad
array([ 8.], dtype=float32)
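The value can be cross-checked without any framework. Analytically, dy/dx = 2x - 2, which is 8 at x = 5. The short sketch below confirms this with a central finite difference; the helper name `central_difference` and the step size are our own choices for illustration.

```python
def f(x):
    # Same expression as in the tutorial: y = x^2 - 2x + 1
    return x**2 - 2 * x + 1


def central_difference(func, x, eps=1e-3):
    # Symmetric finite-difference approximation of the derivative at x.
    return (func(x + eps) - func(x - eps)) / (2 * eps)


value = f(5.0)
approx_grad = central_difference(f, 5.0)
print(value)                     # 16.0
print(abs(approx_grad - 8.0) < 1e-6)  # True
```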

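We can also compute the gradients of intermediate variables. Note that, by default, Chainer releases the gradient arrays of intermediate variables for memory efficiency. To preserve the gradient information, pass the retain_grad argument to the backward method: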

z = 2*x
y = x**2 - z + 1
y.backward(retain_grad=True)
z.grad
array([-1.], dtype=float32)

Otherwise, z.grad will be None, as follows:

z = 2*x
y = x**2 - z + 1
y.backward()
z.grad
z.grad is None
True

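All these computations generalize easily to multi-element array input. Note that if we want to start the backward computation from a variable holding a multi-element array, we must set the initial error manually. When the size of a variable (i.e., the number of elements in its array) is 1, it is regarded as a variable representing a loss value, so its grad attribute is automatically filled with 1. When the size of a variable is larger than 1, on the other hand, the grad attribute remains None, and the initial error has to be set explicitly before running backward(). This is done simply by setting the grad attribute of the output variable, as follows: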

x = Variable(np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32))
y = x**2 - 2*x + 1
y.grad = np.ones((2, 3), dtype=np.float32)
y.backward()
x.grad
array([[  0.,   2.,   4.],
       [  6.,   8.,  10.]], dtype=float32)
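Setting y.grad to an array of ones amounts to differentiating the sum of all elements of y, so every entry of x.grad should equal 2x - 2. The NumPy-only sketch below applies that element-wise chain rule by hand; no Chainer is involved, and the names `gy` and `gx` are illustrative.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32)
gy = np.ones((2, 3), dtype=np.float32)   # initial error, same shape as y

# y = x**2 - 2*x + 1 is element-wise, so dy/dx = 2*x - 2 element-wise.
gx = gy * (2 * x - 2)
print(gx)
```

A different initial error, for example `gy` filled with 0.5, would simply scale every entry of `gx` by the same factor.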

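Many functions that take Variable objects are defined in the functions module. You can combine them to build complicated functions with automatic backward computation.

Links

To write neural networks, we have to combine functions with parameters and optimize the parameters. You can use links to do this. A Link is an object that holds parameters (i.e., optimization targets).

The most fundamental links behave like regular functions. We will introduce higher-level links later, but for now think of a link simply as a function with parameters.

One of the most frequently used links is the Linear link (a.k.a. fully-connected layer or affine transformation). It represents the mathematical function $f(x) = Wx + b$, where the matrix W and the vector b are parameters. This link corresponds to linear(), which accepts x, W, and b as arguments. A Linear link from three-dimensional space to two-dimensional space is defined by the following line: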

f = L.Linear(3, 2)

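Most functions and links accept only mini-batch input, where the first dimension of the input array is treated as the batch dimension. In the Linear link case above, the input must have shape (N, 3), where N is the mini-batch size.

The parameters of a link are stored as attributes. Each parameter is an instance of Variable. For the Linear link, two parameters, W and b, are stored. By default, the matrix W is initialized randomly, while the vector b is initialized with zeros.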

f.W.data
array([[ 0.19792122,  0.29951876, -0.31833425],
       [-0.59501284, -0.65519476, -0.00605371]], dtype=float32)
f.b.data
array([ 0.,  0.], dtype=float32)

An instance of the Linear link acts like a usual function:

x = Variable(np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32))
y = f(x)
y.data
array([[-0.15804404, -1.9235636 ],
       [ 0.37927318, -5.69234705]], dtype=float32)
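The printed result can be reproduced by hand. Since W has shape (2, 3) and the input has shape (N, 3), the batched form of f(x) = Wx + b is `x @ W.T + b`. The NumPy sketch below uses the parameter values from the f.W.data and f.b.data printout above (they come from one random initialization, so another run would show different numbers).

```python
import numpy as np

# Parameter values copied from the printout of f.W.data and f.b.data.
W = np.array([[0.19792122, 0.29951876, -0.31833425],
              [-0.59501284, -0.65519476, -0.00605371]], dtype=np.float32)
b = np.zeros(2, dtype=np.float32)

x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32)  # shape (N, 3), N = 2

# One row of output per example in the mini-batch.
y = x @ W.T + b
print(y.shape)  # (2, 2)
print(y)
```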

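Computing the dimension of the input space is sometimes cumbersome. The Linear link and some of the (de)convolution links can omit the input dimension at instantiation and infer it from the first mini-batch.

For example, the following line creates a Linear link whose output dimension is two: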

g = L.Linear(2)

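If we feed a mini-batch of shape (N, M), the input dimension is inferred as M, which means g.W will be a 2 × M matrix. Note that the parameters are initialized lazily at the first mini-batch. Therefore, g has no initialized W until data has been passed through the link.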
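The idea of deferred initialization can be sketched in a few lines of NumPy. The `LazyLinear` class below is a hypothetical stand-in written for this explanation, not Chainer's L.Linear; it only shows how the weight shape can be fixed the first time a mini-batch arrives.

```python
import numpy as np


class LazyLinear:
    """Hypothetical stand-in for a link with deferred initialization."""

    def __init__(self, out_size):
        self.out_size = out_size
        self.W = None                                  # unknown until first call
        self.b = np.zeros(out_size, dtype=np.float32)

    def __call__(self, x):
        if self.W is None:
            in_size = x.shape[1]                       # M, inferred from (N, M)
            rng = np.random.RandomState(0)
            self.W = rng.randn(self.out_size, in_size).astype(np.float32)
        return x @ self.W.T + self.b


g = LazyLinear(2)
print(g.W)                        # None: nothing has been fed in yet
y = g(np.zeros((5, 7), dtype=np.float32))
print(g.W.shape, y.shape)         # (2, 7) (5, 2)
```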

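Gradients of the parameters are computed by the backward() method. Note that the method accumulates gradients rather than overwriting them, so you must clear the gradients first to start a fresh computation. This can be done by calling the cleargrads() method.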

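x = Variable(np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32))
g = L.Linear(2)
p = g(x)
p
variable([[-2.64461255,  2.90179563],
          [-6.81166267,  4.94405651]])
g.cleargrads()
# set the initial error on the output p (not on the link g), then backpropagate
p.grad = np.ones((2, 2), dtype=np.float32)
p.backward()
g.W.grad
array([[ 5.,  7.,  9.],
       [ 5.,  7.,  9.]], dtype=float32)
g.b.grad
array([ 2.,  2.], dtype=float32)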
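For a linear map y = x Wᵀ + b, the parameter gradients have a closed form: the gradient with respect to W is gyᵀ x, and the gradient with respect to b is gy summed over the batch. The NumPy sketch below (illustrative names, not Chainer internals) also shows why accumulation matters: running the backward step twice without clearing doubles the stored gradient.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32)
gy = np.ones((2, 2), dtype=np.float32)      # upstream gradient, shape (N, 2)

gW = np.zeros((2, 3), dtype=np.float32)     # plays the role of W.grad
gb = np.zeros(2, dtype=np.float32)          # plays the role of b.grad


def backward_step():
    # Gradients are added to the buffers, not written over them.
    gW[...] += gy.T @ x
    gb[...] += gy.sum(axis=0)


backward_step()
first_gW, first_gb = gW.copy(), gb.copy()

backward_step()                             # no clearing in between
doubled = np.array_equal(gW, 2 * first_gW)

gW.fill(0)                                  # what clearing the gradients amounts to
gb.fill(0)
backward_step()
print(first_gW)
print(first_gb)
print(doubled)                              # True
```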

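Write a Model as a Chain

Most neural network architectures contain multiple links. For example, a multi-layer perceptron consists of multiple linear layers. We can write complex procedures with trainable parameters by combining multiple links: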

l1 = L.Linear(4, 3)
l2 = L.Linear(3, 2)

def my_forward(x):
    h = l1(x)
    return l2(h)

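Here L refers to the links module. A procedure with parameters defined in this way is hard to reuse. A more Pythonic way is to combine the links and the procedure into a class: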

class MyProc(object):
    def __init__(self):
        self.l1 = L.Linear(4, 3)
        self.l2 = L.Linear(3, 2)

    def forward(self, x):
        h = self.l1(x)
        return self.l2(h)

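To make it more reusable, we want to support parameter management, CPU/GPU migration, robust and flexible save/load features, and so on. These features are all supported by the Chain class in Chainer. So all we have to do is define the class above as a subclass of Chain: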

class MyChain(Chain):
    def __init__(self):
        super(MyChain, self).__init__()
        with self.init_scope():
            self.l1 = L.Linear(4, 3)
            self.l2 = L.Linear(3, 2)
            
    def __call__(self, x):
        h = self.l1(x)
        return self.l2(h)

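This shows how a complex chain is built from simpler links. Links such as l1 and l2 are called child links of MyChain. Note that Chain itself inherits from Link. This means we can define more complex chains that hold MyChain objects as their child links.

We often define the forward computation of a link with the __call__ operator. Such links and chains are callable and behave like regular functions of Variables.

Another way to define a chain is to use the ChainList class, which behaves like a list of links: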

class MyChain2(ChainList):
    def __init__(self):
        super(MyChain2, self).__init__(
            L.Linear(4, 3),
            L.Linear(3, 2),
        )

    def __call__(self, x):
        h = self[0](x)
        return self[1](h)

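ChainList is convenient for using an arbitrary number of links, but if the number of links is fixed as in the case above, the Chain class is recommended as the base class.

Optimizer

To get good parameter values, we have to optimize them with the Optimizer class. It runs a numerical optimization algorithm on a given link. Many algorithms are implemented in the optimizers module. Here we use the simplest one, called Stochastic Gradient Descent (SGD):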

model = MyChain()
optimizer = optimizers.SGD()
optimizer.setup(model)

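The setup() method prepares the optimizer for the given link.

Some parameter/gradient manipulations, e.g. weight decay and gradient clipping, can be done by setting hook functions on the optimizer. Hook functions are called after the gradient computation and right before the actual parameter update. For example, we can set weight decay regularization by running the following line beforehand: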

optimizer.add_hook(chainer.optimizer.WeightDecay(0.0005))

Of course, you can write your own hook functions. A hook should be a function or a callable object that takes the optimizer as its argument.
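As a rough picture of where a hook sits in the update, the NumPy sketch below applies a weight-decay step (adding rate × parameter to the gradient) and then a plain SGD step. It is deliberately simplified: the two functions operate on a single array rather than on an optimizer object, and their names as well as the learning rate of 0.01 are assumptions made for illustration.

```python
import numpy as np


def weight_decay_hook(param, grad, rate):
    # Runs after the gradient is computed and before the update.
    return grad + rate * param


def sgd_step(param, grad, lr):
    # Plain gradient descent: move against the gradient.
    return param - lr * grad


param = np.array([1.0, -2.0])
grad = np.array([0.5, 0.5])

grad = weight_decay_hook(param, grad, rate=0.0005)
param = sgd_step(param, grad, lr=0.01)
print(grad)
print(param)
```

The decay term pulls each parameter slightly towards zero on every update, in addition to the step dictated by the loss gradient.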

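There are two ways to use the optimizer. One is to use it through a Trainer, which we will see in the following sections. The other is to use it directly. Here we review the latter case. If you are only interested in the simple way of using the optimizer, skip this section and go to the next one.

There are again two ways to use the optimizer directly. One is to compute the gradients manually and then call the update() method with no arguments. Do not forget to clear the gradients beforehand!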

x = np.random.uniform(-1, 1, (2, 4)).astype('f')
model.cleargrads()
# compute gradient here...
loss = F.sum(model(chainer.Variable(x)))
loss.backward()
optimizer.update()

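The other way is to pass a loss function to the update() method. In this case, cleargrads() is called automatically by update(), so the user does not have to call it manually.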

def lossfun(arg1, arg2):
    # calculate loss
    loss = F.sum(model(arg1 - arg2))
    return loss
arg1 = np.random.uniform(-1, 1, (2, 4)).astype('f')
arg2 = np.random.uniform(-1, 1, (2, 4)).astype('f')
optimizer.update(lossfun, chainer.Variable(arg1), chainer.Variable(arg2))

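Trainer

When we want to train neural networks, we have to run a training loop that updates the parameters many times. A typical training loop consists of the following procedures:

  1. Iterations over training datasets
  2. Preprocessing of extracted mini-batches
  3. Forward/backward computations of the neural networks
  4. Parameter updates
  5. Evaluations of the current parameters on validation datasets
  6. Logging and printing of the intermediate results

Chainer provides a simple yet powerful way to make such training processes easy to write. The training loop abstraction mainly consists of two components:

  • Dataset abstraction. It implements 1 and 2 in the above list. The core components are defined in the dataset module. There are also many implementations of datasets and iterators in the datasets and iterators modules.

  • Trainer. It implements 3, 4, 5, and 6 in the above list. The whole procedure is implemented by Trainer. The way the parameters are updated (3 and 4) is defined by Updater, which can be freely customized. 5 and 6 are implemented by instances of Extension, which attach extra procedures to the training loop. Users can freely customize the training procedure by adding extensions, and can also implement their own.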
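For orientation, the six steps of the list above can be written out as a hand-rolled loop. The sketch below uses only NumPy and a toy linear regression problem; the data, the model, and all hyperparameters are made up for illustration, and none of this reflects how Trainer is implemented.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.uniform(-1, 1, (200, 3)).astype(np.float32)
true_w = np.array([1.0, -2.0, 0.5], dtype=np.float32)
T = X @ true_w                                     # noiseless toy targets
train_X, train_T = X[:160], T[:160]
valid_X, valid_T = X[160:], T[160:]

w = np.zeros(3, dtype=np.float32)                  # the only parameter
lr, batch_size, n_epochs = 0.1, 20, 50
log = []

for epoch in range(1, n_epochs + 1):
    order = rng.permutation(len(train_X))          # 1. iterate over the training set
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]      # 2. extract a mini-batch
        xb, tb = train_X[idx], train_T[idx]
        err = xb @ w - tb                          # 3. forward computation ...
        grad = 2 * xb.T @ err / len(idx)           #    ... and backward computation
        w -= lr * grad                             # 4. parameter update
    valid_loss = float(np.mean((valid_X @ w - valid_T) ** 2))  # 5. validation
    log.append((epoch, valid_loss))                # 6. record intermediate results

print(log[0], log[-1])
```

In the Trainer abstraction, the inner loop body corresponds to the Updater, while steps 5 and 6 are the kind of work done by extensions.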

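Serializer

Before proceeding to the first example, we introduce Serializer, the last core feature described on this page. A serializer is a simple interface for serializing or deserializing an object. Link, Optimizer, and Trainer support serialization.

Concrete serializers are defined in the serializers module. It supports the NumPy NPZ and HDF5 formats.

For example, we can serialize a link object into an NPZ file with the serializers.save_npz() function: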

serializers.save_npz('my.model', model)

This saves the parameters of model into the file "my.model" in NPZ format. The saved model can be read back by the serializers.load_npz() function:

serializers.load_npz('my.model', model)

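Note that only the parameters and the persistent values are serialized by this code. Other attributes are not saved automatically. You can register arrays, scalars, or any serializable objects as persistent values with the Link.add_persistent() method. A registered value can be accessed as an attribute with the name passed to add_persistent.

The state of an optimizer can also be saved with the same functions: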

serializers.save_npz('my.state', optimizer)
serializers.load_npz('my.state', optimizer)

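Note that serialization of an optimizer only saves its internal state, including the number of iterations, the momentum vectors of MomentumSGD, etc. It does not save the parameters and persistent values of the target link. To resume the optimization from a saved state, we must explicitly save the target link together with the optimizer.

The HDF5 format is supported if the h5py package is installed. Serialization and deserialization in HDF5 format are almost identical to those in NPZ format; just replace save_npz() and load_npz() with save_hdf5() and load_hdf5(), respectively.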
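To give a feel for what an NPZ file holds, the sketch below saves a dictionary of named arrays with plain NumPy and reads it back. The key names `l1_W` and `l1_b` and the file name are hypothetical; the layout that Chainer's serializers actually write is not shown here.

```python
import os
import tempfile

import numpy as np

# Hypothetical parameter set: one weight matrix and one bias vector.
params = {
    'l1_W': np.arange(12, dtype=np.float32).reshape(3, 4),
    'l1_b': np.zeros(3, dtype=np.float32),
}

path = os.path.join(tempfile.mkdtemp(), 'my_model.npz')
np.savez(path, **params)                  # each array is stored under its key

with np.load(path) as archive:
    restored = {key: archive[key] for key in archive.files}

print(sorted(restored))                   # ['l1_W', 'l1_b']
```

An NPZ file is a zip archive of arrays indexed by name, which is why anything other than named arrays, such as arbitrary Python attributes, has to be registered explicitly to be saved.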

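Example: Multi-layer Perceptron on MNIST

Now you can solve a multiclass classification task using a multi-layer perceptron (MLP). We use the hand-written digits dataset called MNIST, one of the long-standing de facto "hello world" examples in machine learning. This MNIST example can also be found in the examples/mnist directory of the official repository. In this section we show how to use a trainer to construct and run the training loop.

We first have to prepare the MNIST dataset. It consists of 70,000 grayscale images of size 28×28 (i.e., 784 pixels) and the corresponding digit labels. By default, the dataset is divided into 60,000 training images and 10,000 test images. We can obtain the vectorized version (i.e., a set of 784-dimensional vectors) with datasets.get_mnist().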

train, test = datasets.get_mnist()

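This code automatically downloads the MNIST dataset and saves the NumPy arrays to the $(HOME)/.chainer directory. The returned train and test sets can be seen as lists of image-label pairs (strictly speaking, they are instances of TupleDataset).

We also have to define how to iterate over these datasets. We want to shuffle the training dataset at the beginning of every epoch, i.e., at every sweep over the dataset. In this case, we can use iterators.SerialIterator: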

train_iter = iterators.SerialIterator(train, batch_size=100, shuffle=True)

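On the other hand, we do not have to shuffle the test dataset. In this case, we can pass shuffle=False to disable shuffling. This makes the iteration faster when the underlying dataset supports fast slicing.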

test_iter = iterators.SerialIterator(test, batch_size=100, repeat=False, shuffle=False)

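By setting repeat=False, we stop the iteration once all examples have been visited. This option is usually required for the test/validation dataset; without it, the iteration enters an infinite loop.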
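The effect of the two options can be imitated with a small generator. `serial_batches` below is a hypothetical helper in plain Python, not the SerialIterator implementation, and it is simplified: each pass over the data simply ends with a shorter final batch.

```python
import itertools
import random


def serial_batches(dataset, batch_size, repeat=True, shuffle=True, seed=0):
    """Hypothetical generator mimicking the repeat/shuffle options."""
    rng = random.Random(seed)
    indices = list(range(len(dataset)))
    while True:
        if shuffle:
            rng.shuffle(indices)          # new order at every epoch
        for start in range(0, len(indices), batch_size):
            yield [dataset[i] for i in indices[start:start + batch_size]]
        if not repeat:
            return                        # one pass over the data, then stop


data = list(range(10))

# Test-style iteration: fixed order, exactly one pass.
test_batches = list(serial_batches(data, 4, repeat=False, shuffle=False))
print(test_batches)   # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

# Training-style iteration never ends by itself, so take only five batches.
endless = serial_batches(data, 4, repeat=True, shuffle=True)
first_five = list(itertools.islice(endless, 5))
print([len(batch) for batch in first_five])   # [4, 4, 2, 4, 4]
```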

Next, we define the architecture. We use a simple three-layer network with 100 units per layer.

class MLP(Chain):
    def __init__(self, n_units, n_out):
        super(MLP, self).__init__()
        with self.init_scope():
            # the size of the inputs to each layer will be inferred
            self.l1 = L.Linear(None, n_units)  # n_in -> n_units
            self.l2 = L.Linear(None, n_units)  # n_units -> n_units
            self.l3 = L.Linear(None, n_out)    # n_units -> n_out

    def __call__(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        y = self.l3(h2)
        return y

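This link uses relu() as its activation function. Note that the l3 link is the final fully-connected layer, whose output corresponds to the scores for the ten digits.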
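To see the shapes involved, the same forward pass can be written with NumPy alone. This is a sketch under assumptions: the weights are random with an arbitrary scale of 0.01, the helper `init_linear` is our own, and the input size of 784 is written out explicitly instead of being inferred.

```python
import numpy as np


def relu(a):
    # Element-wise max(a, 0).
    return np.maximum(a, 0)


def init_linear(rng, n_in, n_out):
    # Weight of shape (n_out, n_in) and a zero bias; the 0.01 scale is arbitrary.
    W = (0.01 * rng.randn(n_out, n_in)).astype(np.float32)
    return W, np.zeros(n_out, dtype=np.float32)


rng = np.random.RandomState(0)
l1 = init_linear(rng, 784, 100)   # n_in -> n_units
l2 = init_linear(rng, 100, 100)   # n_units -> n_units
l3 = init_linear(rng, 100, 10)    # n_units -> n_out


def mlp_forward(x):
    h1 = relu(x @ l1[0].T + l1[1])
    h2 = relu(h1 @ l2[0].T + l2[1])
    return h2 @ l3[0].T + l3[1]   # raw scores, no activation on the last layer


x = rng.rand(32, 784).astype(np.float32)   # a mini-batch of 32 fake images
y = mlp_forward(x)
print(y.shape)  # (32, 10)
```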

To compute loss values or evaluate the accuracy of the predictions, we define a classifier chain on top of the MLP chain above:

class Classifier(Chain):
    def __init__(self, predictor):
        super(Classifier, self).__init__()
        with self.init_scope():
            self.predictor = predictor

    def __call__(self, x, t):
        y = self.predictor(x)
        loss = F.softmax_cross_entropy(y, t)
        accuracy = F.accuracy(y, t)
        report({'loss': loss, 'accuracy': accuracy}, self)
        return loss

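This Classifier class computes accuracy and loss, and returns the loss value. The pair of arguments x and t corresponds to each example in the datasets (a tuple of an image and a label). softmax_cross_entropy() computes the loss value given the prediction and the ground-truth labels. accuracy() computes the prediction accuracy. We can set an arbitrary predictor link on an instance of the classifier.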
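The two quantities can be written out in NumPy to show what is being measured. The functions below are our own sketch and not the Chainer functions of the same names; in particular, averaging the loss over the mini-batch is an assumption of this sketch.

```python
import numpy as np


def softmax_cross_entropy(y, t):
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = y - y.max(axis=1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Negative log-probability of the true class, averaged over the batch.
    return -log_softmax[np.arange(len(t)), t].mean()


def accuracy(y, t):
    # Fraction of examples whose highest score is at the true label.
    return (y.argmax(axis=1) == t).mean()


y = np.array([[2.0, 0.0],      # scores favour class 0 (correct)
              [0.0, 2.0]])     # scores favour class 1 (wrong, label is 0)
t = np.array([0, 0])

loss = softmax_cross_entropy(y, t)
acc = accuracy(y, t)
print(loss, acc)
```

The loss keeps decreasing as the score of the true class grows relative to the others, whereas the accuracy only changes when the arg-max flips, which is why the loss is the quantity that gets optimized.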

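The report() function reports the loss and accuracy values to the trainer. For the specific mechanism of collecting training statistics, see Reporter. You can also collect other types of observations, such as activation statistics, in a similar way.

Note that a class similar to the Classifier above is defined as chainer.links.Classifier. So instead of the example above, we will use this predefined Classifier chain.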

model = L.Classifier(MLP(100, 10))  # the input size, 784, is inferred
optimizer = optimizers.SGD()
optimizer.setup(model)

Now we can build a trainer object.

updater = training.StandardUpdater(train_iter, optimizer)
trainer = training.Trainer(updater, (20, 'epoch'), out='result')

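The second argument (20, 'epoch') represents the duration of training. We can use either epoch or iteration as the unit. In this case, we train the multi-layer perceptron by iterating over the training set 20 times.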
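The relation between the two units follows directly from the dataset size and the batch size used above, as this small calculation shows:

```python
n_train = 60000      # training images in MNIST
batch_size = 100     # as passed to SerialIterator for the training set
n_epochs = 20        # from the stop trigger (20, 'epoch')

iters_per_epoch = n_train // batch_size
total_iters = iters_per_epoch * n_epochs
print(iters_per_epoch, total_iters)   # 600 12000
```

So, with these settings, a duration of 20 epochs corresponds to 12,000 iterations.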

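To invoke the training loop, we just call the run() method, which executes the whole training sequence.

The code above only optimizes the parameters. In most cases we want to see how the training proceeds, which we can do by adding extensions before calling the run method.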

trainer.extend(extensions.Evaluator(test_iter, model))
trainer.extend(extensions.LogReport())
trainer.extend(extensions.PrintReport(['epoch', 'main/accuracy', 'validation/main/accuracy']))
trainer.extend(extensions.ProgressBar())
trainer.run()
epoch       main/accuracy  validation/main/accuracy
1           0.6581         0.8475
2           0.868483       0.8922
3           0.893583       0.9065
4           0.90485        0.9154
5           0.9128         0.9222
6           0.9182         0.9251
7           0.923683       0.9281
8           0.927233       0.9312
9           0.931317       0.9341
10          0.934733       0.9369
11          0.937883       0.9414
12          0.940583       0.9438
13          0.942633       0.9451
14          0.945083       0.9465
15          0.947233       0.9495
[4A[J     total [######################################............] 76.67%
this epoch [################..................................] 33.33%
      9200 iter, 15 epoch / 20 epochs
    249.92 iters/sec. Estimated time to finish: 0:00:11.203418.
[4A[J     total [######################################............] 77.50%
this epoch [#########################.........................] 50.00%
      9300 iter, 15 epoch / 20 epochs
    250.17 iters/sec. Estimated time to finish: 0:00:10.792487.
[4A[J     total [#######################################...........] 78.33%
this epoch [#################################.................] 66.67%
      9400 iter, 15 epoch / 20 epochs
    250.43 iters/sec. Estimated time to finish: 0:00:10.382150.
[4A[J     total [#######################################...........] 79.17%
this epoch [#########################################.........] 83.33%
      9500 iter, 15 epoch / 20 epochs
    250.59 iters/sec. Estimated time to finish: 0:00:09.976316.
[4A[J16          0.949033       0.9496                    
[J     total [########################################..........] 80.00%
this epoch [..................................................]  0.00%
      9600 iter, 16 epoch / 20 epochs
    249.87 iters/sec. Estimated time to finish: 0:00:09.605143.
[4A[J     total [########################################..........] 80.83%
this epoch [########..........................................] 16.67%
      9700 iter, 16 epoch / 20 epochs
    250.05 iters/sec. Estimated time to finish: 0:00:09.197988.
[4A[J     total [########################################..........] 81.67%
this epoch [################..................................] 33.33%
      9800 iter, 16 epoch / 20 epochs
    250.32 iters/sec. Estimated time to finish: 0:00:08.788854.
[4A[J     total [#########################################.........] 82.50%
this epoch [#########################.........................] 50.00%
      9900 iter, 16 epoch / 20 epochs
    250.58 iters/sec. Estimated time to finish: 0:00:08.380646.
[4A[J     total [#########################################.........] 83.33%
this epoch [#################################.................] 66.67%
     10000 iter, 16 epoch / 20 epochs
    250.77 iters/sec. Estimated time to finish: 0:00:07.975449.
[4A[J     total [##########################################........] 84.17%
this epoch [#########################################.........] 83.33%
     10100 iter, 16 epoch / 20 epochs
    251.01 iters/sec. Estimated time to finish: 0:00:07.569486.
[4A[J17          0.9507         0.9526                    
[J     total [##########################################........] 85.00%
this epoch [..................................................]  0.00%
     10200 iter, 17 epoch / 20 epochs
    250.13 iters/sec. Estimated time to finish: 0:00:07.196375.
[4A[J     total [##########################################........] 85.83%
this epoch [########..........................................] 16.67%
     10300 iter, 17 epoch / 20 epochs
    250.15 iters/sec. Estimated time to finish: 0:00:06.795972.
[4A[J     total [###########################################.......] 86.67%
this epoch [################..................................] 33.33%
     10400 iter, 17 epoch / 20 epochs
    250.12 iters/sec. Estimated time to finish: 0:00:06.397005.
[4A[J     total [###########################################.......] 87.50%
this epoch [#########################.........................] 50.00%
     10500 iter, 17 epoch / 20 epochs
    250.15 iters/sec. Estimated time to finish: 0:00:05.996337.
[4A[J     total [############################################......] 88.33%
this epoch [#################################.................] 66.67%
     10600 iter, 17 epoch / 20 epochs
    251.26 iters/sec. Estimated time to finish: 0:00:05.571862.
[4A[J     total [############################################......] 89.17%
this epoch [#########################################.........] 83.33%
     10700 iter, 17 epoch / 20 epochs
    251.44 iters/sec. Estimated time to finish: 0:00:05.170228.
[4A[J18          0.952383       0.9532                    
[J     total [#############################################.....] 90.00%
this epoch [..................................................]  0.00%
     10800 iter, 18 epoch / 20 epochs
    250.63 iters/sec. Estimated time to finish: 0:00:04.787898.
[4A[J     total [#############################################.....] 90.83%
this epoch [########..........................................] 16.67%
     10900 iter, 18 epoch / 20 epochs
    250.76 iters/sec. Estimated time to finish: 0:00:04.386683.
[4A[J     total [#############################################.....] 91.67%
this epoch [################..................................] 33.33%
     11000 iter, 18 epoch / 20 epochs
     250.8 iters/sec. Estimated time to finish: 0:00:03.987294.
[4A[J     total [##############################################....] 92.50%
this epoch [#########################.........................] 50.00%
     11100 iter, 18 epoch / 20 epochs
    250.85 iters/sec. Estimated time to finish: 0:00:03.587843.
[4A[J     total [##############################################....] 93.33%
this epoch [#################################.................] 66.67%
     11200 iter, 18 epoch / 20 epochs
    251.83 iters/sec. Estimated time to finish: 0:00:03.176797.
[4A[J     total [###############################################...] 94.17%
this epoch [#########################################.........] 83.33%
     11300 iter, 18 epoch / 20 epochs
       252 iters/sec. Estimated time to finish: 0:00:02.777783.
[4A[J19          0.953817       0.953                     
[J     total [###############################################...] 95.00%
this epoch [..................................................]  0.00%
     11400 iter, 19 epoch / 20 epochs
    251.32 iters/sec. Estimated time to finish: 0:00:02.387425.
[4A[J     total [###############################################...] 95.83%
this epoch [########..........................................] 16.67%
     11500 iter, 19 epoch / 20 epochs
    251.59 iters/sec. Estimated time to finish: 0:00:01.987384.
[4A[J     total [################################################..] 96.67%
this epoch [################..................................] 33.33%
     11600 iter, 19 epoch / 20 epochs
    251.86 iters/sec. Estimated time to finish: 0:00:01.588182.
[4A[J     total [################################################..] 97.50%
this epoch [#########################.........................] 50.00%
     11700 iter, 19 epoch / 20 epochs
    252.12 iters/sec. Estimated time to finish: 0:00:01.189929.
[4A[J     total [#################################################.] 98.33%
this epoch [#################################.................] 66.67%
     11800 iter, 19 epoch / 20 epochs
    253.16 iters/sec. Estimated time to finish: 0:00:00.790023.
[4A[J     total [#################################################.] 99.17%
this epoch [#########################################.........] 83.33%
     11900 iter, 19 epoch / 20 epochs
     253.1 iters/sec. Estimated time to finish: 0:00:00.395094.
[4A[J20          0.95535        0.9551                    
[J     total [##################################################] 100.00%
this epoch [..................................................]  0.00%
     12000 iter, 20 epoch / 20 epochs
    252.37 iters/sec. Estimated time to finish: 0:00:00.
[4A[J

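These extensions perform the following tasks:

  • Evaluator evaluates the current model on the test dataset at the end of every epoch. It switches to test mode automatically, so we do not need to do anything special for functions that behave differently in training and test modes (for example, dropout() and BatchNormalization).

  • LogReport accumulates the reported values and writes them to a log file in the output directory.

  • PrintReport prints the selected items from the LogReport.

  • ProgressBar displays a progress bar.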
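
The log file written by LogReport can also be inspected after training. The following is a minimal sketch, assuming LogReport uses its default settings and writes a JSON list (one dict per reporting interval) to a file named `log` in the output directory. The key names `main/accuracy` and `validation/main/accuracy` are assumed to match the columns printed by PrintReport. The helpers `load_log` and `best_epoch` are hypothetical names introduced here for illustration, and the sample entries are copied from the last two rows of the printed output so that the snippet runs without Chainer.

```python
import json
import os
import tempfile


def load_log(path):
    """Read a LogReport-style JSON log: a list with one dict per reporting interval."""
    with open(path) as f:
        return json.load(f)


def best_epoch(entries, key="validation/main/accuracy"):
    """Return the entry with the highest value for `key`, skipping entries that lack it."""
    scored = [e for e in entries if key in e]
    return max(scored, key=lambda e: e[key])


# Stand-in for the real log file (assumed to be <out>/log), filled with the
# last two rows of the printed training output.
sample = [
    {"epoch": 19, "main/accuracy": 0.953817, "validation/main/accuracy": 0.953},
    {"epoch": 20, "main/accuracy": 0.95535, "validation/main/accuracy": 0.9551},
]
log_path = os.path.join(tempfile.mkdtemp(), "log")
with open(log_path, "w") as f:
    json.dump(sample, f)

entries = load_log(log_path)
best = best_epoch(entries)
print(best["epoch"], best["validation/main/accuracy"])
```

With a real training run, `log_path` would instead point at the log file inside the Trainer's output directory.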

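Many more extensions are implemented in the chainer.training.extensions module. The most important of these is snapshot(), which saves a snapshot of the training procedure (that is, the Trainer object) to a file in the output directory.

The example code in the examples/mnist directory also includes GPU support, although its essential part is the same as the code in this tutorial. We will review how to use GPUs in later sections.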


Reposted from: https://bennix.github.io/blog/2017/12/14/chain_basic/
