Introduction to Chainer
Posted by 徐志平 on December 14, 2017
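This is the first part of the Chainer tutorial. In this part, you will learn the following:
- The pros and cons of existing frameworks and why we are developing Chainer
- A simple example of forward and backward computation
- How to use links and compute their gradients
- How to construct chains (i.e. the "model" in most frameworks)
- Parameter optimization
- Serialization of links and optimizers
After reading this part, you will be able to:
- Compute gradients of some arithmetic expressions
- Write a multi-layer perceptron with Chainer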
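Core Concept
As mentioned on the front page, Chainer is a flexible framework for neural networks. Our main goal is flexibility, so that complex architectures can be written simply and intuitively.
Most existing deep learning frameworks follow the "Define-and-Run" scheme. That is, a network is first defined and fixed, and then the user repeatedly feeds it mini-batches for training. Because the network is statically defined before any forward or backward computation, all of its logic must be embedded into the network architecture as data. Consequently, in frameworks such as Caffe, the network architecture is defined declaratively. (Note: a static network definition can still be produced with imperative code, e.g. with torch.nn, Theano-based frameworks, and TensorFlow.)
Define-by-Run
In contrast, Chainer adopts a "Define-by-Run" scheme: the network is defined on the fly by the actual forward computation. More precisely, Chainer stores the history of computation rather than the programming logic. This strategy lets us take full advantage of the programming logic available in Python. For example, Chainer needs no magic to introduce conditionals and loops into network definitions. Define-by-Run is the core concept of Chainer, and this tutorial shows how to define networks dynamically.
This strategy also makes multi-GPU parallelization easy to write, since the logic stays close to network manipulation. We will review such amenities in later sections of this tutorial.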
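To make the Define-by-Run idea concrete, here is a minimal framework-free sketch in plain Python. It is not how Chainer is implemented; the `Var` class and the `forward` helper are hypothetical names invented for this illustration. Each arithmetic operation records how gradients flow back to its inputs, so an ordinary Python loop decides the shape of the graph at run time.

```python
class Var:
    """Hypothetical scalar variable that records its computation history."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        # Each parent entry is (parent_var, local_derivative).
        self.parents = parents

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self, upstream=1.0):
        # Accumulate the gradient, then pass it along every recorded edge.
        self.grad += upstream
        for parent, local in self.parents:
            parent.backward(upstream * local)


def forward(x, n_steps):
    # The graph is built while this ordinary Python loop runs.
    h = x
    for _ in range(n_steps):
        h = h * x
    return h


x = Var(3.0)
y = forward(x, 2)       # y = x ** 3, decided at run time by n_steps
y.backward()
print(y.value, x.grad)  # 27.0 27.0
```

Changing `n_steps` changes the recorded graph without any separate network declaration, which is the point of the scheme.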
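Chainer represents a network as an execution path on a computational graph. A computational graph is a series of function applications, so it can be described with multiple Function objects. When such a function is a neural network layer, its parameters are updated through training. The function therefore needs to keep trainable parameters internally, so Chainer provides the Link class, whose objects can hold trainable parameters. The parameters of the function performed inside a Link object are represented as Variable objects. In short, the difference between Link and Function is whether the object holds trainable parameters. A neural network model is typically described as a series of Link and Function objects.
You build a computational graph by dynamically "chaining" various kinds of Link and Function objects to define a Chain. In the framework, the network is defined by running the chained graph, hence the name Chainer.
In the example code of this tutorial, we assume for simplicity that the following imports have already been executed: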
import numpy as np
import chainer
from chainer import cuda, Function, gradient_check, report, training, utils, Variable
from chainer import datasets, iterators, optimizers, serializers
from chainer import Link, Chain, ChainList
import chainer.functions as F
import chainer.links as L
from chainer.training import extensions
These imports appear widely in Chainer code and examples, so we omit them in the rest of this tutorial.
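Forward/Backward Computation
As described above, Chainer uses the Define-by-Run scheme, so forward computation itself defines the network. To start forward computation, we have to wrap the input array in a Variable object. Here we start with a simple ndarray that has only one element: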
x_data = np.array([5], dtype=np.float32)
x = Variable(x_data)
A Variable object supports basic arithmetic operators. To compute $y = x^2 - 2x + 1$, just write:
y = x**2 - 2 * x + 1
The resulting y is also a Variable object, whose value can be extracted by accessing the data attribute:
y.data
array([ 16.], dtype=float32)
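y holds more than the resulting value. It also keeps the history of computation (i.e. the computational graph), which makes it possible to compute its derivative. This is done by calling its backward() method: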
y.backward()
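This runs error backpropagation (also known as backprop or reverse-mode automatic differentiation). The gradient is then computed and stored in the grad attribute of the input variable x: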
x.grad
array([ 8.], dtype=float32)
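The value 8 agrees with the analytic derivative dy/dx = 2x - 2 evaluated at x = 5. As a sanity check that does not depend on Chainer, the following plain-Python sketch compares the analytic derivative with a central finite difference; the helper names are hypothetical.

```python
def f(x):
    return x ** 2 - 2 * x + 1


def analytic_grad(x):
    # d/dx (x^2 - 2x + 1) = 2x - 2
    return 2 * x - 2


def numerical_grad(func, x, eps=1e-6):
    # Central difference approximation of the derivative.
    return (func(x + eps) - func(x - eps)) / (2 * eps)


x0 = 5.0
print(f(x0))              # 16.0
print(analytic_grad(x0))  # 8.0
print(abs(numerical_grad(f, x0) - analytic_grad(x0)) < 1e-4)  # True
```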
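We can also compute gradients of intermediate variables. Note that, by default, Chainer releases the gradient arrays of intermediate variables for memory efficiency. To preserve the gradient information, pass the retain_grad argument to the backward method: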
z = 2*x
y = x**2 - z + 1
y.backward(retain_grad=True)
z.grad
array([-1.], dtype=float32)
Otherwise, z.grad will be None, as follows:
z = 2*x
y = x**2 - z + 1
y.backward()
z.grad
z.grad is None
True
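All these computations generalize easily to multi-element array inputs. Note that if we want to start backward computation from a variable holding a multi-element array, we must set the initial error manually. When the size of a variable (i.e. the number of elements in its array) is 1, it is treated as a variable representing a loss value, so its grad attribute is automatically filled with 1. When the size is larger than 1, the grad attribute remains None, and the initial error must be set explicitly before running backward(). This is done simply by setting the grad attribute of the output variable: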
x = Variable(np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32))
y = x**2 - 2*x + 1
y.grad = np.ones((2, 3), dtype=np.float32)
y.backward()
x.grad
array([[ 0., 2., 4.],
[ 6., 8., 10.]], dtype=float32)
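The result follows from the chain rule applied element by element: each entry of x.grad is the corresponding entry of y.grad multiplied by dy/dx = 2x - 2. A plain-Python sketch with nested lists, used here only to check the arithmetic by hand:

```python
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
y_grad = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]  # the initial error set on y

# Chain rule, element by element: x.grad = y.grad * dy/dx, with dy/dx = 2x - 2.
x_grad = [[gy * (2 * xi - 2) for xi, gy in zip(x_row, gy_row)]
          for x_row, gy_row in zip(x, y_grad)]

print(x_grad)  # [[0.0, 2.0, 4.0], [6.0, 8.0, 10.0]]
```

With an initial error other than all ones, every entry would simply be scaled by the matching entry of y_grad.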
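Many functions that take Variable objects are defined in the functions module. You can combine them to realize complicated functions with automatic backward computation.
Links
To write neural networks, we have to combine functions with parameters and optimize the parameters. Links serve this purpose. A Link is an object that holds parameters (i.e. optimization targets).
The most fundamental links behave like regular functions, except that some of their arguments are replaced by parameters. We will introduce higher-level links later, but for now think of a link as simply a function with parameters.
One of the most frequently used links is the Linear link (also known as a fully-connected layer or affine transformation). It represents the mathematical function $f(x) = Wx + b$, where the matrix W and the vector b are parameters. This link corresponds to the function linear(), which accepts x, W, and b as arguments. A Linear link from three-dimensional space to two-dimensional space is defined by the following line: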
f = L.Linear(3, 2)
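Most functions and links accept only mini-batch input, where the first dimension of the input array is treated as the batch dimension. In the Linear link above, the input must have shape (N, 3), where N is the mini-batch size.
The parameters of a link are stored as attributes. Each parameter is an instance of Variable. The Linear link stores two parameters, W and b. By default, the matrix W is initialized randomly, while the vector b is initialized with zeros.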
f.W.data
array([[ 0.19792122, 0.29951876, -0.31833425],
[-0.59501284, -0.65519476, -0.00605371]], dtype=float32)
f.b.data
array([ 0., 0.], dtype=float32)
An instance of the Linear link acts like a usual function:
x = Variable(np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32))
y = f(x)
y.data
array([[-0.15804404, -1.9235636 ],
[ 0.37927318, -5.69234705]], dtype=float32)
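To see what the link computes, the sketch below reproduces the affine map by hand in plain Python, using the W and b values printed above. It assumes the mini-batch convention y = x W^T + b (one row of x per example); the `linear` helper is a hypothetical name, not the Chainer function.

```python
W = [[0.19792122, 0.29951876, -0.31833425],
     [-0.59501284, -0.65519476, -0.00605371]]
b = [0.0, 0.0]
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # mini-batch of shape (N, 3), N = 2


def linear(x_batch, W, b):
    # y[n][j] = sum_i x[n][i] * W[j][i] + b[j], i.e. y = x W^T + b
    return [[sum(xi * wi for xi, wi in zip(row, w_row)) + bj
             for w_row, bj in zip(W, b)]
            for row in x_batch]


y = linear(x, W, b)
for row in y:
    print([round(v, 4) for v in row])
```

Up to float32 rounding, the rows match the y.data values shown above.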
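Sometimes it is cumbersome to compute the dimension of the input space. The Linear link and some of the (de)convolution links can omit the input dimension at instantiation and infer it from the first mini-batch.
For example, the following line creates a Linear link whose output dimension is two: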
g = L.Linear(2)
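If we feed a mini-batch of shape (N, M), the input dimension is inferred as M, which means g.W will be a 2 x M matrix. Note that its parameters are initialized lazily, at the first mini-batch. Therefore, g does not have a usable W until data has been passed through the link.
Gradients of parameters are computed by the backward() method. Note that gradients are accumulated by the method rather than overwritten, so you must clear the gradients first to renew the computation. This is done by calling the cleargrads() method.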
x = Variable(np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32))
g = L.Linear(2)
p = g(x)
p
variable([[-2.64461255, 2.90179563],
[-6.81166267, 4.94405651]])
g.cleargrads()
p.grad = np.ones((2, 2), dtype=np.float32)  # initial error is set on the output, not on the link
p.backward()
g.W.grad
g.b.grad
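The accumulation behaviour can be illustrated without Chainer. The sketch below hand-codes the gradients of an affine map y = x W^T + b for an all-ones upstream gradient; the `backward` and `cleargrads` functions are hypothetical stand-ins that only mimic the accumulate-then-clear pattern described above.

```python
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # mini-batch, shape (2, 3)
gy = [[1.0, 1.0], [1.0, 1.0]]            # upstream gradient, shape (2, 2)

W_grad = [[0.0] * 3 for _ in range(2)]
b_grad = [0.0] * 2


def backward():
    # Adds this pass's contribution to the existing gradient buffers.
    for x_row, gy_row in zip(x, gy):
        for j, g in enumerate(gy_row):
            b_grad[j] += g
            for i, xi in enumerate(x_row):
                W_grad[j][i] += g * xi


def cleargrads():
    for j in range(2):
        b_grad[j] = 0.0
        for i in range(3):
            W_grad[j][i] = 0.0


backward()
print(W_grad, b_grad)  # [[5.0, 7.0, 9.0], [5.0, 7.0, 9.0]] [2.0, 2.0]
backward()             # without clearing, the gradients double
print(W_grad, b_grad)  # [[10.0, 14.0, 18.0], [10.0, 14.0, 18.0]] [4.0, 4.0]
cleargrads()
backward()
print(W_grad, b_grad)  # [[5.0, 7.0, 9.0], [5.0, 7.0, 9.0]] [2.0, 2.0]
```

Note that these gradients depend only on x and the upstream gradient, not on the randomly initialized W.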
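Write a Model as a Chain
Most neural network architectures contain multiple links. For example, a multi-layer perceptron consists of multiple linear layers. We can write complex procedures with trainable parameters by combining multiple links: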
l1 = L.Linear(4, 3)
l2 = L.Linear(3, 2)
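def my_forward(x):
    h = l1(x)
    return l2(h)
Here L refers to the links module. A procedure whose parameters are defined in this way is hard to reuse. A more Pythonic way is to combine the links and the procedure into a class:
class MyProc(object):
    def __init__(self):
        self.l1 = L.Linear(4, 3)
        self.l2 = L.Linear(3, 2)

    def forward(self, x):
        h = self.l1(x)
        return self.l2(h)
To make it more reusable, we want support for parameter management, CPU/GPU migration, robust and flexible save/load features, and so on. All of these are supported by the Chain class in Chainer. All we have to do is define the class above as a subclass of Chain: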
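class MyChain(Chain):
    def __init__(self):
        super(MyChain, self).__init__()
        with self.init_scope():
            self.l1 = L.Linear(4, 3)
            self.l2 = L.Linear(3, 2)

    def __call__(self, x):
        h = self.l1(x)
        return self.l2(h)
This shows how a complex chain is constructed from simpler links. Links such as l1 and l2 are called child links of MyChain. Note that Chain itself inherits from Link, which means we can define more complex chains that hold MyChain objects as their child links.
We often define the forward computation of a link with the __call__ operator. Such links and chains are callable and behave like regular functions of Variables.
Another way to define a chain is to use the ChainList class, which behaves like a list of links: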
class MyChain2(ChainList):
    def __init__(self):
        super(MyChain2, self).__init__(
            L.Linear(4, 3),
            L.Linear(3, 2),
        )

    def __call__(self, x):
        h = self[0](x)
        return self[1](h)
ChainList is convenient when using an arbitrary number of links, but if the number of links is fixed, as in the case above, the Chain class is recommended as the base class.
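The reason a chain can manage parameters for a whole model is that it is itself a link that owns child links, so parameters can be collected recursively. The following plain-Python sketch shows that composite pattern only; `TinyLink`, `TinyChain`, and the path format are hypothetical and do not reproduce Chainer's classes.

```python
class TinyLink:
    """Hypothetical stand-in for a link: it owns named parameters."""

    def __init__(self, **params):
        self._params = dict(params)

    def namedparams(self, prefix=""):
        for name, value in self._params.items():
            yield prefix + "/" + name, value


class TinyChain(TinyLink):
    """Hypothetical stand-in for a chain: it also owns child links."""

    def __init__(self, **children):
        super().__init__()
        self._children = dict(children)

    def namedparams(self, prefix=""):
        yield from super().namedparams(prefix)
        for name, child in self._children.items():
            yield from child.namedparams(prefix + "/" + name)


inner = TinyChain(l1=TinyLink(W=[[0.1]], b=[0.0]),
                  l2=TinyLink(W=[[0.2]], b=[0.0]))
outer = TinyChain(predictor=inner)   # a chain can itself be a child link

names = [name for name, _ in outer.namedparams()]
print(names)
# ['/predictor/l1/W', '/predictor/l1/b', '/predictor/l2/W', '/predictor/l2/b']
```

Anything that needs every parameter, such as an optimizer or a serializer, can then walk a single top-level object.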
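Optimizer
To obtain good parameter values, we have to optimize them with the Optimizer class, which runs a numerical optimization algorithm on a given link. Many algorithms are implemented in the optimizers module. Here we use the simplest one, stochastic gradient descent (SGD):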
model = MyChain()
optimizer = optimizers.SGD()
optimizer.setup(model)
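The setup() method prepares the optimizer for the given link.
Some parameter/gradient manipulations, e.g. weight decay and gradient clipping, can be done by setting hook functions on the optimizer. Hook functions are called after the gradient computation and right before the actual parameter update. For example, we can set weight decay regularization by running the following line beforehand: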
optimizer.add_hook(chainer.optimizer.WeightDecay(0.0005))
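Of course, you can write your own hook functions. A hook should be a function or a callable object that takes the optimizer as its argument.
There are two ways to use the optimizer. One is to use it through a Trainer, which we will see in the following sections. The other is to use it directly, which we review here. If you only want to use the optimizer in the simple way, skip the rest of this section and go to the next one.
There are in turn two ways to use the optimizer directly. One is to compute the gradients manually and then call the update() method with no arguments. Do not forget to clear the gradients beforehand!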
x = np.random.uniform(-1, 1, (2, 4)).astype('f')
model.cleargrads()
# compute gradient here...
loss = F.sum(model(chainer.Variable(x)))
loss.backward()
optimizer.update()
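The other way is to pass a loss function to the update() method. In this case, cleargrads() is called automatically by update(), so the user does not have to call it manually.
def lossfun(arg1, arg2):
    # calculate loss
    loss = F.sum(model(arg1 - arg2))
    return loss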
arg1 = np.random.uniform(-1, 1, (2, 4)).astype('f')
arg2 = np.random.uniform(-1, 1, (2, 4)).astype('f')
optimizer.update(lossfun, chainer.Variable(arg1), chainer.Variable(arg2))
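As a rough picture of what one update does, the sketch below applies a weight decay hook and then a plain SGD step to a list of scalar parameters. It is a hand-written illustration, not the optimizer's code: the function names are hypothetical, the learning rate of 0.01 is an assumed value, and the gradients are made up.

```python
def weight_decay_hook(params, grads, rate):
    # Adds rate * w to each gradient, which penalises large weights.
    for i, w in enumerate(params):
        grads[i] += rate * w


def sgd_update(params, grads, lr):
    # Plain stochastic gradient descent: w <- w - lr * grad
    for i, g in enumerate(grads):
        params[i] -= lr * g


params = [1.0, -2.0]
grads = [0.5, 0.5]            # pretend these came from loss.backward()

weight_decay_hook(params, grads, rate=0.0005)   # hook runs after backward...
sgd_update(params, grads, lr=0.01)              # ...and before the update
print(params)
```

The order matters: the hook modifies the gradients first, so the update sees the regularized values.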
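Trainer
When we want to train a neural network, we have to run a training loop that updates the parameters many times. A typical training loop consists of the following procedures:
- Iterating over the training dataset
- Preprocessing to extract mini-batches
- Forward/backward computation of the neural network
- Parameter updates
- Evaluation of the current parameters on a validation dataset
- Logging and printing of intermediate results
Chainer provides a simple yet powerful way to write such training processes. The training loop abstraction mainly consists of two components:
- Dataset abstraction. It implements items 1 and 2 of the list above. The core components are defined in the dataset module, and many implementations of datasets and iterators are available in the datasets and iterators modules.
- Trainer. It implements items 3, 4, 5, and 6 of the list above. The whole procedure is run by a Trainer. The way parameters are updated (items 3 and 4) is defined by an Updater, which can be freely customized. Items 5 and 6 are implemented by instances of Extension, which append extra procedures to the training loop. Users can freely customize the training procedure by adding extensions, and can also implement their own extensions.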
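For comparison, the six procedures of a typical training loop can be written out by hand. The sketch below is a toy, framework-free loop in plain Python that fits a single weight w to made-up data for y = 2x; every name and number in it is an assumption chosen for illustration, and it is not what Trainer does internally.

```python
import random

random.seed(0)
# Toy data for y = 2x; both splits are made up for illustration.
train = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0)]
valid = [(x, 2.0 * x) for x in (0.75, 1.25)]

w, lr, batch_size = 0.0, 0.05, 2

for epoch in range(1, 21):
    random.shuffle(train)                          # iterate over the dataset
    for start in range(0, len(train), batch_size):
        batch = train[start:start + batch_size]    # extract a mini-batch
        # Forward and backward pass for the mean squared error of w * x.
        grad = sum(2 * (w * x - t) * x for x, t in batch) / len(batch)
        w -= lr * grad                             # update the parameter
    # Evaluate the current parameter on the validation set.
    val_loss = sum((w * x - t) ** 2 for x, t in valid) / len(valid)
    if epoch % 5 == 0:
        print(epoch, round(w, 4), round(val_loss, 6))  # log the progress
```

Each commented step is the kind of work that the dataset abstraction, the Updater, and the extensions take over.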
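Serializer
Before moving on to the first example, we introduce the Serializer, the last core feature described on this page. A serializer is a simple interface for serializing or deserializing an object. Link, Optimizer, and Trainer all support serialization.
Concrete serializers are defined in the serializers module, which supports the NumPy NPZ and HDF5 formats.
For example, we can serialize a link object into an NPZ file with the serializers.save_npz() function: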
serializers.save_npz('my.model', model)
This saves the parameters of model to the file 'my.model' in NPZ format. The saved model can be read back with the serializers.load_npz() function:
serializers.load_npz('my.model', model)
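Note that only the parameters and the persistent values are serialized by this code. Other attributes are not saved automatically. You can register arrays, scalars, or any serializable objects as persistent values with the Link.add_persistent() method. A registered value can then be accessed as an attribute with the name passed to add_persistent.
The state of an optimizer can be saved with the same functions: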
serializers.save_npz('my.state', optimizer)
serializers.load_npz('my.state', optimizer)
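Note that serializing an optimizer saves only its internal state, including the iteration count, the momentum vectors of MomentumSGD, and so on. It does not save the parameters and persistent values of the target link. To resume optimization from a saved state, we have to save the target link explicitly along with the optimizer.
The HDF5 format is supported if the h5py package is installed. Serialization and deserialization in HDF5 format are almost identical to those in NPZ format; just replace save_npz() and load_npz() with save_hdf5() and load_hdf5(), respectively.
Example: Multi-layer Perceptron on MNIST
Now you can solve a multiclass classification task with a multi-layer perceptron (MLP). We use the hand-written digit dataset called MNIST, one of the long-standing de facto "hello world" examples of machine learning. This MNIST example is also available in the examples/mnist directory of the official repository. In this section, we show how to use a Trainer to construct and run the training loop.
We first have to prepare the MNIST dataset. It consists of 70,000 grayscale images of size 28 x 28 (i.e. 784 pixels) and the corresponding digit labels. By default, the dataset is split into 60,000 training images and 10,000 test images. We can obtain the vectorized version (i.e. a set of 784-dimensional vectors) with datasets.get_mnist().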
train, test = datasets.get_mnist()
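This code automatically downloads the MNIST dataset and saves the NumPy arrays to the $(HOME)/.chainer directory. The returned train and test sets can be seen as lists of image-label pairs (strictly speaking, they are instances of TupleDataset).
We also have to define how to iterate over these datasets. We want to shuffle the training dataset at the beginning of every epoch, i.e. every sweep over the dataset. In this case, we can use iterators.SerialIterator.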
train_iter = iterators.SerialIterator(train, batch_size=100, shuffle=True)
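On the other hand, we do not have to shuffle the test dataset, so we can disable shuffling with shuffle=False. This makes iteration faster when the underlying dataset supports fast slicing.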
test_iter = iterators.SerialIterator(test, batch_size=100, repeat=False, shuffle=False)
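Setting repeat=False stops the iteration once all examples have been visited. This option is usually required for test/validation datasets; without it, the iteration enters an infinite loop.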
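The roles of shuffle and repeat can be sketched with a small generator in plain Python. This is a simplified illustration under assumed semantics, not the SerialIterator implementation: `serial_batches` is a hypothetical helper, and details such as how the last batch of an epoch is handled may differ from the real class.

```python
import random


def serial_batches(dataset, batch_size, repeat=True, shuffle=True, seed=0):
    """Hypothetical generator that yields mini-batches epoch after epoch."""
    rng = random.Random(seed)
    while True:
        order = list(range(len(dataset)))
        if shuffle:
            rng.shuffle(order)          # reshuffle at the start of every epoch
        for start in range(0, len(order), batch_size):
            yield [dataset[i] for i in order[start:start + batch_size]]
        if not repeat:
            return                      # one pass only, then stop


data = list(range(10))

# Evaluation style: fixed order, a single pass, so the loop terminates.
test_batches = list(serial_batches(data, 4, repeat=False, shuffle=False))
print(test_batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

# Training style: repeats forever, so the caller decides when to stop.
train_batches = serial_batches(data, 4, repeat=True, shuffle=True)
first_epoch = [next(train_batches) for _ in range(3)]
print(sorted(x for batch in first_epoch for x in batch) == data)  # True
```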
Next, we define the architecture. We use a simple three-layer network with 100 units per layer.
class MLP(Chain):
    def __init__(self, n_units, n_out):
        super(MLP, self).__init__()
        with self.init_scope():
            # the size of the inputs to each layer will be inferred
            self.l1 = L.Linear(None, n_units)  # n_in -> n_units
            self.l2 = L.Linear(None, n_units)  # n_units -> n_units
            self.l3 = L.Linear(None, n_out)    # n_units -> n_out

    def __call__(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        y = self.l3(h2)
        return y
This link uses relu() as the activation function. Note that the l3 link is the final fully-connected layer, whose output corresponds to the scores for the ten digits.
To compute loss values and evaluate the accuracy of predictions, we define a classifier chain on top of the MLP chain above:
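class Classifier(Chain):
    def __init__(self, predictor):
        super(Classifier, self).__init__()
        with self.init_scope():
            self.predictor = predictor

    def __call__(self, x, t):
        y = self.predictor(x)
        loss = F.softmax_cross_entropy(y, t)
        accuracy = F.accuracy(y, t)
        report({'loss': loss, 'accuracy': accuracy}, self)
        return loss
This Classifier class computes accuracy and loss, and returns the loss value. The pair of arguments x and t corresponds to each example in the datasets (a tuple of an image and a label). softmax_cross_entropy() computes the loss value given the predictions and the ground truth labels. accuracy() computes the prediction accuracy. We can set an arbitrary predictor link in an instance of the classifier.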
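To clarify what these two quantities measure, here is a plain-Python sketch of softmax cross entropy and accuracy on a tiny hand-made batch. It assumes the loss is averaged over the mini-batch; the functions are written from the textbook definitions and are not the Chainer implementations, and the scores and labels are invented.

```python
import math


def softmax_cross_entropy(scores, labels):
    # Mean over the batch of -log softmax(scores)[label].
    total = 0.0
    for row, t in zip(scores, labels):
        m = max(row)                                   # for numerical stability
        log_z = m + math.log(sum(math.exp(s - m) for s in row))
        total += log_z - row[t]
    return total / len(labels)


def accuracy(scores, labels):
    # Fraction of examples whose highest score is at the true label.
    hits = sum(1 for row, t in zip(scores, labels)
               if max(range(len(row)), key=row.__getitem__) == t)
    return hits / len(labels)


scores = [[2.0, 0.5, -1.0],   # predicted class 0
          [0.1, 0.2, 3.0],    # predicted class 2
          [1.0, 1.5, 0.0]]    # predicted class 1
labels = [0, 2, 0]

print(round(softmax_cross_entropy(scores, labels), 4))
print(accuracy(scores, labels))  # 0.6666666666666666
```

The loss rewards putting a high score on the true class, while accuracy only checks whether the top score lands on it.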
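The report() function reports the loss and accuracy values to the trainer. For the specific mechanism of collecting training statistics, see Reporter. You can collect other types of observations, such as activation statistics, in a similar way.
Note that a class similar to the Classifier above is already defined as chainer.links.Classifier. So, instead of using the example above, we will use this predefined Classifier chain.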
model = L.Classifier(MLP(100, 10)) # the input size, 784, is inferred
optimizer = optimizers.SGD()
optimizer.setup(model)
Now we can build a trainer object.
updater = training.StandardUpdater(train_iter, optimizer)
trainer = training.Trainer(updater, (20, 'epoch'), out='result')
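The second argument, (20, 'epoch'), represents the duration of training. We can use either epoch or iteration as the unit. In this case, we train the multi-layer perceptron by iterating over the training set 20 times.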
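The relation between the two units is simple arithmetic. Assuming the 60,000 training images and the batch size of 100 used above, the sketch below works out how many iterations 20 epochs correspond to:

```python
n_train = 60000        # MNIST training images
batch_size = 100
n_epochs = 20

iters_per_epoch = n_train // batch_size
total_iters = iters_per_epoch * n_epochs
print(iters_per_epoch, total_iters)  # 600 12000
```

Under these settings, a duration expressed in iterations would therefore need 12000 iterations to cover the same amount of training.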
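To invoke the training loop, we just call the run() method, which executes the whole training sequence.
The code above only optimizes the parameters. In most cases we also want to see how training is progressing, which we can do by attaching extensions before calling the run method.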
trainer.extend(extensions.Evaluator(test_iter, model))
trainer.extend(extensions.LogReport())
trainer.extend(extensions.PrintReport(['epoch', 'main/accuracy', 'validation/main/accuracy']))
trainer.extend(extensions.ProgressBar())
trainer.run()
epoch main/accuracy validation/main/accuracy
[J total [..................................................] 0.83%
this epoch [########..........................................] 16.67%
100 iter, 0 epoch / 20 epochs
inf iters/sec. Estimated time to finish: 0:00:00.
[4A[J total [..................................................] 1.67%
this epoch [################..................................] 33.33%
200 iter, 0 epoch / 20 epochs
270.19 iters/sec. Estimated time to finish: 0:00:43.672168.
[4A[J total [#.................................................] 2.50%
this epoch [#########################.........................] 50.00%
300 iter, 0 epoch / 20 epochs
271.99 iters/sec. Estimated time to finish: 0:00:43.017048.
[4A[J total [#.................................................] 3.33%
this epoch [#################################.................] 66.67%
400 iter, 0 epoch / 20 epochs
274.82 iters/sec. Estimated time to finish: 0:00:42.209075.
[4A[J total [##................................................] 4.17%
this epoch [#########################################.........] 83.33%
500 iter, 0 epoch / 20 epochs
275.19 iters/sec. Estimated time to finish: 0:00:41.789476.
[4A[J1 0.6581 0.8475
[J total [##................................................] 5.00%
this epoch [..................................................] 0.00%
600 iter, 1 epoch / 20 epochs
250.26 iters/sec. Estimated time to finish: 0:00:45.553447.
[4A[J total [##................................................] 5.83%
this epoch [########..........................................] 16.67%
700 iter, 1 epoch / 20 epochs
251.78 iters/sec. Estimated time to finish: 0:00:44.879872.
[4A[J total [###...............................................] 6.67%
this epoch [################..................................] 33.33%
800 iter, 1 epoch / 20 epochs
253.07 iters/sec. Estimated time to finish: 0:00:44.257362.
[4A[J total [###...............................................] 7.50%
this epoch [#########################.........................] 50.00%
900 iter, 1 epoch / 20 epochs
253.97 iters/sec. Estimated time to finish: 0:00:43.706513.
[4A[J total [####..............................................] 8.33%
this epoch [#################################.................] 66.67%
1000 iter, 1 epoch / 20 epochs
255.94 iters/sec. Estimated time to finish: 0:00:42.979372.
[4A[J total [####..............................................] 9.17%
this epoch [#########################################.........] 83.33%
1100 iter, 1 epoch / 20 epochs
257.61 iters/sec. Estimated time to finish: 0:00:42.311793.
[4A[J2 0.868483 0.8922
[J total [#####.............................................] 10.00%
this epoch [..................................................] 0.00%
1200 iter, 2 epoch / 20 epochs
250.02 iters/sec. Estimated time to finish: 0:00:43.196043.
[4A[J total [#####.............................................] 10.83%
this epoch [########..........................................] 16.67%
1300 iter, 2 epoch / 20 epochs
250.73 iters/sec. Estimated time to finish: 0:00:42.674737.
[4A[J total [#####.............................................] 11.67%
this epoch [################..................................] 33.33%
1400 iter, 2 epoch / 20 epochs
250.76 iters/sec. Estimated time to finish: 0:00:42.271780.
[4A[J total [######............................................] 12.50%
this epoch [#########################.........................] 50.00%
1500 iter, 2 epoch / 20 epochs
250.66 iters/sec. Estimated time to finish: 0:00:41.889907.
[4A[J total [######............................................] 13.33%
this epoch [#################################.................] 66.67%
1600 iter, 2 epoch / 20 epochs
250.63 iters/sec. Estimated time to finish: 0:00:41.494966.
[4A[J total [#######...........................................] 14.17%
this epoch [#########################################.........] 83.33%
1700 iter, 2 epoch / 20 epochs
250.3 iters/sec. Estimated time to finish: 0:00:41.150503.
[4A[J3 0.893583 0.9065
[J total [#######...........................................] 15.00%
this epoch [..................................................] 0.00%
1800 iter, 3 epoch / 20 epochs
245.03 iters/sec. Estimated time to finish: 0:00:41.627412.
[4A[J total [#######...........................................] 15.83%
this epoch [########..........................................] 16.67%
1900 iter, 3 epoch / 20 epochs
246.29 iters/sec. Estimated time to finish: 0:00:41.007745.
[4A[J total [########..........................................] 16.67%
this epoch [################..................................] 33.33%
2000 iter, 3 epoch / 20 epochs
246.63 iters/sec. Estimated time to finish: 0:00:40.547184.
[4A[J total [########..........................................] 17.50%
this epoch [#########################.........................] 50.00%
2100 iter, 3 epoch / 20 epochs
247.22 iters/sec. Estimated time to finish: 0:00:40.045529.
[4A[J total [#########.........................................] 18.33%
this epoch [#################################.................] 66.67%
2200 iter, 3 epoch / 20 epochs
248.21 iters/sec. Estimated time to finish: 0:00:39.482367.
[4A[J total [#########.........................................] 19.17%
this epoch [#########################################.........] 83.33%
2300 iter, 3 epoch / 20 epochs
248.73 iters/sec. Estimated time to finish: 0:00:38.997955.
[4A[J4 0.90485 0.9154
[J total [##########........................................] 20.00%
this epoch [..................................................] 0.00%
2400 iter, 4 epoch / 20 epochs
244.21 iters/sec. Estimated time to finish: 0:00:39.309754.
[4A[J total [##########........................................] 20.83%
this epoch [########..........................................] 16.67%
2500 iter, 4 epoch / 20 epochs
244.55 iters/sec. Estimated time to finish: 0:00:38.847329.
[4A[J total [##########........................................] 21.67%
this epoch [################..................................] 33.33%
2600 iter, 4 epoch / 20 epochs
245.78 iters/sec. Estimated time to finish: 0:00:38.245938.
[4A[J total [###########.......................................] 22.50%
this epoch [#########################.........................] 50.00%
2700 iter, 4 epoch / 20 epochs
246.89 iters/sec. Estimated time to finish: 0:00:37.668330.
[4A[J total [###########.......................................] 23.33%
this epoch [#################################.................] 66.67%
2800 iter, 4 epoch / 20 epochs
247.85 iters/sec. Estimated time to finish: 0:00:37.119132.
[4A[J total [############......................................] 24.17%
this epoch [#########################################.........] 83.33%
2900 iter, 4 epoch / 20 epochs
248.84 iters/sec. Estimated time to finish: 0:00:36.568961.
[4A[J5 0.9128 0.9222
[J total [############......................................] 25.00%
this epoch [..................................................] 0.00%
3000 iter, 5 epoch / 20 epochs
246.32 iters/sec. Estimated time to finish: 0:00:36.537719.
[4A[J total [############......................................] 25.83%
this epoch [########..........................................] 16.67%
3100 iter, 5 epoch / 20 epochs
247.27 iters/sec. Estimated time to finish: 0:00:35.993611.
[4A[J total [#############.....................................] 26.67%
this epoch [################..................................] 33.33%
3200 iter, 5 epoch / 20 epochs
247.64 iters/sec. Estimated time to finish: 0:00:35.535495.
[4A[J total [#############.....................................] 27.50%
this epoch [#########################.........................] 50.00%
3300 iter, 5 epoch / 20 epochs
248.02 iters/sec. Estimated time to finish: 0:00:35.078297.
[4A[J total [##############....................................] 28.33%
this epoch [#################################.................] 66.67%
3400 iter, 5 epoch / 20 epochs
248.3 iters/sec. Estimated time to finish: 0:00:34.635942.
[4A[J total [##############....................................] 29.17%
this epoch [#########################################.........] 83.33%
3500 iter, 5 epoch / 20 epochs
248.35 iters/sec. Estimated time to finish: 0:00:34.225545.
[4A[J6 0.9182 0.9251
[J total [###############...................................] 30.00%
this epoch [..................................................] 0.00%
3600 iter, 6 epoch / 20 epochs
245.49 iters/sec. Estimated time to finish: 0:00:34.217710.
[4A[J total [###############...................................] 30.83%
this epoch [########..........................................] 16.67%
3700 iter, 6 epoch / 20 epochs
245.88 iters/sec. Estimated time to finish: 0:00:33.755860.
[4A[J total [###############...................................] 31.67%
this epoch [################..................................] 33.33%
3800 iter, 6 epoch / 20 epochs
245.9 iters/sec. Estimated time to finish: 0:00:33.346716.
[4A[J total [################..................................] 32.50%
this epoch [#########################.........................] 50.00%
3900 iter, 6 epoch / 20 epochs
245.96 iters/sec. Estimated time to finish: 0:00:32.931534.
[4A[J total [################..................................] 33.33%
this epoch [#################################.................] 66.67%
4000 iter, 6 epoch / 20 epochs
245.99 iters/sec. Estimated time to finish: 0:00:32.521949.
[4A[J total [#################.................................] 34.17%
this epoch [#########################################.........] 83.33%
4100 iter, 6 epoch / 20 epochs
246.12 iters/sec. Estimated time to finish: 0:00:32.098613.
[4A[J7 0.923683 0.9281
[J total [#################.................................] 35.00%
this epoch [..................................................] 0.00%
4200 iter, 7 epoch / 20 epochs
244.37 iters/sec. Estimated time to finish: 0:00:31.918388.
[4A[J total [#################.................................] 35.83%
this epoch [########..........................................] 16.67%
4300 iter, 7 epoch / 20 epochs
244.24 iters/sec. Estimated time to finish: 0:00:31.526645.
[4A[J total [##################................................] 36.67%
this epoch [################..................................] 33.33%
4400 iter, 7 epoch / 20 epochs
244.7 iters/sec. Estimated time to finish: 0:00:31.058855.
[4A[J total [##################................................] 37.50%
this epoch [#########################.........................] 50.00%
4500 iter, 7 epoch / 20 epochs
245.22 iters/sec. Estimated time to finish: 0:00:30.584594.
[4A[J total [###################...............................] 38.33%
this epoch [#################################.................] 66.67%
4600 iter, 7 epoch / 20 epochs
245.84 iters/sec. Estimated time to finish: 0:00:30.100470.
[4A[J total [###################...............................] 39.17%
this epoch [#########################################.........] 83.33%
4700 iter, 7 epoch / 20 epochs
246.3 iters/sec. Estimated time to finish: 0:00:29.638363.
[4A[J8 0.927233 0.9312
[J total [####################..............................] 40.00%
this epoch [..................................................] 0.00%
4800 iter, 8 epoch / 20 epochs
245.02 iters/sec. Estimated time to finish: 0:00:29.385524.
[4A[J total [####################..............................] 40.83%
this epoch [########..........................................] 16.67%
4900 iter, 8 epoch / 20 epochs
245.47 iters/sec. Estimated time to finish: 0:00:28.923795.
[4A[J total [####################..............................] 41.67%
this epoch [################..................................] 33.33%
5000 iter, 8 epoch / 20 epochs
245.91 iters/sec. Estimated time to finish: 0:00:28.465973.
[4A[J total [#####################.............................] 42.50%
this epoch [#########################.........................] 50.00%
5100 iter, 8 epoch / 20 epochs
246.47 iters/sec. Estimated time to finish: 0:00:27.994909.
[4A[J total [#####################.............................] 43.33%
this epoch [#################################.................] 66.67%
5200 iter, 8 epoch / 20 epochs
246.95 iters/sec. Estimated time to finish: 0:00:27.535404.
[4A[J total [######################............................] 44.17%
this epoch [#########################################.........] 83.33%
5300 iter, 8 epoch / 20 epochs
247.33 iters/sec. Estimated time to finish: 0:00:27.089584.
[4A[J9 0.931317 0.9341
[J total [######################............................] 45.00%
this epoch [..................................................] 0.00%
5400 iter, 9 epoch / 20 epochs
245.58 iters/sec. Estimated time to finish: 0:00:26.874639.
[4A[J total [######################............................] 45.83%
this epoch [########..........................................] 16.67%
5500 iter, 9 epoch / 20 epochs
245.87 iters/sec. Estimated time to finish: 0:00:26.437190.
[4A[J total [#######################...........................] 46.67%
this epoch [################..................................] 33.33%
5600 iter, 9 epoch / 20 epochs
246.33 iters/sec. Estimated time to finish: 0:00:25.981189.
[4A[J total [#######################...........................] 47.50%
this epoch [#########################.........................] 50.00%
5700 iter, 9 epoch / 20 epochs
246.78 iters/sec. Estimated time to finish: 0:00:25.528408.
[4A[J total [########################..........................] 48.33%
this epoch [#################################.................] 66.67%
5800 iter, 9 epoch / 20 epochs
247.2 iters/sec. Estimated time to finish: 0:00:25.080847.
[4A[J total [########################..........................] 49.17%
this epoch [#########################################.........] 83.33%
5900 iter, 9 epoch / 20 epochs
247.69 iters/sec. Estimated time to finish: 0:00:24.627826.
[4A[J10 0.934733 0.9369
[J total [#########################.........................] 50.00%
this epoch [..................................................] 0.00%
6000 iter, 10 epoch / 20 epochs
246.59 iters/sec. Estimated time to finish: 0:00:24.332159.
[4A[J total [#########################.........................] 50.83%
this epoch [########..........................................] 16.67%
6100 iter, 10 epoch / 20 epochs
247 iters/sec. Estimated time to finish: 0:00:23.886641.
[4A[J total [#########################.........................] 51.67%
this epoch [################..................................] 33.33%
6200 iter, 10 epoch / 20 epochs
247.36 iters/sec. Estimated time to finish: 0:00:23.448076.
[4A[J total [##########################........................] 52.50%
this epoch [#########################.........................] 50.00%
6300 iter, 10 epoch / 20 epochs
247.73 iters/sec. Estimated time to finish: 0:00:23.008541.
[4A[J total [##########################........................] 53.33%
this epoch [#################################.................] 66.67%
6400 iter, 10 epoch / 20 epochs
248.16 iters/sec. Estimated time to finish: 0:00:22.566452.
[4A[J total [###########################.......................] 54.17%
this epoch [#########################################.........] 83.33%
6500 iter, 10 epoch / 20 epochs
248.61 iters/sec. Estimated time to finish: 0:00:22.123234.
[4A[J11 0.937883 0.9414
[J total [###########################.......................] 55.00%
this epoch [..................................................] 0.00%
6600 iter, 11 epoch / 20 epochs
247.52 iters/sec. Estimated time to finish: 0:00:21.816101.
[4A[J total [###########################.......................] 55.83%
this epoch [########..........................................] 16.67%
6700 iter, 11 epoch / 20 epochs
247.67 iters/sec. Estimated time to finish: 0:00:21.399559.
[4A[J total [############################......................] 56.67%
this epoch [################..................................] 33.33%
6800 iter, 11 epoch / 20 epochs
247.88 iters/sec. Estimated time to finish: 0:00:20.977519.
[4A[J total [############################......................] 57.50%
this epoch [#########################.........................] 50.00%
6900 iter, 11 epoch / 20 epochs
248.13 iters/sec. Estimated time to finish: 0:00:20.553526.
[4A[J total [#############################.....................] 58.33%
this epoch [#################################.................] 66.67%
7000 iter, 11 epoch / 20 epochs
248.28 iters/sec. Estimated time to finish: 0:00:20.138771.
[4A[J total [#############################.....................] 59.17%
this epoch [#########################################.........] 83.33%
7100 iter, 11 epoch / 20 epochs
248.42 iters/sec. Estimated time to finish: 0:00:19.724508.
[4A[J12 0.940583 0.9438
[J total [##############################....................] 60.00%
this epoch [..................................................] 0.00%
7200 iter, 12 epoch / 20 epochs
247.45 iters/sec. Estimated time to finish: 0:00:19.398094.
[4A[J total [##############################....................] 60.83%
this epoch [########..........................................] 16.67%
7300 iter, 12 epoch / 20 epochs
247.79 iters/sec. Estimated time to finish: 0:00:18.967364.
[4A[J total [##############################....................] 61.67%
this epoch [################..................................] 33.33%
7400 iter, 12 epoch / 20 epochs
248.1 iters/sec. Estimated time to finish: 0:00:18.540794.
[4A[J total [###############################...................] 62.50%
this epoch [#########################.........................] 50.00%
7500 iter, 12 epoch / 20 epochs
248.46 iters/sec. Estimated time to finish: 0:00:18.111734.
[4A[J total [###############################...................] 63.33%
this epoch [#################################.................] 66.67%
7600 iter, 12 epoch / 20 epochs
248.77 iters/sec. Estimated time to finish: 0:00:17.687175.
[4A[J total [################################..................] 64.17%
this epoch [#########################################.........] 83.33%
7700 iter, 12 epoch / 20 epochs
249.07 iters/sec. Estimated time to finish: 0:00:17.264007.
[4A[J13 0.942633 0.9451
[J total [################################..................] 65.00%
this epoch [..................................................] 0.00%
7800 iter, 13 epoch / 20 epochs
248.22 iters/sec. Estimated time to finish: 0:00:16.920387.
[4A[J total [################################..................] 65.83%
this epoch [########..........................................] 16.67%
7900 iter, 13 epoch / 20 epochs
248.52 iters/sec. Estimated time to finish: 0:00:16.497482.
[4A[J total [#################################.................] 66.67%
this epoch [################..................................] 33.33%
8000 iter, 13 epoch / 20 epochs
248.86 iters/sec. Estimated time to finish: 0:00:16.073042.
[4A[J total [#################################.................] 67.50%
this epoch [#########################.........................] 50.00%
8100 iter, 13 epoch / 20 epochs
249.2 iters/sec. Estimated time to finish: 0:00:15.649976.
[4A[J total [##################################................] 68.33%
this epoch [#################################.................] 66.67%
8200 iter, 13 epoch / 20 epochs
249.47 iters/sec. Estimated time to finish: 0:00:15.232395.
[4A[J total [##################################................] 69.17%
this epoch [#########################################.........] 83.33%
8300 iter, 13 epoch / 20 epochs
249.72 iters/sec. Estimated time to finish: 0:00:14.816816.
[4A[J14 0.945083 0.9465
[J total [###################################...............] 70.00%
this epoch [..................................................] 0.00%
8400 iter, 14 epoch / 20 epochs
248.89 iters/sec. Estimated time to finish: 0:00:14.463988.
[4A[J total [###################################...............] 70.83%
15 0.947233 0.9495
16 0.949033 0.9496
17 0.9507 0.9526
18 0.952383 0.9532
19 0.953817 0.953
20 0.95535 0.9551
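These extensions perform the following tasks:
- Evaluator evaluates the current model on the test dataset at the end of every epoch. It automatically switches to test mode, so no special handling is needed for functions that behave differently in training and test modes (e.g. dropout(), BatchNormalization).
- LogReport accumulates the reported values and writes them to a log file in the output directory.
- PrintReport prints selected items from the LogReport.
- ProgressBar displays the progress bar.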
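To make the idea of extensions that fire at different intervals concrete, here is a toy sketch in plain Python. It is not Chainer's implementation: `ToyTrainer`, `evaluate` and `progress` are hypothetical names invented for this illustration, and the numbers (600 iterations per epoch, 20 epochs, a progress update every 100 iterations) are assumptions chosen to mirror the MNIST run in this tutorial.

```python
# Toy illustration only: ToyTrainer is a hypothetical stand-in,
# not chainer.training.Trainer.
class ToyTrainer:
    def __init__(self, iters_per_epoch, n_epochs):
        self.iters_per_epoch = iters_per_epoch
        self.n_epochs = n_epochs
        self.iteration = 0
        self._extensions = []  # list of (callable, period, unit)

    def extend(self, extension, trigger):
        # trigger is (period, unit) with unit 'iteration' or 'epoch'
        period, unit = trigger
        self._extensions.append((extension, period, unit))

    def _fires(self, period, unit):
        if unit == 'iteration':
            return self.iteration % period == 0
        # 'epoch': fire on the last iteration of every `period`-th epoch
        return self.iteration % (period * self.iters_per_epoch) == 0

    def run(self):
        total = self.iters_per_epoch * self.n_epochs
        while self.iteration < total:
            self.iteration += 1  # stands in for one parameter update
            for extension, period, unit in self._extensions:
                if self._fires(period, unit):
                    extension(self)


calls = {'evaluate': 0, 'progress': 0}


def evaluate(trainer):
    # Plays the role of an Evaluator-like extension (once per epoch).
    calls['evaluate'] += 1


def progress(trainer):
    # Plays the role of a ProgressBar-like extension (every 100 iterations).
    calls['progress'] += 1


trainer = ToyTrainer(iters_per_epoch=600, n_epochs=20)
trainer.extend(evaluate, trigger=(1, 'epoch'))
trainer.extend(progress, trigger=(100, 'iteration'))
trainer.run()
print(calls)
```

The point of the sketch is only that the training loop stays the same while each extension decides, through its trigger, how often it runs.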
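Many extensions are implemented in the chainer.training.extensions module. One of the most important is snapshot(), which saves a snapshot of the training procedure (i.e. the Trainer object) to a file in the output directory.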
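The example code in the examples/mnist directory additionally includes GPU support, although its essential part is the same as the code in this tutorial. We will review how to use GPUs in a later section.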