Theano Study Notes (5): Configuration Settings and Compiling Modes

Configuration

The config module contains various attributes that modify Theano's behavior. Many of these attributes are checked when Theano is imported, and some of them are read-only.

By convention, the attributes of the config module should not be modified inside user code.

These attributes all have default values, but you can override them in your .theanorc file or via the THEANO_FLAGS environment variable.

The order of precedence is:

1. an assignment to theano.config.<property>

2. an assignment in THEANO_FLAGS

3. an assignment in .theanorc (or the file indicated by THEANORC)
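Concretely, the same flag can be supplied at any of these three levels. A minimal sketch (floatX and device are real Theano flags; the script name is a placeholder):

```shell
# Lowest precedence: a [global] section in ~/.theanorc
#   [global]
#   floatX = float32
#   device = cpu

# Middle precedence: THEANO_FLAGS overrides .theanorc for a single run
THEANO_FLAGS='floatX=float32,device=cpu' python your_script.py

# Highest precedence: an assignment in user code (discouraged by convention)
#   theano.config.floatX = 'float32'
```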

 

You can display the current configuration by printing theano.config:

python -c 'import theano; print theano.config' | less


For example, modify the logistic regression example from Notes (2) so that it runs at float32 precision:

 

#!/usr/bin/env python
# Theano tutorial
# Solution to Exercise in section 'Configuration Settings and Compiling Modes'

import numpy
import theano
import theano.tensor as tt

theano.config.floatX = 'float32'

rng = numpy.random

N = 400
feats = 784
D = (rng.randn(N, feats).astype(theano.config.floatX),
     rng.randint(size=N, low=0, high=2).astype(theano.config.floatX))
training_steps = 10000

# Declare Theano symbolic variables
x = tt.matrix("x")
y = tt.vector("y")
w = theano.shared(rng.randn(feats).astype(theano.config.floatX), name="w")
b = theano.shared(numpy.asarray(0., dtype=theano.config.floatX), name="b")
x.tag.test_value = D[0]
y.tag.test_value = D[1]
# print "Initial model:"
# print w.get_value(), b.get_value()

# Construct Theano expression graph
p_1 = 1 / (1 + tt.exp(-tt.dot(x, w) - b))  # Probability of having a one
prediction = p_1 > 0.5  # The prediction thresholded: 0 or 1
xent = -y * tt.log(p_1) - (1 - y) * tt.log(1 - p_1)  # Cross-entropy
cost = tt.cast(xent.mean(), 'float32') + \
       0.01 * (w ** 2).sum()  # The cost to optimize
gw, gb = tt.grad(cost, [w, b])

# Compile expressions to functions
train = theano.function(
            inputs=[x, y],
            outputs=[prediction, xent],
            updates={w: w - 0.01 * gw, b: b - 0.01 * gb},
            name="train")
predict = theano.function(inputs=[x], outputs=prediction,
            name="predict")

if any([x.op.__class__.__name__ in ['Gemv', 'CGemv', 'Gemm', 'CGemm'] for x in
        train.maker.fgraph.toposort()]):
    print 'Used the cpu'
elif any([x.op.__class__.__name__ in ['GpuGemm', 'GpuGemv'] for x in
        train.maker.fgraph.toposort()]):
    print 'Used the gpu'
else:
    print 'ERROR, not able to tell if theano used the cpu or the gpu'
    print train.maker.fgraph.toposort()

for i in range(training_steps):
    pred, err = train(D[0], D[1])
# print "Final model:"
# print w.get_value(), b.get_value()

print "target values for D"
print D[1]

print "prediction on D"
print predict(D[0])

 

Running it with time python file.py gives:

real  0m15.055s
user 0m11.527s
sys   0m0.801s

Mode

Every time theano.function is called, the symbolic relationship between its Theano input and output variables is optimized and compiled.

This compilation is controlled by the value of the mode parameter.

Theano defines the following modes:

 

FAST_COMPILE:

compile.mode.Mode(linker='py', optimizer='fast_compile')

Applies only a few graph optimizations and uses only Python implementations.

 

FAST_RUN:

compile.mode.Mode(linker='cvm', optimizer='fast_run')

Applies all optimizations and uses C implementations wherever possible.

 

DebugMode:

compile.debugmode.DebugMode()

Checks the correctness of all optimizations and compares the C and Python implementations. This mode takes longer than any other, but can identify a variety of problems.

 

ProfileMode (deprecated):

compile.profilemode.ProfileMode()

Applies the same optimizations as FAST_RUN, but also prints profiling information.

 

The default mode is FAST_RUN, but you can change it by setting config.mode or by passing the mode keyword argument to theano.function.
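Because the mode can be set through config.mode, you can switch modes without touching the code at all. A sketch via the environment variable (the script name is a placeholder):

```shell
# Run the same script under different modes
THEANO_FLAGS='mode=FAST_COMPILE' python your_script.py
THEANO_FLAGS='mode=FAST_RUN' python your_script.py
THEANO_FLAGS='mode=DebugMode' python your_script.py
```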

 

Linkers

A mode is composed of two parts: an optimizer and a linker.


[1] gc refers to garbage collection of intermediate results during the computation. When it is off, the memory used by an op is kept alive between Theano function calls; by not reallocating memory, overhead is reduced and calls become faster.

[2] Default linker

[3] Deprecated
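Because a mode is just an (optimizer, linker) pair, the two parts can also be chosen independently through flags instead of a named mode. A sketch (the script name is a placeholder):

```shell
# Roughly equivalent to FAST_COMPILE: few optimizations, Python linker
THEANO_FLAGS='optimizer=fast_compile,linker=py' python your_script.py

# Roughly equivalent to FAST_RUN: full optimizations, C VM linker
THEANO_FLAGS='optimizer=fast_run,linker=cvm' python your_script.py
```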

 

Using DebugMode

In general you should use the FAST_RUN or FAST_COMPILE mode. When you are defining new kinds of expressions or new optimizations, it is useful to run them first in DebugMode (mode='DebugMode'): it performs a number of self-checks and assertions that help diagnose programming errors which could lead to incorrect output. Note that DebugMode is much slower than FAST_RUN or FAST_COMPILE, so use it only during development.

 

For example:

import theano
import theano.tensor as T

x = T.dvector('x')
f = theano.function([x], 10 * x, mode='DebugMode')
f([5])
f([0])
f([7])

When this runs, any problem will surface as an exception in the output; if you still cannot resolve it, contact an expert in the field.

DebugMode is not a cure-all, though, because some errors only appear under particular input conditions.

If you instantiate DebugMode through its constructor rather than the 'DebugMode' keyword, you can configure its behavior via the constructor's arguments; the keyword form's default checks may be too strict.
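A minimal sketch of the constructor form, assuming Theano is installed; the check_py_code parameter is one of the checks the constructor lets you relax, and the expression being compiled is just the toy example from above:

```python
import theano
import theano.tensor as T
from theano.compile.debugmode import DebugMode

x = T.dvector('x')
# Constructing DebugMode directly lets you loosen individual checks;
# here the comparison against the pure-Python implementation is skipped
# (parameter name assumed from the DebugMode constructor signature).
f = theano.function([x], 10 * x, mode=DebugMode(check_py_code=False))
f([5])
```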

 

ProfileMode (deprecated)

 

Retrieving timing information

Once the graph is compiled, just run it. Then call profmode.print_summary() to get the timing information: for example, where your graph spends most of its time.

 

Using logistic regression as the example again:

Create a ProfileMode instance:

import theano
from theano import ProfileMode
profmode = ProfileMode(optimizer='fast_run', linker=theano.gof.OpWiseCLinker())


Pass it as the mode when compiling the function:

train = theano.function(
            inputs=[x, y],
            outputs=[prediction, xent],
            updates={w: w - 0.01 * gw, b: b - 0.01 * gb},
            name="train", mode=profmode)
# For a Module, declare it like this instead:
# m = theano.Module()
# minst = m.make(mode=profmode)


Retrieve the timing information:

Add this at the end of the file:

profmode.print_summary()


Running it then produces output like this:

ProfileMode.print_summary()
---------------------------
 
Time since import 6.183s
Theano compile time: 0.000s (0.0% since import)
    Optimization time: 0.000s
    Linker time: 0.000s
Theano fct call 5.452s (88.2% since import)
   Theano Op time 5.003s 80.9%(since import) 91.8%(of fct call)
   Theano function overhead in ProfileMode 0.449s 7.3%(since import) 8.2%(of fct call)
10000 Theano fct call, 0.001s per call
Rest of the time since import 0.730s 11.8%
 
Theano fct summary:
<% total fct time> <total time> <time per call> <nb call> <fct name>
   100.0% 5.452s 5.45e-04s 10000 train
 
Single Op-wise summary:
<% of local_time spent on this kind of Op> <cumulative %> <self seconds> <cumulative seconds> <time per call> [*] <nb_call> <nb_op> <nb_apply> <Op name>
   87.9%  87.9%  4.400s  4.400s 2.20e-04s * 20000  1  2 <class 'theano.tensor.blas_c.CGemv'>
   10.8%  98.8%  0.542s  4.942s 5.42e-06s * 100000 10 10 <class 'theano.tensor.elemwise.Elemwise'>
    0.5%  99.3%  0.023s  4.966s 1.17e-06s * 20000  1  2 <class 'theano.tensor.basic.Alloc'>
    0.4%  99.6%  0.018s  4.984s 6.05e-07s * 30000  2  3 <class 'theano.tensor.elemwise.DimShuffle'>
    0.3%  99.9%  0.013s  4.997s 1.25e-06s * 10000  1  1 <class 'theano.tensor.elemwise.Sum'>
    0.1% 100.0%  0.007s  5.003s 3.35e-07s * 20000  1  2 <class 'theano.compile.ops.Shape_i'>
   ... (remaining 0 single Op account for 0.00%(0.00s) of the runtime)
(*) Op is running a c implementation
 
Op-wise summary:
<% of local_time spent on this kind of Op> <cumulative %> <self seconds> <cumulative seconds> <time per call> [*] <nb_call> <nb apply> <Op name>
   87.9%  87.9%  4.400s  4.400s 2.20e-04s * 20000  2  CGemv{inplace}
    6.3%  94.3%  0.318s  4.718s 3.18e-05s * 10000  1  Elemwise{Composite{[Composite{[Composite{[sub(mul(i0, i1), neg(i2))]}(i0, scalar_softplus(i1), mul(i2, i3))]}(i0, i1, i2, scalar_softplus(i3))]}}
    2.1%  96.3%  0.103s  4.820s 1.03e-05s * 10000  1  Elemwise{Composite{[Composite{[Composite{[Composite{[mul(i0, add(i1, i2))]}(i0, neg(i1), true_div(i2, i3))]}(i0, mul(i1, i2, i3), i4, i5)]}(i0, i1, i2, exp(i3), i4, i5)]}}[(0, 0)]
    1.6%  98.0%  0.082s  4.902s 8.16e-06s * 10000  1  Elemwise{ScalarSigmoid{output_types_preference=transfer_type{0}}}[(0, 0)]
    0.5%  98.4%  0.023s  4.925s 1.17e-06s * 20000  2  Alloc
    0.3%  98.7%  0.013s  4.938s 1.25e-06s * 10000  1  Sum
    0.2%  98.9%  0.012s  4.950s 6.11e-07s * 20000  2  InplaceDimShuffle{x}
    0.2%  99.1%  0.008s  4.959s 8.44e-07s * 10000  1  Elemwise{gt,no_inplace}
    0.1%  99.2%  0.007s  4.965s 6.80e-07s * 10000  1  Elemwise{sub,no_inplace}
    0.1%  99.4%  0.007s  4.972s 3.35e-07s * 20000  2  Shape_i{0}
    0.1%  99.5%  0.006s  4.978s 6.11e-07s * 10000  1  Elemwise{Composite{[sub(neg(i0), i1)]}}[(0, 0)]
    0.1%  99.6%  0.006s  4.984s 5.93e-07s * 10000  1  InplaceDimShuffle{1,0}
    0.1%  99.7%  0.005s  4.989s 5.33e-07s * 10000  1  Elemwise{neg,no_inplace}
    0.1%  99.8%  0.005s  4.994s 4.85e-07s * 10000  1  Elemwise{Cast{float32}}
    0.1%  99.9%  0.005s  4.999s 4.60e-07s * 10000  1  Elemwise{inv,no_inplace}
    0.1% 100.0%  0.004s  5.003s 4.25e-07s * 10000  1  Elemwise{Composite{[sub(i0, mul(i1, i2))]}}[(0, 0)]
   ... (remaining 0 Op account for 0.00%(0.00s) of the runtime)
(*) Op is running a c implementation
 
Apply-wise summary:
<% of local_time spent at this position> <cumulative %> <apply time> <cumulative seconds> <time per call> [*] <nb_call> <Apply position> <Apply Op name>
   54.7%  54.7%  2.737s  2.737s 2.74e-04s  * 10000  7 CGemv{inplace}(Alloc.0, TensorConstant{1.0}, x, w, TensorConstant{0.0})
   33.2%  87.9%  1.663s  4.400s 1.66e-04s  * 10000 18 CGemv{inplace}(w, TensorConstant{-0.00999999977648}, x.T, Elemwise{Composite{[Composite{[Composite{[Composite{[mul(i0, add(i1, i2))]}(i0, neg(i1), true_div(i2, i3))]}(i0, mul(i1, i2, i3), i4, i5)]}(i0, i1, i2, exp(i3), i4, i5)]}}[(0, 0)].0, TensorConstant{0.999800026417})
    6.3%  94.3%  0.318s  4.718s 3.18e-05s  * 10000 13 Elemwise{Composite{[Composite{[Composite{[sub(mul(i0, i1), neg(i2))]}(i0, scalar_softplus(i1), mul(i2, i3))]}(i0, i1, i2, scalar_softplus(i3))]}}(y, Elemwise{Composite{[sub(neg(i0), i1)]}}[(0, 0)].0, Elemwise{sub,no_inplace}.0, Elemwise{neg,no_inplace}.0)
    2.1%  96.3%  0.103s  4.820s 1.03e-05s  * 10000 16 Elemwise{Composite{[Composite{[Composite{[Composite{[mul(i0, add(i1, i2))]}(i0, neg(i1), true_div(i2, i3))]}(i0, mul(i1, i2, i3), i4, i5)]}(i0, i1, i2, exp(i3), i4, i5)]}}[(0, 0)](Elemwise{ScalarSigmoid{output_types_preference=transfer_type{0}}}[(0, 0)].0, Alloc.0, y, Elemwise{Composite{[sub(neg(i0), i1)]}}[(0, 0)].0, Elemwise{sub,no_inplace}.0, Elemwise{Cast{float32}}.0)
    1.6%  98.0%  0.082s  4.902s 8.16e-06s  * 10000 14 Elemwise{ScalarSigmoid{output_types_preference=transfer_type{0}}}[(0, 0)](Elemwise{neg,no_inplace}.0)
    0.3%  98.3%  0.015s  4.917s 1.53e-06s  * 10000 12 Alloc(Elemwise{inv,no_inplace}.0, Shape_i{0}.0)
    0.3%  98.5%  0.013s  4.930s 1.25e-06s  * 10000 17 Sum(Elemwise{Composite{[Composite{[Composite{[Composite{[mul(i0, add(i1, i2))]}(i0, neg(i1), true_div(i2, i3))]}(i0, mul(i1, i2, i3), i4, i5)]}(i0, i1, i2, exp(i3), i4, i5)]}}[(0, 0)].0)
    0.2%  98.7%  0.008s  4.938s 8.44e-07s  * 10000 15 Elemwise{gt,no_inplace}(Elemwise{ScalarSigmoid{output_types_preference=transfer_type{0}}}[(0, 0)].0, TensorConstant{(1,) of 0.5})
    0.2%  98.9%  0.008s  4.946s 8.14e-07s  * 10000  5 Alloc(TensorConstant{0.0}, Shape_i{0}.0)
    0.1%  99.0%  0.007s  4.953s 6.80e-07s  * 10000  4 Elemwise{sub,no_inplace}(TensorConstant{(1,) of 1.0}, y)
    0.1%  99.1%  0.006s  4.959s 6.16e-07s  * 10000  6 InplaceDimShuffle{x}(Shape_i{0}.0)
    0.1%  99.2%  0.006s  4.965s 6.11e-07s  * 10000  9 Elemwise{Composite{[sub(neg(i0), i1)]}}[(0, 0)](CGemv{inplace}.0, InplaceDimShuffle{x}.0)
    0.1%  99.4%  0.006s  4.972s 6.07e-07s  * 10000  0 InplaceDimShuffle{x}(b)
    0.1%  99.5%  0.006s  4.977s 5.93e-07s  * 10000  2 InplaceDimShuffle{1,0}(x)
    0.1%  99.6%  0.005s  4.983s 5.33e-07s  * 10000 11 Elemwise{neg,no_inplace}(Elemwise{Composite{[sub(neg(i0), i1)]}}[(0, 0)].0)
   ... (remaining 5 Apply instances account for 0.41%(0.02s) of the runtime)
(*) Op is running a c implementation

You're welcome to join the discussion, and to follow this blog's Weibo and Zhihu pages for future updates!

If you reprint this article, please respect the author's work by keeping the text above and the article link intact. Thank you for your support!
