Theano Study Notes (4): Derivatives

Original

ycheng_sjtu

2018-09-03 23:09

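Derivatives are computed with T.grad.

Here pp() is used to print the symbolic expression of the gradient.

The third output line prints the symbolic gradient expression after the optimizer has simplified it; it is indeed much simpler than the first output.

fill((x ** TensorConstant{2}), TensorConstant{1.0}) builds a tensor with the same shape as x ** 2 (a scalar here) and fills it with 1.0.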

import theano.tensor as T
from theano import pp
from theano import function

x = T.dscalar('x')
y = x ** 2
gy = T.grad(y, x)                      # symbolic derivative dy/dx
print(pp(gy))                          # gradient expression before optimization
f = function([x], gy)                  # compiling runs the graph optimizer
print(f(4))
print(pp(f.maker.fgraph.outputs[0]))   # gradient expression after optimization
>>>
((fill((x ** TensorConstant{2}), TensorConstant{1.0}) * TensorConstant{2}) * (x ** (TensorConstant{2} - TensorConstant{1})))
8.0
(TensorConstant{2.0} * x)
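
To see why the first and third outputs describe the same function, the unoptimized expression can be evaluated by hand in plain Python. This is only a sketch: the helper names unoptimized_grad and optimized_grad are hypothetical and not part of Theano, and fill is modeled as the number 1.0 because x is a scalar in this example.

```python
def unoptimized_grad(x):
    # fill((x ** 2), 1.0): a value shaped like x ** 2 and filled with 1.0.
    # x is a scalar here, so in this sketch the fill is just the number 1.0.
    fill = 1.0
    return (fill * 2) * (x ** (2 - 1))


def optimized_grad(x):
    # The simplified form printed after compilation: 2.0 * x
    return 2.0 * x


print(unoptimized_grad(4.0))
print(optimized_grad(4.0))
```

Both calls evaluate the same derivative 2x at x = 4, which is the 8.0 printed by the compiled function.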

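The first argument of T.grad must be a scalar.

For example, to compute the derivative of the logistic (sigmoid) function, first sum the sigmoid over all elements so that the cost is a scalar: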

import theano.tensor as T
from theano import function

x = T.dmatrix('x')
s = T.sum(1 / (1 + T.exp(-x)))   # sum of the sigmoid, so the cost is a scalar
gs = T.grad(s, x)                # gradient of the sum = elementwise derivative
dlogistic = function([x], gs)
print(dlogistic([[0, 1], [-1, -2]]))
>>>
[[ 0.25        0.19661193]
 [ 0.19661193  0.10499359]]
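
The printed values can be cross-checked without Theano. Each term of the sum depends on a single entry of x, so the gradient of the sum is the elementwise sigmoid derivative s(z) * (1 - s(z)). The sketch below assumes that closed form; the function names sigmoid and sigmoid_grad are illustrative only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    # Closed-form derivative of the sigmoid: s(z) * (1 - s(z))
    s = sigmoid(z)
    return s * (1.0 - s)


points = [[0, 1], [-1, -2]]
grads = [[sigmoid_grad(z) for z in row] for row in points]
for row in grads:
    print(row)
```

The derivative is symmetric in z, which is why the entries for 1 and -1 agree.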

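Computing the Jacobian matrix

The Jacobian matrix holds the first-order partial derivatives of a vector-valued function: entry (i, j) is the partial derivative of y[i] with respect to x[j].

T.arange generates the sequence of indices from 0 up to y.shape[0], and the loop computes the gradient of each y[i] with respect to x.

scan builds this symbolic loop efficiently.

lambda is Python's built-in syntax for defining an anonymous function; here it defines the body of one loop step.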

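import theano
import theano.tensor as T
from theano import function

x = T.dvector('x')
y = x ** 2
# Each scan step yields one row of the Jacobian: the gradient of y[i] w.r.t. x
J, updates = theano.scan(lambda i, y, x: T.grad(y[i], x), sequences=T.arange(y.shape[0]), non_sequences=[y, x])
f = function([x], J, updates=updates)
print(f([4, 4]))
>>>
[[ 8.  0.]
 [ 0.  8.]]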
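
As a sanity check on the result, the same Jacobian can be approximated numerically in plain Python. This is a minimal sketch using central differences, not how Theano computes it; the names square and numerical_jacobian and the step size h are assumptions made for illustration.

```python
def square(x):
    # Same function as in the example: elementwise square
    return [xi ** 2 for xi in x]


def numerical_jacobian(func, x, h=1e-6):
    # J[i][j] approximates d func(x)[i] / d x[j] using central differences
    n_out = len(func(x))
    J = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += h
        x_minus[j] -= h
        f_plus, f_minus = func(x_plus), func(x_minus)
        for i in range(n_out):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return J


J_numeric = numerical_jacobian(square, [4.0, 4.0])
print(J_numeric)
```

The off-diagonal entries vanish because y[i] depends only on x[i], and the diagonal entries approximate 2 * x[i] = 8.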

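Computing the Hessian matrix

The Hessian matrix is the square matrix of second-order partial derivatives of a scalar-valued function of several variables.

It is the Jacobian of the gradient, so it is enough to use T.grad(cost, x) in place of y in the Jacobian computation.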

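import theano
import theano.tensor as T
from theano import function

x = T.dvector('x')
y = x ** 2
cost = y.sum()
gy = T.grad(cost, x)   # gradient of the scalar cost
# Each scan step yields one row of the Hessian: the gradient of gy[i] w.r.t. x
H, updates = theano.scan(lambda i, gy, x: T.grad(gy[i], x), sequences=T.arange(gy.shape[0]), non_sequences=[gy, x])
f = function([x], H, updates=updates)
print(f([4, 4]))
>>>
[[ 2.  0.]
 [ 0.  2.]]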
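
The printed matrix can be verified by hand. A short worked derivation for the cost used above (the index names i, j, k and the symbol H are introduced here for illustration):

```latex
\begin{aligned}
\mathrm{cost}(x) &= \sum_k x_k^2, \\
\frac{\partial\, \mathrm{cost}}{\partial x_i} &= 2 x_i, \\
H_{ij} = \frac{\partial^2 \mathrm{cost}}{\partial x_i\, \partial x_j} &= 2\,\delta_{ij}.
\end{aligned}
```

So the Hessian is 2 times the identity matrix for every x, which matches the output at x = [4, 4].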

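Jacobian times a vector (right multiplication)

The variable being differentiated can be generalized from a vector to a matrix (W in this example). The product of the Jacobian with a given direction on the right is computed with Rop: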

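import theano
import theano.tensor as T

W = T.dmatrix('W')
V = T.dmatrix('V')
x = T.dvector('x')
y = T.dot(x, W)
JV = T.Rop(y, W, V)   # Jacobian of y w.r.t. W, multiplied on the right by V
f = theano.function([W, V, x], JV)
print(f([[1, 1], [1, 1]], [[2, 2], [2, 2]], [0, 1]))
>>>
[ 2.  2.]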
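
One way to read this result: Rop gives the directional derivative of y with respect to W along V. Because y = x . W is linear in W, that derivative is simply x . V. The plain-Python sketch below checks this with central differences; the helper vec_mat and the step eps are assumptions for illustration, not Theano functions.

```python
def vec_mat(x, M):
    # Row vector times matrix: result[j] = sum_i x[i] * M[i][j]
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]


W = [[1.0, 1.0], [1.0, 1.0]]
V = [[2.0, 2.0], [2.0, 2.0]]
x = [0.0, 1.0]
eps = 1e-6

# Directional derivative of y(W) = x . W along V, by central differences
W_plus = [[W[i][j] + eps * V[i][j] for j in range(2)] for i in range(2)]
W_minus = [[W[i][j] - eps * V[i][j] for j in range(2)] for i in range(2)]
y_plus, y_minus = vec_mat(x, W_plus), vec_mat(x, W_minus)
jv_numeric = [(p - m) / (2 * eps) for p, m in zip(y_plus, y_minus)]

# Because y is linear in W, the exact directional derivative is x . V
jv_exact = vec_mat(x, V)
print(jv_numeric, jv_exact)
```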

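Vector times Jacobian (left multiplication)

The product of a vector with the Jacobian on the left is computed with Lop: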

import theano
import theano.tensor as T

W = T.dmatrix('W')
v = T.dvector('v')
x = T.dvector('x')
y = T.dot(x, W)
VJ = T.Lop(y, W, v)   # v multiplied on the left of the Jacobian of y w.r.t. W
f = theano.function([v, x], VJ)   # the result does not depend on the value of W
print(f([2, 2], [0, 1]))
>>>
[[ 0.  0.]
 [ 2.  2.]]
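
The Lop result has the same shape as W and equals the gradient of the scalar v . y with respect to W, which for y = x . W is the outer product of x and v. The following plain-Python sketch checks this numerically; the helper names, the step eps, and the particular value of W are assumptions (the result is the same for any W because y is linear in W).

```python
def vec_mat(x, M):
    # Row vector times matrix: result[j] = sum_i x[i] * M[i][j]
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]


v = [2.0, 2.0]
x = [0.0, 1.0]
W = [[1.0, 1.0], [1.0, 1.0]]  # assumed value; any W gives the same answer
eps = 1e-6


def weighted_output(W):
    # Scalar v . y with y = x . W
    y = vec_mat(x, W)
    return sum(vk * yk for vk, yk in zip(v, y))


# Gradient of v . y with respect to each W[i][j], by central differences
vj_numeric = [[0.0, 0.0], [0.0, 0.0]]
for i in range(2):
    for j in range(2):
        W_plus = [row[:] for row in W]
        W_minus = [row[:] for row in W]
        W_plus[i][j] += eps
        W_minus[i][j] -= eps
        vj_numeric[i][j] = (weighted_output(W_plus) - weighted_output(W_minus)) / (2 * eps)

# Exact result: outer product of x and v
vj_exact = [[xi * vk for vk in v] for xi in x]
print(vj_exact)
```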

Hessian times a vector

This can be computed with Rop applied to the gradient, without building the full Hessian:

import theano
import theano.tensor as T

x = T.dvector('x')
v = T.dvector('v')
y = T.sum(x ** 2)
gy = T.grad(y, x)
Hv = T.Rop(gy, x, v)   # Jacobian of the gradient (the Hessian) times v
f = theano.function([x, v], Hv)
print(f([4, 4], [2, 2]))
>>>
[ 4.  4.]
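
The reason no Hessian needs to be formed is that H times v is the directional derivative of the gradient along v. The plain-Python sketch below illustrates this with the analytic gradient 2x of y = sum(x ** 2) and a central difference; the function name grad_y and the step eps are assumptions for illustration.

```python
def grad_y(x):
    # Analytic gradient of y = sum(x ** 2)
    return [2.0 * xi for xi in x]


x = [4.0, 4.0]
v = [2.0, 2.0]
eps = 1e-6

# Directional derivative of the gradient along v approximates H . v
x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
hv_numeric = [(gp - gm) / (2 * eps) for gp, gm in zip(grad_y(x_plus), grad_y(x_minus))]
print(hv_numeric)
```

Since the Hessian here is 2 times the identity, the exact product is 2 * v = [4, 4].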
