Reposted from http://blog.csdn.net/u012436149/article/details/53905797
gradient
TensorFlow provides a function for computing gradients, tf.gradients(ys, xs). Note that every x in xs must be related to ys; if one is unrelated, an error is raised.
The code below defines two variables, w1 and w2, but res depends only on w1:
# wrong
import tensorflow as tf

w1 = tf.Variable([[1, 2]])
w2 = tf.Variable([[3, 4]])
res = tf.matmul(w1, [[2], [1]])
grads = tf.gradients(res, [w1, w2])

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    re = sess.run(grads)
    print(re)
Error message:
TypeError: Fetch argument None has invalid type
# right
import tensorflow as tf

w1 = tf.Variable([[1, 2]])
w2 = tf.Variable([[3, 4]])
res = tf.matmul(w1, [[2], [1]])
grads = tf.gradients(res, [w1])

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    re = sess.run(grads)
    print(re)
# [array([[2, 1]], dtype=int32)]
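The returned gradient matches what you get by hand: res = w1 · [[2], [1]] is scalar-valued, so ∂res/∂w1 is just the transposed constant vector. A quick NumPy sketch (my own check, not part of the original post) confirming the arithmetic:

```python
import numpy as np

w1 = np.array([[1, 2]])
v = np.array([[2], [1]])  # the constant that w1 is multiplied with

# res = w1 @ v, so d(res)/d(w1) is simply v transposed
grad_w1 = v.T

print(grad_w1)  # [[2 1]]
```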
A test of grad_ys. The grad_ys argument supplies the initial gradients flowing into each y (by default, tensors of ones):
import tensorflow as tf

w1 = tf.get_variable('w1', shape=[3])
w2 = tf.get_variable('w2', shape=[3])
w3 = tf.get_variable('w3', shape=[3])
w4 = tf.get_variable('w4', shape=[3])

z1 = w1 + w2 + w3
z2 = w3 + w4

grads = tf.gradients([z1, z2], [w1, w2, w3, w4],
                     grad_ys=[tf.convert_to_tensor([2., 2., 3.]),
                              tf.convert_to_tensor([3., 2., 4.])])

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    print(sess.run(grads))
# output:
# [array([ 2.,  2.,  3.], dtype=float32),
#  array([ 2.,  2.,  3.], dtype=float32),
#  array([ 5.,  4.,  7.], dtype=float32),
#  array([ 3.,  2.,  4.], dtype=float32)]
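Since z1 = w1 + w2 + w3 and z2 = w3 + w4, each partial derivative is 1, so every variable's gradient is just the sum of the grad_ys vectors for the y's it feeds into. A NumPy sketch (my own verification, not in the original post) reproducing the output above:

```python
import numpy as np

g1 = np.array([2., 2., 3.])  # grad_ys for z1
g2 = np.array([3., 2., 4.])  # grad_ys for z2

# z1 = w1 + w2 + w3 and z2 = w3 + w4: all partials are 1, so each
# variable accumulates the grad_ys of every y it contributes to.
grad_w1 = g1        # w1 appears only in z1
grad_w2 = g1        # w2 appears only in z1
grad_w3 = g1 + g2   # w3 appears in both z1 and z2
grad_w4 = g2        # w4 appears only in z2

print(grad_w3)  # [5. 4. 7.]
```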
tf.stop_gradient()
Blocks gradients from backpropagating through a node.
import tensorflow as tf
w1 = tf.Variable(2.0)
w2 = tf.Variable(2.0)
a = tf.multiply(w1, 3.0)
a_stoped = tf.stop_gradient(a)
# b=w1*3.0*w2
b = tf.multiply(a_stoped, w2)
gradients = tf.gradients(b, xs=[w1, w2])
print(gradients)
# output:
#[None, <tf.Tensor 'gradients/Mul_1_grad/Reshape_1:0' shape=() dtype=float32>]
As you can see, once a node is stopped, gradients can no longer backpropagate past it. Since the gradient for variable w1 can only arrive through node a, tf.gradients returns None for it.
a = tf.Variable(1.0)
b = tf.Variable(1.0)

c = tf.add(a, b)
c_stoped = tf.stop_gradient(c)

d = tf.add(a, b)
e = tf.add(c_stoped, d)
gradients = tf.gradients(e, xs=[a, b])

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    print(sess.run(gradients))
# output: [1.0, 1.0]
Although node c is stopped, a and b still receive gradients through d, so gradient values can still be produced.
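This behaviour can be mimicked with a finite-difference check (my own sketch, independent of TensorFlow): holding the stopped branch fixed while perturbing a leaves only the d = a + b path, whose slope is 1.

```python
def e(a, b, a0, b0):
    # c_stoped is treated as a constant: evaluate c at the fixed point (a0, b0)
    c_stoped = a0 + b0
    d = a + b
    return c_stoped + d

a0, b0, eps = 1.0, 1.0, 1e-6

# numerical de/da: only the unblocked d = a + b path contributes
grad_a = (e(a0 + eps, b0, a0, b0) - e(a0, b0, a0, b0)) / eps
print(round(grad_a, 3))  # 1.0
```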
import tensorflow as tf

w1 = tf.Variable(2.0)
w2 = tf.Variable(2.0)
a = tf.multiply(w1, 3.0)
a_stoped = tf.stop_gradient(a)

# b = w1 * 3.0 * w2
b = tf.multiply(a_stoped, w2)
opt = tf.train.GradientDescentOptimizer(0.1)
gradients = tf.gradients(b, xs=tf.trainable_variables())
# This line raises an error, because gradients[0] is None.
tf.summary.histogram(gradients[0].name, gradients[0])
# Everything else runs normally, both the gradient computation and the
# variable updates. I feel this TensorFlow design is a bit unfortunate;
# it would be better if the blocked gradient flowed through as 0.
train_op = opt.apply_gradients(zip(gradients, tf.trainable_variables()))
print(gradients)

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    print(sess.run(train_op))
    print(sess.run([w1, w2]))
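A common workaround (not shown in the original post) is to drop the None entries before handing pairs to apply_gradients. The pairing logic itself is plain Python and can be sketched without TensorFlow; the gradient values and variable names below are hypothetical stand-ins:

```python
# Hypothetical list as tf.gradients might return it: the first variable's
# gradient is None because its only path was stopped.
gradients = [None, 0.5]
variables = ["w1", "w2"]  # stand-ins for the trainable variables

# Keep only the (grad, var) pairs whose gradient actually exists.
grads_and_vars = [(g, v) for g, v in zip(gradients, variables)
                  if g is not None]

print(grads_and_vars)  # [(0.5, 'w2')]
```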
Higher-order derivatives
TensorFlow can compute higher-order derivatives by applying tf.gradients repeatedly:
import tensorflow as tf

with tf.device('/cpu:0'):
    a = tf.constant(1.)
    b = tf.pow(a, 2)
    grad = tf.gradients(ys=b, xs=a)  # first derivative
    print(grad[0])
    grad_2 = tf.gradients(ys=grad[0], xs=a)  # second derivative
    grad_3 = tf.gradients(ys=grad_2[0], xs=a)  # third derivative
    print(grad_3)

with tf.Session() as sess:
    print(sess.run(grad_3))
- Note: for some ops (e.g. tf.add …), TF does not implement higher-order gradient computation. If you take a higher-order derivative of an op that lacks one, gradients returns None.
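For b = a², the derivatives worked out by hand are 2a, then 2, then 0, which is what the repeated tf.gradients calls above compute symbolically. A finite-difference sketch (my own, independent of TensorFlow) checking the first two derivatives at a = 1:

```python
def f(a):
    return a ** 2

a0, h = 1.0, 1e-4

# central difference: f'(a) ≈ (f(a+h) - f(a-h)) / (2h)
first = (f(a0 + h) - f(a0 - h)) / (2 * h)
# second difference: f''(a) ≈ (f(a+h) - 2 f(a) + f(a-h)) / h²
second = (f(a0 + h) - 2 * f(a0) + f(a0 - h)) / h ** 2

print(first, second)  # both ≈ 2.0
```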