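TensorFlow 2.0 is about to be officially released. To keep my existing TF 1.X code working after the upgrade, I have started building and training models the TF 2.0 way. The final 2.0 release is not out yet, but many of its features are refinements of the v2 module that ships with TF 1.14, and 1.14 is already very stable, so I am using TF 1.14 as the bridge from TF 1.X to TF 2.0. TF 2.0 runs in Eager mode by default, and both the training workflow and the API differ substantially from 1.X. Material on TF 2.0 is still fairly scarce, so this post is simply a record of my own learning process.
Note: this post only records the examples I worked through while learning. If you spot any shortcomings, corrections are very welcome.
The example code is based on the sample in Chapter 3 of 《深度學習之TensorFlow 入門、原理與進階實戰》 (李金洪), which the book presents as a logistic regression fit of 2-D data (the model in the code is the linear fit z = W * x + b). Here the model is rebuilt in Eager mode using the v2 module of TF 1.14.
The original code from the book: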
import tensorflow.compat.v1 as tf
import numpy as np
import matplotlib.pyplot as plt
train_X = np.linspace(-1, 1, 100)
[data_len] = train_X.shape
train_Y = 2 * train_X + np.random.randn(data_len) * 0.3  # y = 2x, with noise added
# Create the placeholders
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
# Model parameters
W = tf.Variable(tf.random.normal([1]), name='weight')
b = tf.Variable(tf.zeros([1]), name='bias')
# Forward structure
z = tf.multiply(X, W) + b
# Backward optimization
cost = tf.reduce_mean(tf.square(Y - z))
learning_rate = 0.01
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
# Initialize all variables
init = tf.global_variables_initializer()
# Training settings
training_epochs = 20
display_step = 2
# Stores the recorded epoch indices and loss values
plot_data = {'batchsize': [], 'loss': []}
def moving_average(a, w=10):
    if len(a) < w:
        return a[:]
    return [val if idx < w else sum(a[(idx-w): idx]) / w for idx, val in enumerate(a)]
# Start the session
with tf.Session() as sess:
    sess.run(init)
    # Feed data into the model
    for epoch in range(training_epochs):
        for (x, y) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x, Y: y})
        # Show details during training
        if epoch % display_step == 0:
            loss = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
            print('Epoch:', epoch + 1, "cost=", loss, 'W=', sess.run(W), 'b=', sess.run(b))
            if not loss == 'NA':
                plot_data['batchsize'].append(epoch)
                plot_data['loss'].append(loss)
    print('Finished')
    # Show the simulated data points
    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.legend()
    plt.show()
    # Show the fitted line; sess.run() needs the session to still be open
    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()
    plot_data['avgloss'] = moving_average(plot_data['loss'])
    plt.figure(1)
    plt.subplot(211)
    plt.plot(plot_data['batchsize'], plot_data['avgloss'], 'b--')
    plt.xlabel('Minibatch number')
    plt.ylabel('Loss')
    plt.title('Minibatch run vs. Training loss')
    plt.show()
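One detail of the listing above is easy to miss: the script records a loss only on every second epoch, so after 20 epochs the list holds 10 values, and with the default window w=10 every index is below the window size. The small check below, a sketch in plain Python with made-up input lists, suggests that in this case moving_average returns its input unchanged, and that smoothing only starts from the 11th value onward.

```python
def moving_average(a, w=10):
    """Same helper as in the listing above."""
    if len(a) < w:
        return a[:]
    return [val if idx < w else sum(a[(idx - w):idx]) / w
            for idx, val in enumerate(a)]

# Ten values, like the ten losses recorded at epochs 0, 2, ..., 18 (values are made up).
ten = [float(v) for v in range(10, 0, -1)]
unchanged = moving_average(ten)

# Twelve values: only indices 10 and 11 are replaced by the mean of the previous ten.
twelve = [float(v) for v in range(1, 13)]
smoothed = moving_average(twelve)
print(unchanged == ten)            # True
print(smoothed[10], smoothed[11])  # 5.5 6.5
```

If a smoothed curve is wanted for this script, a smaller window or a loss recorded every epoch would be needed.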
The code rebuilt for Eager mode with the TF 1.14 v2 module:
import tensorflow.compat.v2 as tf
import numpy as np
import matplotlib.pyplot as plt
tf.enable_v2_behavior()
# Create the training data
train_X = np.linspace(-1, 1, 100)
train_Y = 2 * train_X + np.random.randn(*train_X.shape) * 0.3  # y = 2x, with noise added
w = tf.Variable(tf.random.normal([1]), dtype=tf.float32)
b = tf.Variable(tf.zeros([1]), dtype=tf.float32)
# Define the forward structure
@tf.function
def forward(x):
    x = tf.cast(x, tf.float32)
    return x * w + b
opt = tf.optimizers.SGD(0.01)
def run_opt(x, y):
    x = tf.cast(x, tf.float32)
    y = tf.cast(y, tf.float32)
    with tf.GradientTape() as g:
        loss = tf.reduce_mean(tf.square(forward(x) - y))
    # Gather the variables to be trained
    train_data = (w, b)
    # Compute the gradients
    gradients = g.gradient(loss, train_data)
    # Update the trainable variables, here w and b
    opt.apply_gradients(zip(gradients, train_data))
    return loss
# Thanks to eager (dynamic graph) execution in TF 2.0, the training loop can run directly inside a function
def lr():
    losses = []
    weights = []
    biases = []
    for i in range(20):
        sum_loss = 0
        cnt = 0
        for (x, y) in zip(train_X, train_Y):
            loss = run_opt(x, y)
            sum_loss += loss
            cnt = cnt + 1
        # Mean per-sample loss over this epoch
        loss = sum_loss / cnt
        print('Epoch: ', i, 'loss:', loss.numpy())
        losses.append(loss)
        weights.append(w.numpy())
        biases.append(b.numpy())
    print('w=', w.numpy(), 'b=', b.numpy())
    return w, losses
weight, ls = lr()
plt.plot(ls)
plt.show()
plt.plot(train_X, train_Y, 'ro', label='Original data')
plt.plot(train_X, forward(train_X), label='Fittedline')
plt.legend()
plt.show()
print('W:', weight.numpy(), 'b=', b.numpy())
z = forward(0.2)
print('w * x + b = ', z.numpy())
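In Eager mode, the optimization can run directly inside a loop or a function; it no longer has to be executed in a Session as in TF 1.X. When training a model in Eager mode, keep the following points in mind:
1: When building the optimization step, gather all the variables to be trained and then compute the gradients for them, as in the run_opt code shown below.
2: minimize() itself just computes the gradients and then calls apply_gradients(), so you can call apply_gradients() directly and no longer need to define the training op explicitly through minimize() as in TF 1.X.
Optimization in TF 1.X:
# Backward optimization
cost = tf.reduce_mean(tf.square(Y - z))
learning_rate = 0.01
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
Optimization in TF 1.14 Eager mode: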
def run_opt(x, y):
    x = tf.cast(x, tf.float32)
    y = tf.cast(y, tf.float32)
    with tf.GradientTape() as g:
        loss = tf.reduce_mean(tf.square(forward(x) - y))
    # Gather the variables to be trained
    train_data = (w, b)
    # Compute the gradients
    gradients = g.gradient(loss, train_data)
    # Update the trainable variables, here w and b
    opt.apply_gradients(zip(gradients, train_data))
    return loss
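To make the tape-then-apply step less opaque, here is a plain-Python sketch of what one call to run_opt amounts to for this particular model. For a single sample the loss is (w * x + b - y) ** 2, so the gradients are 2 * (w * x + b - y) * x for w and 2 * (w * x + b - y) for b, and plain SGD subtracts the learning rate times each gradient. The function name sgd_fit, the zero starting values, and the noise-free data are assumptions made for illustration; the post itself draws w from a normal distribution and adds noise to y.

```python
def sgd_fit(xs, ys, lr=0.01, epochs=20):
    """Per-sample gradient descent on loss = (w * x + b - y) ** 2."""
    w, b = 0.0, 0.0  # assumed start values, chosen to keep the run deterministic
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = w * x + b - y
            grad_w = 2.0 * err * x   # d(loss)/dw, what the tape would return for w
            grad_b = 2.0 * err       # d(loss)/db, what the tape would return for b
            w -= lr * grad_w         # the update apply_gradients performs for SGD
            b -= lr * grad_b
    return w, b

# 100 evenly spaced points in [-1, 1], like np.linspace(-1, 1, 100), without noise.
xs = [-1.0 + 2.0 * i / 99 for i in range(100)]
ys = [2.0 * x for x in xs]
w_fit, b_fit = sgd_fit(xs, ys)
print(w_fit, b_fit)  # close to 2 and 0
```

With the noisy data from the post the result lands near, rather than exactly on, w = 2 and b = 0.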
For reference, the optimizer's minimize() and apply_gradients() source:
def minimize(self, loss, var_list, grad_loss=None, name=None):
    """Minimize `loss` by updating `var_list`.

    This method simply computes gradient using `tf.GradientTape` and calls
    `apply_gradients()`. If you want to process the gradient before applying
    then call `tf.GradientTape` and `apply_gradients()` explicitly instead
    of using this function.

    Args:
      loss: A callable taking no arguments which returns the value to minimize.
      var_list: list or tuple of `Variable` objects to update to minimize
        `loss`, or a callable returning the list or tuple of `Variable` objects.
        Use callable when the variable list would otherwise be incomplete before
        `minimize` since the variables are created at the first time `loss` is
        called.
      grad_loss: Optional. A `Tensor` holding the gradient computed for `loss`.
      name: Optional name for the returned operation.

    Returns:
      An Operation that updates the variables in `var_list`. If `global_step`
      was not `None`, that operation also increments `global_step`.

    Raises:
      ValueError: If some of the variables are not `Variable` objects.
    """
    grads_and_vars = self._compute_gradients(
        loss, var_list=var_list, grad_loss=grad_loss)
    return self.apply_gradients(grads_and_vars, name=name)

def apply_gradients(self, grads_and_vars, name=None):
    """Apply gradients to variables.

    This is the second part of `minimize()`. It returns an `Operation` that
    applies gradients.

    Args:
      grads_and_vars: List of (gradient, variable) pairs.
      name: Optional name for the returned operation. Default to the name
        passed to the `Optimizer` constructor.

    Returns:
      An `Operation` that applies the specified gradients. If `global_step`
      was not None, that operation also increments `global_step`.

    Raises:
      TypeError: If `grads_and_vars` is malformed.
      ValueError: If none of the variables have gradients.
    """
    grads_and_vars = _filter_grads(grads_and_vars)
    var_list = [v for (_, v) in grads_and_vars]
    with backend.name_scope(self._scope_ctx):
        # Create iteration if necessary.
        with ops.init_scope():
            _ = self.iterations
            self._create_hypers()
            self._create_slots(var_list)
        self._prepare(var_list)
        return distribute_ctx.get_replica_context().merge_call(
            self._distributed_apply,
            args=(grads_and_vars,),
            kwargs={"name": name})
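The relationship visible in the source above, where minimize() is nothing more than "compute gradients, then apply_gradients()", can be mimicked with a toy class in plain Python. This is only a sketch and not TensorFlow code: the class ToySGD, the dict-based variables, and the finite-difference gradients (standing in for tf.GradientTape) are all hypothetical and chosen for illustration.

```python
class ToySGD:
    """Hypothetical stand-in for an optimizer; not TensorFlow code."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate

    def compute_gradients(self, loss_fn, var_list, eps=1e-6):
        """Central finite differences; TensorFlow records a GradientTape instead."""
        grads_and_vars = []
        for var in var_list:
            original = var["value"]
            var["value"] = original + eps
            loss_plus = loss_fn()
            var["value"] = original - eps
            loss_minus = loss_fn()
            var["value"] = original  # restore before moving to the next variable
            grads_and_vars.append(((loss_plus - loss_minus) / (2 * eps), var))
        return grads_and_vars

    def apply_gradients(self, grads_and_vars):
        """Plain SGD update for each (gradient, variable) pair."""
        for grad, var in grads_and_vars:
            var["value"] -= self.learning_rate * grad

    def minimize(self, loss_fn, var_list):
        """Same two-step shape as the quoted source: compute, then apply."""
        self.apply_gradients(self.compute_gradients(loss_fn, var_list))


w = {"value": 0.0}
b = {"value": 0.0}
x, y = 0.5, 1.0  # a single made-up training sample

def loss_fn():
    # Zero-argument callable, mirroring the `loss` argument described in the docstring.
    return (w["value"] * x + b["value"] - y) ** 2

opt = ToySGD(learning_rate=0.1)
for _ in range(200):
    opt.minimize(loss_fn, [w, b])
final_loss = loss_fn()
print(final_loss)  # very close to 0
```

Calling compute_gradients and apply_gradients separately, as run_opt does with the tape, leaves room to inspect or modify the gradients between the two steps, which is the case the quoted docstring points to.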