Linear Regression in Eager Mode with TensorFlow 1.14

With the official release of TensorFlow 2.0 approaching, I have started building and training models the TF 2.0 way so that my existing TF 1.x code will remain compatible after the upgrade. The 2.0 release is not yet final, but many of its features are refined versions of the v2 module already shipped in TF 1.14, and 1.14 itself is very stable, so I am using TF 1.14 to make the transition from TF 1.x to TF 2.0. TF 2.0 runs in Eager mode by default, its training workflow and APIs differ considerably from 1.x, and material on TF 2.0 is still relatively scarce, so this post is simply a record of my own learning process.

Note: this post only records the examples from my own learning process; if anything is lacking, corrections from more experienced readers are very welcome.

The sample code in this post is the linear-regression example from Chapter 3 of《深度學習之TensorFlow 入門、原理與進階實戰》(李金洪), which fits a straight line to two-dimensional data; the model is rebuilt here in Eager mode using the v2 module of TF 1.14.

The code from the original book:

import tensorflow.compat.v1 as tf
import numpy as np
import matplotlib.pyplot as plt

train_X = np.linspace(-1, 1, 100)
[data_len] = train_X.shape
train_Y = 2 * train_X + np.random.randn(data_len) * 0.3  # y = 2x plus noise

# Create placeholders
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)

# Model parameters
W = tf.Variable(tf.random.normal([1]), name='weight')
b = tf.Variable(tf.zeros([1]), name='bias')
# Forward structure
z = tf.multiply(X, W) + b

# Backward optimization: loss and gradient-descent optimizer
cost = tf.reduce_mean(tf.square(Y - z))
learning_rate = 0.01
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
# Initialize all variables
init = tf.global_variables_initializer()
# Training hyperparameters
training_epochs = 20
display_step = 2

# Store batch indices and loss values for plotting
plot_data = {'batchsize': [], 'loss': []}


def moving_average(a, w=10):
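    # Keep values before index w unchanged; replace each later value with the
    # mean of the previous w entries (used to smooth the loss curve in the plot)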
    if len(a) < w:
        return a[:]
    return [val if idx < w else sum(a[(idx-w): idx]) / w for idx, val in enumerate(a)]


# Launch the session
with tf.Session() as sess:
    sess.run(init)
    # Feed the training data into the model
    for epoch in range(training_epochs):
        for (x, y) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x, Y: y})

        # Print training progress
        if epoch % display_step == 0:
            loss = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
            print('Epoch:', epoch + 1, "cost=", loss, 'W=', sess.run(W), 'b=', sess.run(b))

            if not np.isnan(loss):  # skip NaN losses
                plot_data['batchsize'].append(epoch)
                plot_data['loss'].append(loss)

    print('Finished')
    # Plot the generated data points
    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.legend()
    plt.show()

    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()

    plot_data['avgloss'] = moving_average(plot_data['loss'])
    plt.figure(1)
    plt.subplot(211)
    plt.plot(plot_data['batchsize'], plot_data['avgloss'], 'b--')
    plt.xlabel('Minibatch number')
    plt.ylabel('Loss')
    plt.title('Minibatch run vs. Training loss')
    plt.show()

The code rebuilt in Eager mode with the v2 module of TF 1.14:

import tensorflow.compat.v2 as tf
import numpy as np
import matplotlib.pyplot as plt

tf.enable_v2_behavior()

# Generate the training data
train_X = np.linspace(-1, 1, 100)
train_Y = 2 * train_X + np.random.randn(*train_X.shape) * 0.3  # y = 2x plus noise

w = tf.Variable(tf.random.normal([1]), dtype=tf.float32)
b = tf.Variable(tf.zeros([1]), dtype=tf.float32)


# Define the forward pass; @tf.function traces it into a graph (the function also runs without the decorator)
@tf.function
def forward(x):
    x = tf.cast(x, tf.float32)
    return x * w + b


opt = tf.optimizers.SGD(0.01)
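# (tf.optimizers.SGD replaces tf.train.GradientDescentOptimizer from the 1.x code above)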


def run_opt(x, y):
    x = tf.cast(x, tf.float32)
    y = tf.cast(y, tf.float32)

    # GradientTape records the operations inside this block so that
    # their gradients can be computed afterwards
    with tf.GradientTape() as g:
        loss = tf.reduce_mean(tf.square(forward(x) - y))

    # Gather the trainable variables together
    train_data = (w, b)
    # Compute the gradients
    gradients = g.gradient(loss, train_data)
    # Update the trainable values, here w and b
    opt.apply_gradients(zip(gradients, train_data))
    return loss


# Thanks to eager execution in TF 2.0, training can loop directly inside a plain Python function
def lr():
    losses = []
    weights = []
    biases = []
    for i in range(20):
        sum_loss = 0
        cnt = 0
        for (x, y) in zip(train_X, train_Y):
            loss = run_opt(x, y)
            sum_loss += loss
            cnt = cnt + 1

        loss = sum_loss / cnt
        print('Epoch: ', i, 'loss:', loss.numpy())
        losses.append(loss)
        weights.append(w.numpy())
        biases.append(b.numpy())
        print('w=', w.numpy(), 'b=', b.numpy())
    return w, losses


weight, ls = lr()
plt.plot(ls)
plt.show()

plt.plot(train_X, train_Y, 'ro', label='Original data')
plt.plot(train_X, forward(train_X), label='Fitted line')
plt.legend()
plt.show()

print('W:', weight.numpy(), 'b=', b.numpy())
z = forward(0.2)
print('w * x + b = ', z.numpy())
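
Because every operation executes immediately in Eager mode, the same model can also be trained full-batch, pushing all of train_X through the tape in a single update instead of looping over one sample at a time. A minimal sketch of that variant, reusing the w, b, opt and forward defined above (the name train_step_batch is my own, not part of the original code):

def train_step_batch():
    with tf.GradientTape() as tape:
        pred = forward(train_X)
        loss = tf.reduce_mean(tf.square(pred - tf.cast(train_Y, tf.float32)))
    # One gradient step over the whole data set
    grads = tape.gradient(loss, (w, b))
    opt.apply_gradients(zip(grads, (w, b)))
    return loss


for epoch in range(20):
    print('Epoch:', epoch, 'loss:', train_step_batch().numpy())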

In Eager mode, optimization can be run directly inside a loop or an ordinary function; there is no need to execute it in a Session as in TF 1.x. When training a model in Eager mode, note the following points:

1: When building the optimization step, first gather all of the variables that need to be trained, then compute the gradients, as shown in the code below.

2: minimize() itself just calls apply_gradients(), so unlike TF 1.x the training step no longer has to be expressed through an explicit minimize() call (a sketch using minimize() directly follows the quoted optimizer source below).

The optimization approach in TF 1.x:

# Backward optimization: loss and gradient-descent optimizer
cost = tf.reduce_mean(tf.square(Y - z))
learning_rate = 0.01
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

The optimization approach in TF 1.14 Eager mode:

def run_opt(x, y):
    x = tf.cast(x, tf.float32)
    y = tf.cast(y, tf.float32)

    with tf.GradientTape() as g:
        loss = tf.reduce_mean(tf.square(forward(x) - y))

    # Gather the trainable variables together
    train_data = (w, b)
    # Compute the gradients
    gradients = g.gradient(loss, train_data)
    # Update the trainable values, here w and b
    opt.apply_gradients(zip(gradients, train_data))
    return loss

For reference, this is the minimize() implementation in the TF 1.14 v2 optimizer source; it simply computes the gradients with tf.GradientTape and hands them to apply_gradients():

def minimize(self, loss, var_list, grad_loss=None, name=None):
    """Minimize `loss` by updating `var_list`.

    This method simply computes gradient using `tf.GradientTape` and calls
    `apply_gradients()`. If you want to process the gradient before applying
    then call `tf.GradientTape` and `apply_gradients()` explicitly instead
    of using this function.

    Args:
      loss: A callable taking no arguments which returns the value to minimize.
      var_list: list or tuple of `Variable` objects to update to minimize
        `loss`, or a callable returning the list or tuple of `Variable` objects.
        Use callable when the variable list would otherwise be incomplete before
        `minimize` since the variables are created at the first time `loss` is
        called.
      grad_loss: Optional. A `Tensor` holding the gradient computed for `loss`.
      name: Optional name for the returned operation.

    Returns:
      An Operation that updates the variables in `var_list`.  If `global_step`
      was not `None`, that operation also increments `global_step`.

    Raises:
      ValueError: If some of the variables are not `Variable` objects.

    """
    grads_and_vars = self._compute_gradients(
        loss, var_list=var_list, grad_loss=grad_loss)

    return self.apply_gradients(grads_and_vars, name=name)

And apply_gradients(), the second half of minimize(), from the same source:

def apply_gradients(self, grads_and_vars, name=None):
    """Apply gradients to variables.

    This is the second part of `minimize()`. It returns an `Operation` that
    applies gradients.

    Args:
      grads_and_vars: List of (gradient, variable) pairs.
      name: Optional name for the returned operation.  Default to the name
        passed to the `Optimizer` constructor.

    Returns:
      An `Operation` that applies the specified gradients. If `global_step`
      was not None, that operation also increments `global_step`.

    Raises:
      TypeError: If `grads_and_vars` is malformed.
      ValueError: If none of the variables have gradients.
    """
    grads_and_vars = _filter_grads(grads_and_vars)
    var_list = [v for (_, v) in grads_and_vars]

    with backend.name_scope(self._scope_ctx):
      # Create iteration if necessary.
      with ops.init_scope():
        _ = self.iterations
        self._create_hypers()
        self._create_slots(var_list)

      self._prepare(var_list)

      return distribute_ctx.get_replica_context().merge_call(
          self._distributed_apply,
          args=(grads_and_vars,),
          kwargs={"name": name})
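
A side note on point 2 above: since this v2 minimize() takes a zero-argument loss callable plus a var_list and simply forwards the gradients to apply_gradients(), the per-sample update in run_opt could equally be expressed through minimize() itself. A minimal sketch under that assumption, reusing w, b, opt and forward from the Eager-mode code above (the helper name run_opt_minimize is my own):

def run_opt_minimize(x, y):
    x = tf.cast(x, tf.float32)
    y = tf.cast(y, tf.float32)

    # minimize() expects a callable taking no arguments that returns the loss;
    # internally it records the gradients with tf.GradientTape and then calls
    # apply_gradients(), exactly as the quoted source shows
    def loss_fn():
        return tf.reduce_mean(tf.square(forward(x) - y))

    opt.minimize(loss_fn, var_list=[w, b])
    return loss_fn()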

 
