Linear Regression Implemented in TensorFlow


2020-06-13 06:15


xiaoyao, following "Dive into Deep Learning" (《動手學深度學習》), TensorFlow 2.1.0

import tensorflow as tf
print(tf.__version__)

# from IPython import display
from matplotlib import pyplot as plt
import random
%matplotlib inline

import warnings
warnings.filterwarnings('ignore')

2.1.0

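3.2.1 Generating the dataset

We train a linear regression model using only tensors and GradientTape.

Let the training set contain 1000 samples with 2 features each. Given the generated batch of sample features $\boldsymbol{X}\in\mathbb{R}^{1000\times 2}$, we use the true weights of the linear regression model $\boldsymbol{w}=[2, -3.4]^\top$, the true bias $b=4.2$, and a noise term $\epsilon$ to generate the labels:
$$\boldsymbol{y}=\boldsymbol{X}\boldsymbol{w}+b+\epsilon\tag{1}$$
where the noise term $\epsilon$ follows a normal distribution with mean 0 and standard deviation 0.01. The noise represents meaningless interference in the dataset.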

num_inputs = 2
num_examples = 1000

true_w = [2, -3.4]
true_b = 4.2

features = tf.random.normal((num_examples, num_inputs),stddev = 1)
labels = true_w[0] * features[:,0] + true_w[1] * features[:,1] + true_b
labels += tf.random.normal(labels.shape,stddev=0.01)

features[0], labels[0]

(<tf.Tensor: shape=(2,), dtype=float32, numpy=array([-1.1710224,  1.3963023], dtype=float32)>,
 <tf.Tensor: shape=(), dtype=float32, numpy=-2.8871787>)

Here, each row of features is a vector of length 2, and each entry of labels is a scalar.

print(features[0], labels[0])

tf.Tensor([-1.1710224  1.3963023], shape=(2,), dtype=float32) tf.Tensor(-2.8871787, shape=(), dtype=float32)
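As a quick sanity check on equation (1), the first sample printed above can be plugged into the generating model by hand. The sketch below is plain Python with the numbers copied from that output; the names x0, y0, noise_free and residual are introduced here for illustration only.

```python
# Values copied from the printed features[0] and labels[0] above.
x0 = [-1.1710224, 1.3963023]
y0 = -2.8871787

true_w = [2, -3.4]
true_b = 4.2

# Noise-free label from the generating model: x . w + b
noise_free = true_w[0] * x0[0] + true_w[1] * x0[1] + true_b

# Whatever is left over should be the noise term epsilon.
residual = y0 - noise_free

print(f"noise-free label: {noise_free:.6f}")
print(f"residual (noise): {residual:.6f}")
```

The residual comes out at roughly 0.0023, which is well within one standard deviation (0.01) of the noise distribution.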

"""
Draw a scatter plot of the second feature features[:, 1] against the labels,
which shows the linear relationship between the two more directly.
"""
def set_figsize(figsize=(3.5, 2.5)):
    plt.rcParams['figure.figsize'] = figsize

set_figsize()
plt.scatter(features[:, 1], labels, 1)  # the third argument is the marker size s, not transparency

<matplotlib.collections.PathCollection at 0x246120031c8>

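3.2.2 Reading the data

When training the model, we need to iterate over the dataset and repeatedly read mini-batches of samples. We define a function that returns the features and labels of batch_size (the batch size) random samples on each call.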

def data_iter(batch_size, features, labels):
    num_examples = len(features)
    indices = list(range(num_examples))
    
    random.shuffle(indices)
    for i in range(0, num_examples, batch_size):
        j = indices[i: min(i+batch_size, num_examples)]
        yield tf.gather(features, axis=0, indices=j), tf.gather(labels, axis=0, indices=j)

batch_size = 10

for X, y in data_iter(batch_size, features, labels):
    print(X, y)
    break

tf.Tensor(
[[-1.6544346  -0.21689671]
 [-1.9006996  -0.60220146]
 [ 0.47630018  0.16197595]
 [ 0.6305033   0.06080767]
 [-0.38096488 -0.72073525]
 [-1.8225852   0.7859634 ]
 [ 2.5961044   2.010577  ]
 [ 0.23788874 -0.70898366]
 [ 1.5793388  -0.97015315]
 [-2.655049   -0.67389566]], shape=(10, 2), dtype=float32) tf.Tensor(
[ 1.6189524  2.4591262  4.5916023  5.2634425  5.903998  -2.1094294
  2.5460227  7.093977  10.664606   1.1736389], shape=(10,), dtype=float32)
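The hand-written data_iter above keeps the mechanics visible. For comparison, the same shuffle-then-batch behaviour can be sketched with tf.data. This is an alternative, not what the post uses for training, and the tensors below are random stand-ins with the same shapes as features and labels.

```python
import tensorflow as tf

# Stand-in data with the same shapes as `features` and `labels` in this post.
features = tf.random.normal((1000, 2))
labels = tf.random.normal((1000,))

batch_size = 10
dataset = (
    tf.data.Dataset.from_tensor_slices((features, labels))
    .shuffle(buffer_size=1000)  # buffer spans the whole dataset, so the shuffle is uniform
    .batch(batch_size)
)

# Read one mini-batch, as in the data_iter example.
for X, y in dataset.take(1):
    print(X.shape, y.shape)

# One pass over the dataset yields num_examples / batch_size batches.
num_batches = sum(1 for _ in dataset)
print(num_batches)
```

By default the dataset is reshuffled every time it is iterated, which matches calling data_iter once per epoch.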

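3.2.3 Initializing the model parameters

We initialize the weights with random values drawn from a normal distribution with mean 0 and standard deviation 0.01, and initialize the bias to 0.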

w = tf.Variable(tf.random.normal((num_inputs, 1), stddev=0.01))
b = tf.Variable(tf.zeros((1,)))

3.2.4 Defining the model

# Implemented as a vectorized expression, using tf.matmul for the matrix multiplication
def linreg(X, w, b):
    return tf.matmul(X, w) + b
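To make the shapes concrete, here is a small check of linreg on a dummy mini-batch. The input of all ones and the constant parameters are chosen here purely so the result is easy to verify by hand.

```python
import tensorflow as tf

def linreg(X, w, b):
    return tf.matmul(X, w) + b

X = tf.ones((10, 2))               # dummy mini-batch: 10 samples, 2 features
w = tf.constant([[2.0], [-3.4]])   # shape (2, 1)
b = tf.constant([4.2])             # shape (1,), broadcast across all 10 rows

y_hat = linreg(X, w, b)
print(y_hat.shape)                 # one prediction per sample, as a column
print(float(y_hat[0, 0]))          # 2 - 3.4 + 4.2, up to float32 rounding
```

The output is a column of shape (10, 1), not a flat vector of shape (10,), which matters when it is compared with the labels in the loss function.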

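3.2.5 Defining the loss function

# The true values y must be reshaped to the shape of y_hat. The result returned
# by this function has the same shape as y_hat.
def squared_loss(y_hat, y):
    return (y_hat - tf.reshape(y, y_hat.shape)) ** 2 / 2
    # Signature for reference: tf.reshape(tensor, shape, name=None)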
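The reshape is not cosmetic. Subtracting a flat label vector of shape (10,) from a prediction column of shape (10, 1) silently broadcasts to a (10, 10) matrix. A minimal sketch with dummy tensors (the names wrong and right are illustrative):

```python
import tensorflow as tf

y_hat = tf.zeros((10, 1))   # model output: a column
y = tf.ones((10,))          # labels: a flat vector

# Without the reshape, broadcasting pairs every prediction with every label.
wrong = (y_hat - y) ** 2 / 2

# With the reshape, each prediction is compared with its own label only.
right = (y_hat - tf.reshape(y, y_hat.shape)) ** 2 / 2

print(wrong.shape, right.shape)
```

The unreshaped version would still run and still produce a loss, just not the intended one, which is why the mistake is easy to miss.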

3.2.6 Defining the optimization algorithm

"""
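Mini-batch gradient descent: the model parameters are updated iteratively to
minimize the loss. The gradient computed by automatic differentiation is the sum
over the samples in a batch, so it is divided by the batch size to get the average.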
"""
def sgd(params, lr, batch_size, grads):
    """Mini-batch stochastic gradient descent."""
    for i, param in enumerate(params):
        param.assign_sub(lr * grads[i] / batch_size)
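Written out, the update performed by sgd is the following, where $\eta$ stands for lr, $\mathcal{B}$ for the current mini-batch (so $|\mathcal{B}|$ is batch_size), and $\ell^{(i)}$ for the squared loss of sample $i$. These symbols are notation introduced here, not taken from the code.

$$\boldsymbol{w} \leftarrow \boldsymbol{w} - \frac{\eta}{|\mathcal{B}|}\sum_{i\in\mathcal{B}} \nabla_{\boldsymbol{w}}\,\ell^{(i)}(\boldsymbol{w}, b), \qquad b \leftarrow b - \frac{\eta}{|\mathcal{B}|}\sum_{i\in\mathcal{B}} \frac{\partial\,\ell^{(i)}(\boldsymbol{w}, b)}{\partial b}, \qquad \ell^{(i)}(\boldsymbol{w}, b) = \frac{1}{2}\left(\hat{y}^{(i)} - y^{(i)}\right)^2$$

The sums correspond to grads[i], and the factor $\eta / |\mathcal{B}|$ corresponds to lr / batch_size in param.assign_sub.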

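3.2.7 Training the model

In each iteration, we read a mini-batch of samples (features X and labels y), compute the mini-batch stochastic gradient by calling t.gradient on the tape, and then call the optimization routine sgd to update the model parameters. Since batch_size was set to 10, the loss l of each mini-batch has shape (10, 1).

Because l is not a scalar, what gets differentiated is the sum of its elements. This can be written explicitly with tf.reduce_sum(l); the training loop below passes l directly to t.gradient, which for a non-scalar target computes the gradient of the sum, so the result is the same. Each mini-batch opens a fresh tf.GradientTape, so no gradients carry over between steps and nothing needs to be zeroed after an update. w and b are tf.Variable objects, which the tape watches automatically, so the t.watch call is redundant but harmless.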
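The equivalence between differentiating a non-scalar loss and differentiating its sum can be checked directly. The sketch below uses a tiny hand-picked X and w (values chosen here only for illustration) and compares the two gradients.

```python
import tensorflow as tf

w = tf.Variable([[1.0], [2.0]])
X = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Non-scalar target: one loss value per sample, shape (3, 1).
with tf.GradientTape() as t:
    l = tf.matmul(X, w) ** 2 / 2
grad_implicit = t.gradient(l, w)

# Explicit scalar target: sum the per-sample losses first.
with tf.GradientTape() as t:
    l_sum = tf.reduce_sum(tf.matmul(X, w) ** 2 / 2)
grad_explicit = t.gradient(l_sum, w)

print(grad_implicit.numpy().ravel())
print(grad_explicit.numpy().ravel())
```

Both gradients equal $\boldsymbol{X}^\top(\boldsymbol{X}\boldsymbol{w})$, which for these values is [123, 156].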

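In one epoch, we run through data_iter once and use every sample in the training set exactly once (assuming the number of samples is divisible by the batch size). With 1000 samples and a batch size of 10, that is 100 parameter updates per epoch. The number of epochs num_epochs and the learning rate lr are both hyperparameters, set here to 3 and 0.03 respectively. In practice, most hyperparameters have to be tuned by repeated trial and error. A larger number of epochs may yield a better model, but training takes longer.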

lr = 0.03
num_epochs = 3
net = linreg
loss = squared_loss

for epoch in range(num_epochs):
    for X, y in data_iter(batch_size, features, labels):
        with tf.GradientTape() as t:
            t.watch([w,b])
            l = loss(net(X, w, b), y)
        grads = t.gradient(l, [w, b])
        sgd([w, b], lr, batch_size, grads)
    train_l = loss(net(features, w, b), labels)
    print('epoch %d, loss %f' % (epoch + 1, tf.reduce_mean(train_l)))

epoch 1, loss 0.033065
epoch 2, loss 0.000111
epoch 3, loss 0.000050

# After training, compare the learned parameters with the true parameters used to generate the training set
true_w, w

([2, -3.4],
 <tf.Variable 'Variable:0' shape=(2, 1) dtype=float32, numpy=
 array([[ 2.000201 ],
        [-3.4001694]], dtype=float32)>)

true_b, b

(4.2,
 <tf.Variable 'Variable:0' shape=(1,) dtype=float32, numpy=array([4.199146], dtype=float32)>)
