[tensorflow] Implementing a Linear Regression Model

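In this post I'll give a rough walkthrough of how to implement a simple linear regression model with tensorflow. Along the way I'll touch on some basic tensorflow concepts and operations. I've only just gotten started with tensorflow myself, so I can only explain the concepts behind some of the code.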

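The linear regression model is expressed as:

$y=\vec{w}\cdot\vec{x}+b$

where $\vec{w}$ is the weight vector, $b$ is the bias, and $\vec{x}$ and $y$ are the input data and the corresponding model prediction, respectively.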
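To make the formula concrete, here is a tiny worked example in plain Python. The numbers and the helper name `predict` are made up for illustration and have nothing to do with the dataset used later.

```python
# Evaluate y = w . x + b for a single sample (illustrative values only).
def predict(w, x, b):
    """Return the dot product of w and x plus the scalar bias b."""
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b


w = [2.0, -1.0]  # weights (made-up values)
x = [3.0, 4.0]   # one input sample with two features
b = 0.5          # bias

y = predict(w, x, b)
print(y)  # 2*3 + (-1)*4 + 0.5 = 2.5
```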

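In tensorflow, a computation is represented as a graph. Each node in the graph is called an op (short for operation). Each operation takes in tensors and performs a numerical computation on them, and each tensor is a typed multi-dimensional array. Describing the computation as a graph does not compute anything or produce a result; the graph has to be run inside a session before any computation happens and a result comes back.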
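The "describe first, run later" idea can be sketched in a few lines of plain Python. This is a toy of my own that only mimics the concept; the classes `Placeholder` and `Op` are hypothetical and are not tensorflow's API.

```python
# A toy "define, then run" graph. It mimics the idea only, not tensorflow itself.
class Placeholder:
    """A named input whose value is supplied at run time."""

    def __init__(self, name):
        self.name = name

    def run(self, feed):
        return feed[self.name]


class Op:
    """A node that applies `fn` to the results of its input nodes."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

    def run(self, feed):
        return self.fn(*(node.run(feed) for node in self.inputs))


# Building the graph computes nothing yet.
a = Placeholder("a")
b = Placeholder("b")
total = Op(lambda u, v: u + v, a, b)
doubled = Op(lambda u: 2 * u, total)

# Only running the graph with concrete inputs produces a number.
result = doubled.run({"a": 3, "b": 4})
print(result)  # 14
```

The dictionary passed to `run` plays the role that feed_dict plays later in this post: it maps each placeholder to a concrete value.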

To implement the formula above, we generally start by declaring tensors for the quantities involved:

# Inputs to the tensorflow graph
X = tf.placeholder("float")
Y = tf.placeholder("float")

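The two lines above declare the model's input data and the corresponding ground-truth values (also called labels). tf.placeholder() can be thought of as declaring a formal parameter for a variable: the actual value is only supplied when the computation runs. Its signature is:

tf.placeholder(dtype, shape=None, name=None)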

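With the inputs in place, we also need to declare the weight and the bias:

# Model weight and bias (rng is numpy.random, defined in the full listing below)
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")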

These two lines define two variables, initialize each with a random value, and give them names. tf.Variable() is the tensorflow function for defining variables.

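With the variables above, we need to build the expression to compute, which brings in some of tensorflow's basic arithmetic operations:

# Arithmetic operators: + - * / %
tf.add(x, y, name=None)       # addition (supports broadcasting)
tf.subtract(x, y, name=None)  # subtraction
tf.multiply(x, y, name=None)  # element-wise multiplication
tf.divide(x, y, name=None)    # true division, returns a float (Python 3 style division)
tf.mod(x, y, name=None)       # remainder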
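Broadcasting is easier to see with numbers. The sketch below uses NumPy as a stand-in, on the assumption that these tensorflow ops follow NumPy-style broadcasting; the arrays are made up for illustration.

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])  # shape (2, 2)
b = np.array([10.0, 20.0])  # shape (2,), broadcast across each row of X

summed = np.add(X, b)         # each row of X gets b added to it
scaled = np.multiply(X, 2.0)  # a scalar broadcasts to every element
quotient = np.divide(7, 2)    # true division
remainder = np.mod(7, 3)      # remainder

print(summed)
print(quotient, remainder)
```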

With these basic arithmetic operations, building the expression for a linear regression model is straightforward:

# Construct a linear model
pred = tf.add(tf.multiply(X, W), b)

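Now that we have the expression, training also needs a loss function to measure the model's error and drive the parameter updates. A common choice for linear regression is the mean squared error, written here with the conventional extra factor of 1/2, where N_sample is the number of training samples:

cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*N_sample)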
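To check the arithmetic of this cost by hand, here is the same formula in plain Python. The function name `half_mse` and the sample values are my own, chosen only for illustration.

```python
def half_mse(preds, targets):
    """Sum of squared errors divided by 2 * N, matching the cost above."""
    n = len(targets)
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / (2 * n)


preds = [1.0, 2.0, 4.0]    # made-up predictions
targets = [1.0, 3.0, 2.0]  # made-up labels

cost = half_mse(preds, targets)
print(cost)  # (0 + 1 + 4) / (2 * 3), roughly 0.8333
```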

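For the optimizer I use the most common choice, stochastic gradient descent (SGD). It is built into tensorflow (as tf.train.GradientDescentOptimizer) and can be called directly.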
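For reference, here is what one gradient descent step works out to for the cost above in the single-feature case. This is my own derivation, writing $\eta$ for the learning rate and $N$ for the number of training samples; it is a sketch of the math, not a description of tensorflow's internals.

$J(w,b)=\frac{1}{2N}\sum_{i=1}^{N}(wx_i+b-y_i)^2$

$\frac{\partial J}{\partial w}=\frac{1}{N}\sum_{i=1}^{N}(wx_i+b-y_i)\,x_i,\qquad \frac{\partial J}{\partial b}=\frac{1}{N}\sum_{i=1}^{N}(wx_i+b-y_i)$

$w\leftarrow w-\eta\,\frac{\partial J}{\partial w},\qquad b\leftarrow b-\eta\,\frac{\partial J}{\partial b}$

When a single sample is fed per step, each sum reduces to that one sample's term, while the constant $1/N$ factor stays.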

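That gives us the basic structure. To actually run training we need to create a session. A session owns and manages all the resources a TensorFlow program uses at run time. Once all computation is finished, the session should be closed so the system can reclaim those resources; otherwise resources may leak. There are two ways to use a session: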

1. The first way explicitly calls the function that creates the session and the function that closes it:

sess = tf.Session()
... 
sess.close()

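2. The first way has a weakness: it relies on an explicit call to Session.close once all computation is done. If the program exits because of an exception, that call may never run, and resources leak. To handle resource release on abnormal exit, the second way uses the session through Python's context manager (the with statement), which closes it automatically: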

with tf.Session() as sess:
    ...

So the second way is the better fit here.
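The difference between the two ways shows up as soon as something raises. The sketch below uses a hypothetical `FakeSession` class of my own in place of a real session, just to show which pattern still releases the resource after an exception.

```python
class FakeSession:
    """Hypothetical stand-in for a session that owns resources."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow the exception


def failing_step():
    raise RuntimeError("simulated failure during computation")


# First way: the exception skips the explicit close() call.
manual = FakeSession()
try:
    failing_step()
    manual.close()  # never reached because failing_step() raised
except RuntimeError:
    pass

# Second way: the with statement closes the session even when an exception occurs.
try:
    with FakeSession() as managed:
        failing_step()
except RuntimeError:
    pass

print(manual.closed)   # False
print(managed.closed)  # True
```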

Next comes the iteration loop. In each iteration we feed data into the two placeholders X and Y declared earlier and run the optimizer:

sess.run(optimizer, feed_dict={X: x, Y: y})
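The per-sample feeding pattern can also be sketched without tensorflow. The snippet below is a plain-Python illustration on made-up data (points on the line y = 2x + 1), with the gradients written out by hand. The names `sgd_fit`, `lr` and `epochs` are mine, and the constant 1/N factor from the cost is folded into the learning rate.

```python
def sgd_fit(xs, ys, lr=0.05, epochs=1000):
    """Fit y = w * x + b with one parameter update per sample."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x_i, y_i in zip(xs, ys):
            err = (w * x_i + b) - y_i  # prediction error on this sample
            w -= lr * err * x_i        # gradient of err**2 / 2 with respect to w
            b -= lr * err              # gradient of err**2 / 2 with respect to b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0]  # made-up inputs
ys = [1.0, 3.0, 5.0, 7.0]  # labels generated from y = 2x + 1

w, b = sgd_fit(xs, ys)
print(round(w, 4), round(b, 4))  # should land close to 2.0 and 1.0
```

Each inner-loop step corresponds to one sess.run(optimizer, ...) call with a single (x, y) pair.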

The complete code is as follows:

from __future__ import print_function

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn import datasets as skd
from sklearn.model_selection import train_test_split

rng = np.random

# Training hyperparameters
learning_rate = 0.01
training_epochs = 5000
display_step = 50

# Load the Boston housing dataset from sklearn: data is [506, 13], labels are [506]
# (load_boston is only available in scikit-learn versions before 1.2)
data = skd.load_boston()
print("data.shape = ", data.data.shape)
print("target.shape = ", data.target.shape)

# Separate features and labels, keeping only feature column 12
X_data = data.data[:,12]
Y_data = data.target
print("X_data.shape = ", X_data.shape)
print("Y_data.shape = ", Y_data.shape)
print("X_data[0:20] = ", X_data[0:20])
print("Y_data[0:20] = ", Y_data[0:20])

# Split the data and labels into training and test sets
x_train,x_test,y_train,y_test = train_test_split(X_data,Y_data,test_size=0.3,random_state=0)
print("x_train.shape = ", x_train.shape)
print("x_test.shape = ", x_test.shape)
print("y_train.shape = ", y_train.shape)
print("y_test.shape = ", y_test.shape)

N_sample = x_train.shape[0]

# Inputs to the graph
X = tf.placeholder("float")
Y = tf.placeholder("float")

# Weight and bias
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

# Model structure
y = tf.add(tf.multiply(W, X), b)

# Loss function
cost = tf.reduce_sum(tf.pow(y-Y, 2)/2/N_sample)

# Optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

# Variable initializer
init = tf.global_variables_initializer()

# Run the graph inside a session
with tf.Session() as sess:
    # Run the initializer
    sess.run(init)

    # Training loop
    for epoch in range(training_epochs):
        # Use x_i, y_i so the loop does not overwrite the graph tensor y
        for (x_i, y_i) in zip(x_train, y_train):
            sess.run(optimizer, feed_dict={X: x_i, Y: y_i})

        if (epoch+1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: x_train, Y:y_train})
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c), "W=", sess.run(W), "b=", sess.run(b))

    # Training finished
    print("Training finished")
    training_cost = sess.run(cost, feed_dict={X: x_train, Y: y_train})
    print("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')

    # Plot the training data and the fitted line
    plt.plot(x_train, y_train, 'ro', label='Original data')
    plt.plot(x_train, sess.run(W) * x_train + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()

    # Evaluate on the test data
    cost_test = tf.reduce_sum(tf.pow(y - Y, 2)) / (2 * x_test.shape[0])
    testing_cost = sess.run(cost_test, feed_dict={X: x_test, Y: y_test})

    print("Testing cost=", testing_cost)
    print("Absolute mean square loss difference:", abs(training_cost - testing_cost))

    # Plot the test data and the fitted line
    plt.plot(x_test, y_test, 'bo', label='Testing data')
    plt.plot(x_train, sess.run(W) * x_train + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()

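The data here is the Boston housing dataset provided by sklearn, a 506*13 dataset. I took column 12 along the second axis (the last feature) for training and prediction, and split the data into a training set and a test set at a ratio of 7:3. Intermediate training output while the code runs:

Training data and the fitted line:

Test data and the fitted line: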

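What we covered so far is single-variable linear regression. Next is multiple linear regression. The data is again sklearn's built-in Boston housing dataset, which has 506 samples with 13 features each. As before, we load the data first, then split it into training and test sets and standardize them: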

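from sklearn.preprocessing import scale

# Load the data and separate features from labels
boston = skd.load_boston()
X_data = boston.data
Y_data = boston.target
# Split 7:3 into training and test sets
x_train,x_test,y_train,y_test = train_test_split(X_data,Y_data,test_size=0.3,random_state=0)
# Standardize the features, and the labels reshaped into a column of shape [N, 1]
x_train = scale(x_train)
x_test = scale(x_test)
y_train = scale(y_train.reshape((-1,1)))
y_test = scale(y_test.reshape((-1,1)))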
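As I understand it, scale standardizes each column to zero mean and unit variance. The NumPy sketch below shows that computation on made-up numbers; `standardize` is my own helper written for illustration, not sklearn's implementation.

```python
import numpy as np


def standardize(a):
    """Center each column to mean 0 and rescale it to standard deviation 1."""
    a = np.asarray(a, dtype=float)
    mean = a.mean(axis=0)
    std = a.std(axis=0)                 # population standard deviation (ddof=0)
    std = np.where(std == 0, 1.0, std)  # leave constant columns unscaled
    return (a - mean) / std


features = [[1.0, 10.0],
            [2.0, 20.0],
            [3.0, 30.0]]  # made-up data: 3 samples, 2 features

z = standardize(features)
print(z.mean(axis=0))  # approximately [0, 0]
print(z.std(axis=0))   # approximately [1, 1]
```

Putting all features on the same scale matters here because a single learning rate is shared across all 13 weights.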

Then, as before, define the graph's inputs and labels:

X = tf.placeholder(tf.float32, [None, 13])
Y = tf.placeholder(tf.float32, [None, 1])

Define the weights and bias:

W = tf.Variable(tf.random_normal([13, 1]),dtype=tf.float32, name="weight")
b = tf.Variable(tf.random_normal([1]),dtype=tf.float32, name="bias")

Then define the model expression and the loss function:

# Note this is a matrix product, so tf.matmul is needed
y = tf.add(tf.matmul(X, W), b)
cost = tf.reduce_mean(tf.square(Y-y))
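The shapes are worth checking once, because they explain why the labels were reshaped into a column earlier. The NumPy sketch below uses random made-up data with 5 samples; NumPy is a stand-in here, on the assumption that tensorflow broadcasts the same way.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 13))  # 5 samples, 13 features
W = rng.normal(size=(13, 1))  # one weight per feature
b = np.zeros(1)               # bias, broadcast to every row
Y = rng.normal(size=(5, 1))   # labels as a column

y = X @ W + b                 # matrix product, shape (5, 1)
cost = np.mean((Y - y) ** 2)  # mean squared error over all samples

# Pitfall: flat labels of shape (5,) broadcast against (5, 1) into a (5, 5) matrix.
wrong_diff = Y.ravel() - y

print(y.shape, wrong_diff.shape)  # (5, 1) (5, 5)
```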

The rest is much the same as before. One thing worth mentioning: if you want to save the trained model, you can do this:

saver = tf.train.Saver()
with tf.Session() as sess:
    # training
    saver.save(sess, "file_path/model_name.ckpt")

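After saving, the model can be loaded like this:

saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # latest_checkpoint only needs the directory; restore itself needs the checkpoint prefix
    saver.restore(sess, tf.train.latest_checkpoint("file_path/"))
    sess.run(...)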

The complete code is as follows:

from __future__ import print_function
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets as skd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import scale
rng = np.random

# Training hyperparameters
learning_rate = 0.001
training_epochs = 100000
display_step = 50

# Load the Boston housing dataset from sklearn: data is [506, 13], labels are [506]
boston = skd.load_boston()
# Separate features and labels
X_data = boston.data
Y_data = boston.target
print("X_data.shape = ", X_data.shape)
print("Y_data.shape = ", Y_data.shape)

# Split the data and labels into training and test sets, then standardize them
x_train,x_test,y_train,y_test = train_test_split(X_data,Y_data,test_size=0.3,random_state=0)
x_train = scale(x_train)
x_test = scale(x_test)
y_train = scale(y_train.reshape((-1,1)))
y_test = scale(y_test.reshape((-1,1)))
print("x_train.shape = ", x_train.shape)
print("x_test.shape = ", x_test.shape)
print("y_train.shape = ", y_train.shape)
print("y_test.shape = ", y_test.shape)

N_sample = x_train.shape[0]

# Inputs to the graph
X = tf.placeholder(tf.float32, [None, 13])
Y = tf.placeholder(tf.float32, [None, 1])

# Weights and bias
#W = tf.Variable(rng.randn(), name="weight")
#b = tf.Variable(rng.randn(), name="bias")
W = tf.Variable(tf.random_normal([13, 1]),dtype=tf.float32, name="weight")
b = tf.Variable(tf.random_normal([1]),dtype=tf.float32, name="bias")

# Model structure
y = tf.add(tf.matmul(X, W), b)
#y = tf.matmul(X, W)+b

# Loss function
#cost = tf.reduce_sum(tf.pow(y-Y, 2)/2/N_sample)
cost = tf.reduce_mean(tf.square(Y-y))

# Optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

# Variable initializer
init = tf.global_variables_initializer()

# Saver for writing the model to disk
saver = tf.train.Saver()

# Run the graph inside a session
with tf.Session() as sess:
    # Run the initializer
    sess.run(init)
    # Training loop (full batch: the whole training set is fed at every step)
    for epoch in range(training_epochs):
        sess.run(optimizer, feed_dict={X: x_train, Y: y_train})

        if (epoch+1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: x_train, Y:y_train})
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c), "W=", sess.run(W), "b=", sess.run(b))

    # Training finished: save the model
    saver.save(sess, "linear_regression/LiR4MultiFeatures.ckpt")
    print("Training finished")
    training_cost = sess.run(cost, feed_dict={X: x_train, Y: y_train})
    print("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')

    # Plot labels and predictions against feature column 12
    plt.plot(x_train[:,12], y_train, 'ro', label='Original data')
    pred = sess.run(y, feed_dict={X: x_train})
    plt.plot(x_train[:,12], pred, 'bx', label='predict data')
    plt.legend()
    plt.show()

    # Evaluate on the test data with the same mean squared error used for training,
    # so that the two costs are directly comparable
    testing_cost = sess.run(cost, feed_dict={X: x_test, Y: y_test})

    print("Testing cost=", testing_cost)
    print("Absolute mean square loss difference:", abs(training_cost - testing_cost))
    plt.plot(x_test[:,12], y_test, 'ro', label='Testing data')
    pred_test = sess.run(y, feed_dict={X: x_test})
    plt.plot(x_test[:,12], pred_test, 'bx', label='predicted data')
    plt.legend()
    plt.show()

# Load the locally saved model and use it for prediction
model_file=tf.train.latest_checkpoint("linear_regression/")
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.restore(sess,model_file)
    # Same mean squared error as in training, so the costs are comparable
    testing_cost = sess.run(cost, feed_dict={X: x_test, Y: y_test})
    print("Testing cost=", testing_cost)
    print("Absolute mean square loss difference:", abs(training_cost - testing_cost))
    plt.plot(x_test[:,12], y_test, 'bo', label='Testing data')
    pred_test = sess.run(y, feed_dict={X: x_test})
    plt.plot(x_test[:,12], pred_test, 'rx', label='predicted data')
    plt.legend()
    plt.show()

The results of training look like this: