TensorFlow Study Notes (1): Logistic Regression Models and an Implementation of CIFAR-10 Classification

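Contents

Environment

Introduction

Machine learning steps

The relationship among deep learning, machine learning, and artificial intelligence

Neural networks

Binary logistic regression model

Multi-class logistic regression model

Objective function (loss function)

Main types

Examples

Neural network training

Training objective

Gradient descent

TensorFlow implementation

Computational graph model

Imperative programming

Declarative programming

Comparison of the two

Data processing

Downloading the data

Preparation

Reading the data

Inspecting the data

Complete code for data loading and preprocessing

Building the model

Building the computational graph

Complete code for building the model

Initializing and running the model

Complete code

Notes

References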


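Environment

Python 3.6 + TensorFlow 1.13.1 + Jupyter Notebook

Introduction

Machine learning steps

  1. Data preprocessing (collection and denoising);
  2. Model training (feature extraction and modeling);
  3. Model evaluation and optimization (loss, accuracy, and hyperparameter tuning);
  4. Model deployment.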

The relationship among deep learning, machine learning, and artificial intelligence

Source: https://coding.imooc.com/class/259.html

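Neural networks

A neuron is the smallest neural network. Taking a neuron with three inputs as an example, its structure is:

Its expression is:

h_{W,b}(x)=f(W^{T}x+b)=f(\sum_{i=1}^{3}W_{i}x_{i}+b)
where W is the weight vector (W^{T} is its transpose), x is the feature vector, f() is the activation function, and b is the bias.

Worked example:

Let x = [1,2,3];

W = [0.1,0.2,0.3];

b = 0 and f(a) = a/10;

Then: W^{T}x = 1*0.1+2*0.2+3*0.3 = 1.4

h(x) = f(1.4) = 1.4/10 = 0.14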
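The worked example above can be reproduced in a few lines of NumPy. This is only a sketch: the function name `neuron` and the default bias of 0 are assumptions made for illustration, and the activation a/10 is the toy activation from the example.

```python
import numpy as np

def neuron(x, w, b=0.0, f=lambda a: a / 10):
    """Single neuron: apply the activation f to the weighted sum w.x + b."""
    return f(np.dot(w, x) + b)

x = np.array([1.0, 2.0, 3.0])   # features
w = np.array([0.1, 0.2, 0.3])   # weights
output = neuron(x, w)           # weighted sum is 1.4, activation divides by 10
print(output)
```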

Binary logistic regression model

Source: https://coding.imooc.com/class/259.html

Multi-class logistic regression model

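Objective function (loss function)

The objective function measures how well the model fits the data.

Main types

1. Binary classification: true value minus predicted value;

2. Multi-class classification: abs(one-hot encoding of the true value minus the predicted probability distribution);

3. Squared-error loss, given by:

\frac{1}{n}\sum_{x,y}\frac{1}{2}(y-Model(x))^{2}

4. Cross-entropy loss (better suited to multi-class classification), given by:

-\frac{1}{n}\sum_{x,y}y\ln(Model(x))

Note: for multi-class problems, the objective function can be computed by one-hot encoding the labels.

One-hot encoding: a mapping from a number to a vector in which exactly one position is 1 and all other positions are 0.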
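To make the definitions above concrete, here is a small NumPy sketch of one-hot encoding together with the squared-error and cross-entropy losses. The helper names and the sample predictions are made up for illustration, and the small `eps` term is an assumption added only to avoid taking log(0).

```python
import numpy as np

def one_hot(labels, num_classes):
    """Map integer labels to vectors with a single 1 at the label position."""
    encoded = np.zeros((len(labels), num_classes))
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

def squared_error_loss(y_true, y_pred):
    # (1/n) * sum over examples of 1/2 * (y - Model(x))^2
    return np.mean(0.5 * np.sum((y_true - y_pred) ** 2, axis=1))

def cross_entropy_loss(y_true, y_pred, eps=1e-12):
    # -(1/n) * sum over examples of y * ln(Model(x))
    return -np.mean(np.sum(y_true * np.log(y_pred + eps), axis=1))

labels = np.array([2, 0])                     # true classes of two examples
y_true = one_hot(labels, num_classes=3)
y_pred = np.array([[0.1, 0.2, 0.7],           # hypothetical predicted distributions
                   [0.8, 0.1, 0.1]])

sq = squared_error_loss(y_true, y_pred)
ce = cross_entropy_loss(y_true, y_pred)
print(y_true, sq, ce)
```

Only the probability assigned to the true class enters the cross-entropy, because every other entry of the one-hot vector is 0.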

Examples

1. Binary classification:

2. Multi-class classification:

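Neural network training

Training objective

Adjust the parameters so that the model's loss function on the training set is minimized.

Gradient descent

Walking downhill: find the direction, then take a step. Source: https://coding.imooc.com/class/259.html

Gradient descent follows the same idea as walking downhill:

\theta =\theta -\alpha \frac{\partial L(x,y)}{\partial \theta }

Learning rate (step size): \alpha, set manually; it must be neither too large nor too small;

Direction: given by the gradient \frac{\partial L(x,y)}{\partial \theta }; the update moves against it.

The effect of the learning rate is shown in the figure below:

Source: https://coding.imooc.com/class/259.html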
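The update rule can be tried on a one-dimensional toy problem. The sketch below is not from the course: the loss L(theta) = (theta - 3)^2, the starting point, and both learning rates are assumptions chosen to show one run that converges and one that diverges.

```python
def gradient_descent(grad, theta0, alpha, steps):
    """Repeat theta = theta - alpha * dL/dtheta for a fixed number of steps."""
    theta = theta0
    for _ in range(steps):
        theta = theta - alpha * grad(theta)
    return theta

# Toy loss L(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3)
grad = lambda theta: 2.0 * (theta - 3.0)

theta_good = gradient_descent(grad, theta0=0.0, alpha=0.1, steps=100)  # approaches 3
theta_bad = gradient_descent(grad, theta0=0.0, alpha=1.1, steps=20)    # overshoots further each step
print(theta_good, theta_bad)
```

With alpha = 0.1 each step shrinks the distance to the minimum by a factor of 0.8. With alpha = 1.1 each step multiplies it by -1.2, so the iterate moves away from the minimum.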

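TensorFlow implementation

Computational graph model

Imperative programming

Declarative programming

Build the graph first, then feed in data and compute.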
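The difference between the two styles can be illustrated without TensorFlow. The classes below (`Placeholder`, `Add`, `Mul`) are hypothetical toy classes, not TensorFlow API; they only mimic the idea of describing a computation first and supplying the data later.

```python
class Placeholder:
    """A named input whose value is supplied only when the graph is run."""
    def __init__(self, name):
        self.name = name

    def evaluate(self, feed):
        return feed[self.name]

class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, feed):
        return self.left.evaluate(feed) + self.right.evaluate(feed)

class Mul:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, feed):
        return self.left.evaluate(feed) * self.right.evaluate(feed)

# Imperative style: every statement computes a value immediately.
a, b = 2, 3
imperative_result = a * b + a

# Declarative style: first describe the computation, with no data involved ...
x, y = Placeholder('x'), Placeholder('y')
graph = Add(Mul(x, y), x)
# ... then feed in data and compute.
declarative_result = graph.evaluate({'x': 2, 'y': 3})
print(imperative_result, declarative_result)
```

The same `graph` object can be evaluated again with different inputs, which is the role `feed_dict` plays when a TensorFlow 1.x session runs a graph.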

Comparison of the two

Source: https://coding.imooc.com/class/259.html

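Data processing

Downloading the data

CIFAR-10 is used as the example; download link: http://www.cs.toronto.edu/~kriz/cifar.html

Preparation

A pickle module is needed to read the data files. The modules below ship with Python's standard library, so nothing has to be installed with pip.

In Python 2.x, use cPickle:

import cPickle

In Python 3.x, use pickle (recommended):

import pickle

Note: in Python 3.x, _pickle can also be imported in place of pickle (not recommended: in my tests the later code raised errors, although I am not sure whether loading the data through this module was the cause):

import _pickle as cPickle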

Reading the data

import os
import numpy as np
import tensorflow as tf
# import _pickle as cPickle
import pickle

cifar_dir = 'dataset/cifar-10-batches-py/'

print(os.listdir(cifar_dir))

Output:

Inspecting the data

Inspect the data structure:

with open(os.path.join(cifar_dir, 'data_batch_1'), 'rb') as f:
    # pickle (not cPickle) is the module imported above
    data = pickle.load(f, encoding='bytes')
    print(type(data))
    
    print(type(data[b'batch_label']))
    print(type(data[b'labels']))
    print(type(data[b'data']))
    print(type(data[b'filenames']))
    print(data[b'data'].shape) # 3 * 32 * 32 = 3072 values per image
    print(data[b'data'][0:2])
    print(data[b'labels'][0:2])
    print(data[b'batch_label'])
    print(data[b'filenames'][0:2])

Output:

View a single image:

img_arr = data[b'data'][100]
img_arr = img_arr.reshape((3,32,32))
img_arr = img_arr.transpose((1,2,0))

import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
%matplotlib inline

imshow(img_arr)

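Output:

Note: the channel axis must be moved here; otherwise the image does not display correctly.

Each image in the dataset is stored channel-first: [3, 32, 32] -> 3 * 32 * 32 = 3 * 1024 = 3072

whereas imshow expects the channel-last format: [32, 32, 3] -> 32 * 32 * 3 = 1024 * 3 = 3072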
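The effect of the reshape and transpose can be checked on a synthetic row that needs no dataset. In this sketch the "image" is invented: every red value is 10, every green value 20, and every blue value 30, laid out channel-first like a CIFAR-10 row.

```python
import numpy as np

# Synthetic 3072-value row: 1024 red values, then 1024 green, then 1024 blue
flat = np.concatenate([np.full(1024, 10),
                       np.full(1024, 20),
                       np.full(1024, 30)]).astype(np.uint8)

# Correct: restore (channel, row, col), then move the channel axis to the end
image = flat.reshape((3, 32, 32)).transpose((1, 2, 0))

# Incorrect: reshaping straight to (32, 32, 3) mixes values from one channel
# into a single pixel
wrong = flat.reshape((32, 32, 3))

print(image.shape, image[0, 0], wrong[0, 0])
```

After the transpose, the first pixel holds one value from each channel, whereas the direct reshape fills it with three consecutive red values.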

Complete code for data loading and preprocessing

import os
import numpy as np
import tensorflow as tf
# import _pickle as cPickle
import pickle

cifar_dir = 'dataset/cifar-10-batches-py/'
# cifar_dir = 'I:/jupyterWorkDir/testTensorFlow/code/coding-others/cifar-10-batches-py/'

print(os.listdir(cifar_dir))
CIFAR_DIR = cifar_dir



def load_data(filename):
    """read data from data file."""
    with open(filename, 'rb') as f:
        data = pickle.load(f, encoding='bytes')
        return data[b'data'], data[b'labels']

# tensorflow.Dataset.
class CifarData:
    def __init__(self, filenames, need_shuffle):
        all_data = []
        all_labels = []
        for filename in filenames:
            data, labels = load_data(filename)
            all_data.append(data)
            all_labels.append(labels)
        self._data = np.vstack(all_data)
        self._data = self._data / 127.5 - 1
        self._labels = np.hstack(all_labels)
        print(self._data.shape)
        print(self._labels.shape)
        
        self._num_examples = self._data.shape[0]
        self._need_shuffle = need_shuffle
        self._indicator = 0
        if self._need_shuffle:
            self._shuffle_data()
            
    def _shuffle_data(self):
        # [0,1,2,3,4,5] -> [5,3,2,4,0,1]
        p = np.random.permutation(self._num_examples)
        self._data = self._data[p]
        self._labels = self._labels[p]
    
    def next_batch(self, batch_size):
        """return batch_size examples as a batch."""
        end_indicator = self._indicator + batch_size
        if end_indicator > self._num_examples:
            if self._need_shuffle:
                self._shuffle_data()
                self._indicator = 0
                end_indicator = batch_size
            else:
                raise Exception("have no more examples")
        if end_indicator > self._num_examples:
            raise Exception("batch size is larger than all examples")
        batch_data = self._data[self._indicator: end_indicator]
        batch_labels = self._labels[self._indicator: end_indicator]
        self._indicator = end_indicator
        return batch_data, batch_labels

train_filenames = [os.path.join(CIFAR_DIR, 'data_batch_%d' % i) for i in range(1, 6)]
test_filenames = [os.path.join(CIFAR_DIR, 'test_batch')]

train_data = CifarData(train_filenames, True)
test_data = CifarData(test_filenames, False)
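Two details of the class above are easy to verify in isolation: the scaling `data / 127.5 - 1`, which maps pixel values from [0, 255] to [-1, 1], and the shuffle, which applies one permutation to both data and labels so that they stay aligned. The arrays below are made-up stand-ins for the real dataset.

```python
import numpy as np

# Scaling: 0 -> -1, 127.5 -> 0, 255 -> 1
pixels = np.array([0.0, 127.5, 255.0])
scaled = pixels / 127.5 - 1

# Shuffling: index both arrays with the same permutation
data = np.array([[0, 0], [10, 10], [20, 20], [30, 30], [40, 40]])  # row i is 10 * i
labels = np.array([0, 1, 2, 3, 4])                                  # label i belongs to row i

rng = np.random.RandomState(0)          # fixed seed only to make the sketch repeatable
p = rng.permutation(len(data))
shuffled_data, shuffled_labels = data[p], labels[p]

print(scaled)
print(shuffled_data[:, 0] // 10, shuffled_labels)
```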

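Building the model

Building the computational graph

Build x and y, where x is the input data and y is the label; placeholder can be understood as a slot to be filled with data later.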

# (None, 3072)
x = tf.placeholder(tf.float32, [None, 3072])
# (None)
y = tf.placeholder(tf.int64, [None])

Build the hidden layers:

hidden1 = tf.layers.dense(x, 100, activation=tf.nn.relu)
hidden2 = tf.layers.dense(hidden1, 100, activation=tf.nn.relu)
hidden3 = tf.layers.dense(hidden2, 50, activation=tf.nn.relu)
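For intuition, a fully connected layer computes activation(inputs @ W + b). The NumPy sketch below traces the shapes through the three hidden layers. It is not how tf.layers.dense is implemented: the `dense` helper and the small random normal initialization are assumptions made for illustration, and no training takes place.

```python
import numpy as np

def dense(inputs, units, rng, activation=None):
    """Toy fully connected layer: activation(inputs @ W + b) with random W."""
    w = rng.normal(0.0, 0.01, size=(inputs.shape[1], units))
    b = np.zeros(units)
    out = inputs @ w + b
    return activation(out) if activation is not None else out

def relu(z):
    return np.maximum(z, 0.0)

rng = np.random.RandomState(0)
x_batch = rng.uniform(-1, 1, size=(4, 3072))   # 4 fake images scaled to [-1, 1]

h1 = dense(x_batch, 100, rng, relu)   # (4, 3072) -> (4, 100)
h2 = dense(h1, 100, rng, relu)        # (4, 100)  -> (4, 100)
h3 = dense(h2, 50, rng, relu)         # (4, 100)  -> (4, 50)
print(h1.shape, h2.shape, h3.shape)
```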

Build w, b, and y_, where w is the weights, b is the bias, and y_ is the predicted value.

# (3072, 1)
w = tf.get_variable('w', [x.get_shape()[-1], 1],
                   initializer = tf.random_normal_initializer(0,1))
# (1)
b = tf.get_variable('b', [1],
                   initializer = tf.constant_initializer(0.0))
# (None, 3072) * (3072, 1) = (None, 1)
y_ = tf.matmul(x,w) + b

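Written with tf.layers.dense, and extended to the hidden layers and 10 output classes used in the final model, this step becomes: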

y_ = tf.layers.dense(hidden3, 10)

Build the predicted probability (p_y_1) and the loss (squared-error loss).

# Probability that y = 1
# (None, 1)
p_y_1 = tf.nn.sigmoid(y_)

# Compute the loss (squared-error loss)
# (None, 1)
y_reshape = tf.reshape(y, (-1, 1))
y_reshape_float = tf.cast(y_reshape, tf.float32)
loss = tf.reduce_mean(tf.square(y_reshape_float - p_y_1))

For the multi-class model, this step is replaced by the cross-entropy loss:

loss = tf.losses.sparse_softmax_cross_entropy(labels=y, logits=y_)
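What this loss computes can be sketched in NumPy: a softmax over the logits, followed by the negative log-probability of the true class, averaged over the batch. This is an illustrative re-implementation under the assumption of default weights and reduction, not TensorFlow's own code; the sample logits are made up.

```python
import numpy as np

def sparse_softmax_cross_entropy(labels, logits):
    """Mean over the batch of -log(softmax(logits)[true class])."""
    # Subtracting the row maximum leaves the softmax unchanged and avoids overflow
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -np.mean(log_probs[np.arange(len(labels)), labels])

logits = np.array([[2.0, 1.0, 0.1],    # hypothetical scores for 3 classes
                   [0.0, 0.0, 0.0]])   # uniform scores: each class has probability 1/3
labels = np.array([0, 2])              # integer labels, no one-hot encoding needed

loss_value = sparse_softmax_cross_entropy(labels, logits)
print(loss_value)
```

"Sparse" refers to the labels being integer class indices rather than one-hot vectors.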

Build the accuracy:

# Compute the accuracy
# bool
predict = p_y_1 > 0.5
# bool [0,0,1,1,1,0,1,1,1]
correct_prediction = tf.equal(y_reshape_float, tf.cast(predict, tf.float32))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float64))

For the multi-class model, this step becomes:

# indices
predict = tf.argmax(y_, 1)
# [1,0,1,1,1,0,0,0]
correct_prediction = tf.equal(predict, y)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float64))
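The same argmax-based accuracy can be written in NumPy to see what each intermediate value looks like. The logits and labels below are invented for illustration.

```python
import numpy as np

logits = np.array([[0.1, 2.0, 0.3],
                   [1.5, 0.2, 0.1],
                   [0.0, 0.1, 3.0],
                   [0.9, 0.8, 0.7]])
labels = np.array([1, 0, 2, 2])

predictions = np.argmax(logits, axis=1)          # index of the largest score per row
correct = predictions == labels                  # boolean vector, one entry per example
accuracy_value = np.mean(correct.astype(np.float64))
print(predictions, correct, accuracy_value)
```

Three of the four predictions match their labels, so the accuracy is 0.75.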

Set the learning rate and optimize the loss:

# 1e-3 is the initial learning rate
with tf.name_scope('train_op'):
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

Complete code for building the model

# Build the computational graph
# (None, 3072)
x = tf.placeholder(tf.float32, [None, 3072])
# (None)
y = tf.placeholder(tf.int64, [None])

hidden1 = tf.layers.dense(x, 100, activation=tf.nn.relu)
hidden2 = tf.layers.dense(hidden1, 100, activation=tf.nn.relu)
hidden3 = tf.layers.dense(hidden2, 50, activation=tf.nn.relu)

'''# (3072, 1)
w = tf.get_variable('w', [x.get_shape()[-1], 1],
                   initializer = tf.random_normal_initializer(0,1))
# (1)
b = tf.get_variable('b', [1],
                   initializer = tf.constant_initializer(0.0))
# (None, 3072) * (3072, 1) = (None, 1)
y_ = tf.matmul(x,w) + b'''
y_ = tf.layers.dense(hidden3, 10)

'''# Probability that y = 1
# (None, 1)
p_y_1 = tf.nn.sigmoid(y_)

# Compute the loss (squared-error loss)
# (None, 1)
y_reshape = tf.reshape(y, (-1, 1))
y_reshape_float = tf.cast(y_reshape, tf.float32)
loss = tf.reduce_mean(tf.square(y_reshape_float - p_y_1))'''
loss = tf.losses.sparse_softmax_cross_entropy(labels=y, logits=y_)

'''# Compute the accuracy
# bool
predict = p_y_1 > 0.5
# bool [0,0,1,1,1,0,1,1,1]
correct_prediction = tf.equal(y_reshape_float, tf.cast(predict, tf.float32))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float64))'''
# indices
predict = tf.argmax(y_, 1)
# [1,0,1,1,1,0,0,0]
correct_prediction = tf.equal(predict, y)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float64))

'''# Optimize the loss with gradient descent
# 1e-3 is the initial learning rate
# AdadeltaOptimizer is a variant of gradient descent that adapts the learning rate;
# it is applied to the loss and minimizes its value
with tf.name_scope('train_op'):
    train_op = tf.train.AdadeltaOptimizer(1e-3).minimize(loss)'''
with tf.name_scope('train_op'):
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

Initializing and running the model

Complete code

# Initialize the variables
init = tf.global_variables_initializer()
batch_size = 20
train_steps = 100000
test_steps = 100

# run 100k: 50.5%
with tf.Session() as sess:
    sess.run(init)
    for i in range(train_steps):
        batch_data, batch_labels = train_data.next_batch(batch_size)
        loss_val, acc_val, _ = sess.run(
            [loss, accuracy, train_op],
            feed_dict={
                x: batch_data,
                y: batch_labels})
        if (i+1) % 500 == 0:
            print('[Train] Step: %d, loss: %4.5f, acc: %4.5f'
                  % (i+1, loss_val, acc_val))
        if (i+1) % 5000 == 0:
            test_data = CifarData(test_filenames, False)
            all_test_acc_val = []
            for j in range(test_steps):
                test_batch_data, test_batch_labels \
                    = test_data.next_batch(batch_size)
                test_acc_val = sess.run(
                    [accuracy],
                    feed_dict = {
                        x: test_batch_data, 
                        y: test_batch_labels
                    })
                all_test_acc_val.append(test_acc_val)
            test_acc = np.mean(all_test_acc_val)
            print('[Test ] Step: %d, acc: %4.5f'
                  % (i+1, test_acc))

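Notes

1. If you run into errors that make no sense, first check the Python version and the runtime environment. I once ran the code in the wrong environment and spent a long time chasing errors that were not in the code. The following snippet checks both:

# Print the Python version and the path of the running interpreter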
import sys
print(sys.version)
print(sys.executable)

2. In Python 2.x, the dataset is read as follows:

def load_data(filename):
    """read data from data file."""
    with open(filename, 'rb') as f:
        data = cPickle.load(f)
        return data['data'], data['labels']

In Python 3.x, the dataset is read as follows:

def load_data(filename):
    """read data from data file."""
    with open(filename, 'rb') as f:
        data = pickle.load(f, encoding='bytes')
        return data[b'data'], data[b'labels']

(1) In Python 3.x, omitting encoding raises an error:

'ascii' codec can't decode byte 0x8b in position 6: ordinal not in range(128)

(2) In Python 3.x, indexing data[''] without the b prefix raises a KeyError.
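Both errors can be reproduced without the dataset. The byte strings below are hand-written protocol-0 pickles of the kind Python 2 produces for str values; they are made up for illustration and are not taken from the CIFAR-10 files.

```python
import pickle

# What Python 2 would emit for {'data': 1}: the key is a Python 2 str (opcode S)
py2_pickle = b"(dp0\nS'data'\np1\nI1\ns."

as_text = pickle.loads(py2_pickle)                      # keys decoded to str using ASCII
as_bytes = pickle.loads(py2_pickle, encoding='bytes')   # keys kept as bytes objects

print(as_text)    # the key is 'data'
print(as_bytes)   # the key is b'data', so as_bytes['data'] would raise KeyError

# A Python 2 str containing the non-ASCII byte 0x8b cannot be decoded as ASCII
py2_binary = b"S'\\x8b'\np0\n."
try:
    pickle.loads(py2_binary)
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
binary_value = pickle.loads(py2_binary, encoding='bytes')   # succeeds, returns bytes
print(decode_failed, binary_value)
```

The image arrays in the CIFAR-10 files contain bytes outside the ASCII range, which is why the default decoding fails and encoding='bytes' is needed. The price is that every dictionary key becomes a bytes object.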

References

Figures, tutorial, and content: https://coding.imooc.com/class/259.html

API documentation: http://www.tensorfly.cn/tfdoc/api_docs/index.html

 
