Machine Learning Crash Course | Google - Prerequisites and Prework - TensorFlow

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

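TensorFlow Programming Concepts

Learning Objectives:
* Learn the basics of the TensorFlow programming model, focusing on the following concepts:
  * tensors
  * operations
  * graphs
  * sessions
* Build a simple TensorFlow program that creates a default graph, and a session that runs the graph

Note: Please read through this tutorial carefully. The TensorFlow programming model is probably different from others that you have encountered, and thus may not be as intuitive as you'd expect.

Overview of Concepts

TensorFlow gets its name from tensors, which are arrays of arbitrary dimensionality. Using TensorFlow, you can manipulate tensors with a very high number of dimensions. That said, most of the time you will work with one or more of the following low-dimensional tensors:

  • A scalar is a 0-d array (a 0th-order tensor). For example, 'Howdy' or 5
  • A vector is a 1-d array (a 1st-order tensor). For example, [2, 3, 5, 7, 11] or [5]
  • A matrix is a 2-d array (a 2nd-order tensor). For example, [[3.1, 8.2, 5.9], [4.3, -2.7, 6.5]]

TensorFlow operations create, destroy, and manipulate tensors. Most of the lines of code in a typical TensorFlow program are operations.

A TensorFlow graph (also known as a computational graph or a dataflow graph) is a graph data structure. Many TensorFlow programs consist of a single graph, but TensorFlow programs may optionally create multiple graphs. A graph's nodes are operations; a graph's edges are tensors. Tensors flow through the graph, manipulated at each node by an operation. The output tensor of one operation often becomes the input tensor to a subsequent operation. TensorFlow implements a lazy execution model, meaning that a node is computed only when needed, based on the needs of the nodes that depend on it.

Tensors can be stored in the graph as constants or variables. As you might guess, constants hold tensors whose values can't change, while variables hold tensors whose values can change. However, what you may not have guessed is that constants and variables are just more operations in the graph. A constant is an operation that always returns the same tensor value. A variable is an operation that will return whichever tensor has been assigned to it.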
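
To make the graph idea concrete, here is a minimal sketch of a dataflow graph in plain Python. It is not TensorFlow and does not use its API: the names Node, constant, add, and run are hypothetical and exist only for this illustration. The sketch shows that building the graph computes nothing, and that evaluating one node runs only the nodes it depends on.

```python
# Toy dataflow graph: NOT TensorFlow, only an illustration of lazy execution.
# All names here (Node, constant, add, run) are hypothetical.

class Node:
    """One operation in the graph; its output plays the role of a tensor."""

    def __init__(self, name, fn, inputs=()):
        self.name = name
        self.fn = fn                 # computes this node's output from its inputs
        self.inputs = list(inputs)   # upstream nodes (the incoming edges)


def constant(value, name):
    # A constant is an operation that always returns the same value.
    return Node(name, lambda: value)


def add(a, b, name):
    # An operation whose inputs are the outputs of two other operations.
    return Node(name, lambda x, y: x + y, inputs=[a, b])


def run(node, evaluated):
    """Evaluate `node`, computing only the nodes it depends on."""
    args = [run(dep, evaluated) for dep in node.inputs]
    evaluated.append(node.name)      # record which nodes actually ran
    return node.fn(*args)


# Step 1: assemble the graph. Nothing is computed yet.
x = constant(8, "x_const")
y = constant(5, "y_const")
unused = constant(99, "unused_const")
total = add(x, y, "x_y_sum")

# Step 2: evaluate. Only the nodes that `total` needs are executed.
evaluated = []
result = run(total, evaluated)
print(result)      # 13
print(evaluated)   # ['x_const', 'y_const', 'x_y_sum']
```

Note that unused_const never appears in the evaluation log, which is the behavior the lazy execution model describes.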

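To define a constant, use the tf.constant operator and pass in its value. For example:

  x = tf.constant([5.2])

Similarly, you can create a variable like this:

  y = tf.Variable([5])

Or you can create the variable first and then subsequently assign a value like this (note that you always have to specify a default value):

  y = tf.Variable([0])
  y = y.assign([5])

Once you've defined some constants or variables, you can combine them with other operations like tf.add. When you evaluate the tf.add operation, it will call your tf.constant or tf.Variable operations to get their values and then return a new tensor with the sum of those values.

Graphs must run within a TensorFlow session, which holds the state for the graph(s) it runs:

  with tf.Session() as sess:
    initialization = tf.global_variables_initializer()
    sess.run(initialization)
    print(y.eval())

When working with tf.Variables, you must explicitly initialize them by calling tf.global_variables_initializer at the start of your session and running the resulting operation, as shown above.

Note: A session can distribute graph execution across multiple machines (assuming the program is run on some distributed computation framework). For more information, see Distributed TensorFlow.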

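Summary

TensorFlow programming is essentially a two-step process:

  1. Assemble constants, variables, and operations into a graph.
  2. Evaluate those constants, variables, and operations within a session.

Creating a Simple TensorFlow Program

Let's look at how to code a simple TensorFlow program that adds two constants.

Provide import statements

As with nearly all Python programs, you'll begin by specifying some import statements.
The set of import statements required to run a TensorFlow program depends, of course, on the features your program will access. At a minimum, you must provide the import tensorflow statement in all TensorFlow programs: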

import tensorflow as tf

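Don't forget to execute the preceding code block (the import statements).

Other common import statements include the following:

import matplotlib.pyplot as plt # Dataset visualization.
import numpy as np              # Low-level numerical Python library.
import pandas as pd             # Higher-level numerical Python library.

TensorFlow provides a default graph. However, we recommend explicitly creating your own Graph instead to facilitate tracking state (e.g., you may wish to work with a different Graph in each cell).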

import tensorflow as tf

# Create a graph.
g = tf.Graph()

# Establish the graph as the "default" graph.
with g.as_default():
  # Assemble a graph consisting of the following three operations:
  #   * Two tf.constant operations to create the operands.
  #   * One tf.add operation to add the two operands.
  x = tf.constant(8, name="x_const")
  y = tf.constant(5, name="y_const")
  sum = tf.add(x, y, name="x_y_sum")


  # Now create a session.
  # The session will run the default graph.
  with tf.Session() as sess:
    print(sum.eval())
13

import tensorflow as tf

g = tf.Graph()

with g.as_default():
  x = tf.constant(100, name='x_const')
  y = tf.constant(566, name='y_const')
  sum = tf.add(x,y, name='x_y_sum')

  with tf.Session() as sess:
    print(sum.eval())
666

In TensorFlow, what is the difference between Session.run() and Tensor.eval()?

If you have a Tensor t, calling t.eval() is equivalent to calling tf.get_default_session().run(t).

You can make a session the default as follows:

import tensorflow as tf

t = tf.constant(42.0)
sess = tf.Session()
with sess.as_default():   # or `with sess:` to close on exit
    assert sess is tf.get_default_session()
    assert t.eval() == sess.run(t)
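
The relationship between eval() and run() can be sketched in plain Python. This is only an illustration of the delegation pattern, not TensorFlow's actual implementation: ToySession, ToyTensor, and the module-level stack are hypothetical names invented for the example.

```python
import contextlib

# Hypothetical stand-in for the bookkeeping of "the current default session".
_default_session_stack = []


def get_default_session():
    """Return the innermost default session, or None if there is none."""
    return _default_session_stack[-1] if _default_session_stack else None


class ToySession:
    def run(self, tensor):
        return tensor.compute()

    @contextlib.contextmanager
    def as_default(self):
        # Register this session as the default for the duration of the block.
        _default_session_stack.append(self)
        try:
            yield self
        finally:
            _default_session_stack.pop()


class ToyTensor:
    def __init__(self, value):
        self._value = value

    def compute(self):
        return self._value

    def eval(self):
        # eval() is shorthand for "run me in the default session".
        session = get_default_session()
        if session is None:
            raise ValueError("no default session is registered")
        return session.run(self)


t = ToyTensor(42.0)
sess = ToySession()
with sess.as_default():
    inside = t.eval()        # delegates to sess.run(t)

try:
    t.eval()                 # outside the block there is no default session
    raised = False
except ValueError:
    raised = True

print(inside, raised)        # 42.0 True
```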

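Exercise: Introduce a Third Operand

Revise the above code listing to add three integers, instead of two:

  1. Define a third scalar integer constant, z, and assign it a value of 4.
  2. Add z to sum to yield a new sum.

    Hint: See the API docs for tf.add() for more details on its function signature.

  3. Re-run the modified code block. Did the program generate the correct grand total?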

import tensorflow as tf

# Create a graph.
g = tf.Graph()

# Establish the graph as the "default" graph.
with g.as_default():
  # Assemble a graph consisting of the following three operations:
  #   * Two tf.constant operations to create the operands.
  #   * One tf.add operation to add the two operands.
  x = tf.constant(8, name="x_const")
  y = tf.constant(5, name="y_const")
  sum = tf.add(x, y, name="x_y_sum")
  z = tf.constant(4, name='z_const')


  # Now create a session.
  # The session will run the default graph.
  with tf.Session() as sess:
    print(sess.run(tf.add(sum, z)))
17

Solution

# Create a graph.
g = tf.Graph()

# Establish our graph as the "default" graph.
with g.as_default():
  # Assemble a graph consisting of three operations. 
  # (Creating a tensor is an operation.)
  x = tf.constant(8, name="x_const")
  y = tf.constant(5, name="y_const")
  sum = tf.add(x, y, name="x_y_sum")

  # Task 1: Define a third scalar integer constant z.
  z = tf.constant(4, name="z_const")
  # Task 2: Add z to `sum` to yield a new sum.
  new_sum = tf.add(sum, z, name="x_y_z_sum")

  # Now create a session.
  # The session will run the default graph.
  with tf.Session() as sess:
    # Task 3: Ensure the program yields the correct grand total.
    print(new_sum.eval())
17

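Further Information

To explore basic TensorFlow graphs further, experiment with the following tutorial:

Creating and Manipulating Tensors

Learning Objectives:
* Initialize and assign TensorFlow Variables
* Create and manipulate tensors
* Refresh your memory about addition and multiplication in linear algebra (consult an introduction to matrix addition and multiplication if these topics are new to you)
* Familiarize yourself with basic TensorFlow math and array operations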

import tensorflow as tf

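Vector Addition

You can perform many typical mathematical operations on tensors (TF API). The following code creates and manipulates two vectors (1-D tensors), each having exactly six elements: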

with tf.Graph().as_default():
  # Create a six-element vector (1-D tensor).
  primes = tf.constant([2, 3, 5, 7, 11, 13], dtype=tf.int32)

  # Create another six-element vector. Each element in the vector will be
  # initialized to 1. The first argument is the shape of the tensor (more
  # on shapes below).
  ones = tf.ones([6], dtype=tf.int32)

  # Add the two vectors. The resulting tensor is a six-element vector.
  just_beyond_primes = tf.add(primes, ones)

  # Create a session to run the default graph.
  with tf.Session() as sess:
    print(just_beyond_primes.eval())
[ 3  4  6  8 12 14]

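Tensor Shapes

Shapes are used to characterize the size and number of dimensions of a tensor. The shape of a tensor is expressed as a list, with the ith element representing the size along dimension i. The length of the list then indicates the rank of the tensor (i.e., the number of dimensions).

For more information, see the TensorFlow documentation.

A few basic examples: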

with tf.Graph().as_default():
  # A scalar (0-D tensor).
  scalar = tf.zeros([])

  # A vector with 3 elements.
  vector = tf.zeros([3])

  # A matrix with 2 rows and 3 columns.
  matrix = tf.zeros([2, 3])

  with tf.Session() as sess:
    print('scalar has shape {} and value:\n{}'.format(scalar.get_shape(), scalar.eval()))
    print('vector has shape {} and value:\n{}'.format(vector.get_shape(), vector.eval()))
    print('matrix has shape {} and value:\n{}'.format(matrix.get_shape(), matrix.eval()))
scalar has shape () and value:
0.0
vector has shape (3,) and value:
[0. 0. 0.]
matrix has shape (2, 3) and value:
[[0. 0. 0.]
 [0. 0. 0.]]
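
The rule that a shape is a list of sizes whose length is the rank can be checked without TensorFlow. The following is a minimal sketch on plain nested Python lists; shape_of is a hypothetical helper written for this illustration, not a TensorFlow function.

```python
def shape_of(value):
    """Return the shape of a rectangular nested list as a tuple."""
    if not isinstance(value, list):
        return ()                          # a scalar has the empty shape
    if not value:
        return (0,)
    inner = shape_of(value[0])
    for item in value[1:]:
        if shape_of(item) != inner:
            raise ValueError("ragged nesting has no well-defined shape")
    # Element i of the result is the size along dimension i.
    return (len(value),) + inner


scalar = 0.0
vector = [0.0, 0.0, 0.0]
matrix = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

for name, value in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    shape = shape_of(value)
    # The rank is simply the length of the shape.
    print(name, "shape:", shape, "rank:", len(shape))
# scalar shape: () rank: 0
# vector shape: (3,) rank: 1
# matrix shape: (2, 3) rank: 2
```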

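Broadcasting

In mathematics, you can only perform element-wise operations (e.g., add and equals) on tensors of the same shape. In TensorFlow, however, you may perform operations on tensors that would traditionally have been incompatible. TensorFlow supports broadcasting (a concept borrowed from NumPy), where the smaller array in an element-wise operation is enlarged to have the same shape as the larger array. For example, via broadcasting:

  • If an operation requires a size [6] tensor, a size [1] or a size [] tensor can serve as an operand.
  • If an operation requires a size [4, 6] tensor, any of the following sizes can serve as an operand:
    • [1, 6]
    • [6]
    • []
  • If an operation requires a size [3, 5, 6] tensor, any of the following sizes can serve as an operand:

    • [1, 5, 6]
    • [3, 1, 6]
    • [3, 5, 1]
    • [1, 1, 1]
    • [5, 6]
    • [1, 6]
    • [6]
    • [1]
    • []

Note: When a tensor is broadcast, its entries are conceptually copied. (They are not actually copied for performance reasons. Broadcasting was invented as a performance optimization.)

The full broadcasting ruleset is well described in the easy-to-read NumPy broadcasting documentation.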
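
The shape lists above follow one rule: align the two shapes at their trailing dimensions, and require each aligned pair of sizes to be equal or to contain a 1. Here is a minimal sketch of that check in plain Python. The function broadcast_shape is a hypothetical helper for illustration; it is not part of TensorFlow or NumPy.

```python
def broadcast_shape(a, b):
    """Return the shape produced by broadcasting shapes a and b.

    Raises ValueError if the two shapes are not broadcast-compatible.
    """
    result = []
    # Walk both shapes from the last dimension backwards.
    # A missing leading dimension is treated as size 1.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            raise ValueError(
                "shapes {} and {} are not broadcast-compatible".format(a, b))
        result.append(max(da, db))
    return result[::-1]


# Every shape listed for the [3, 5, 6] case broadcasts to [3, 5, 6].
target = [3, 5, 6]
candidates = [[1, 5, 6], [3, 1, 6], [3, 5, 1], [1, 1, 1],
              [5, 6], [1, 6], [6], [1], []]
for shape in candidates:
    assert broadcast_shape(target, shape) == target

# A shape such as [5] is rejected: its last dimension (5) does not match 6.
try:
    broadcast_shape(target, [5])
    rejected = False
except ValueError:
    rejected = True
print(rejected)   # True
```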

The following code performs the same tensor addition as before, but using broadcasting:

with tf.Graph().as_default():
  # Create a six-element vector (1-D tensor).
  primes = tf.constant([2, 3, 5, 7, 11, 13], dtype=tf.int32)

  # Create a constant scalar with value 1.
  ones = tf.constant(1, dtype=tf.int32)

  # Add the two tensors. The resulting tensor is a six-element vector.
  just_beyond_primes = tf.add(primes, ones)

  with tf.Session() as sess:
    print(just_beyond_primes.eval())
[ 3  4  6  8 12 14]

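Matrix Multiplication

In linear algebra, when multiplying two matrices, the number of columns of the first matrix must equal the number of rows in the second matrix.

  • It is valid to multiply a 3x4 matrix by a 4x2 matrix. This will result in a 3x2 matrix.
  • It is invalid to multiply a 4x2 matrix by a 3x4 matrix.

with tf.Graph().as_default():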
  # Create a matrix (2-d tensor) with 3 rows and 4 columns.
  x = tf.constant([[5, 2, 4, 3], [5, 1, 6, -2], [-1, 3, -1, -2]],
                  dtype=tf.int32)

  # Create a matrix with 4 rows and 2 columns.
  y = tf.constant([[2, 2], [3, 5], [4, 5], [1, 6]], dtype=tf.int32)

  # Multiply `x` by `y`. 
  # The resulting matrix will have 3 rows and 2 columns.
  matrix_multiply_result = tf.matmul(x, y)

  with tf.Session() as sess:
    print(matrix_multiply_result.eval())
[[35 58]
 [35 33]
 [ 1 -4]]
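
To see where those numbers come from, the same product can be computed by hand in plain Python. This is a sketch of the textbook definition, not of how tf.matmul is implemented; the function matmul below is a hypothetical helper that also enforces the columns-equal-rows rule.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    rows, inner, cols = len(x), len(x[0]), len(y[0])
    if inner != len(y):
        raise ValueError(
            "cannot multiply a {}x{} matrix by a {}x{} matrix".format(
                rows, inner, len(y), cols))
    # Entry (i, j) is the dot product of row i of x and column j of y.
    return [[sum(x[i][k] * y[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(rows)]


x = [[5, 2, 4, 3], [5, 1, 6, -2], [-1, 3, -1, -2]]   # 3x4
y = [[2, 2], [3, 5], [4, 5], [1, 6]]                 # 4x2

print(matmul(x, y))   # [[35, 58], [35, 33], [1, -4]]

# Swapping the operands (4x2 times 3x4) violates the shape rule.
try:
    matmul(y, x)
    swapped_ok = True
except ValueError:
    swapped_ok = False
print(swapped_ok)     # False
```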

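Tensor Reshaping

With tensor addition and matrix multiplication each imposing constraints on operands, TensorFlow programmers must frequently reshape tensors.

You can use the tf.reshape method to reshape a tensor.
For example, you can reshape an 8x2 tensor into a 2x8 tensor or a 4x4 tensor: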

with tf.Graph().as_default():
  # Create an 8x2 matrix (2-D tensor).
  matrix = tf.constant([[1,2], [3,4], [5,6], [7,8],
                        [9,10], [11,12], [13, 14], [15,16]], dtype=tf.int32)

  # Reshape the 8x2 matrix into a 2x8 matrix.
  reshaped_2x8_matrix = tf.reshape(matrix, [2,8])

  # Reshape the 8x2 matrix into a 4x4 matrix
  reshaped_4x4_matrix = tf.reshape(matrix, [4,4])

  with tf.Session() as sess:
    print("Original matrix (8x2):")
    print(matrix.eval())
    print("Reshaped matrix (2x8):")
    print(reshaped_2x8_matrix.eval())
    print("Reshaped matrix (4x4):")
    print(reshaped_4x4_matrix.eval())
Original matrix (8x2):
[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]
 [11 12]
 [13 14]
 [15 16]]
Reshaped matrix (2x8):
[[ 1  2  3  4  5  6  7  8]
 [ 9 10 11 12 13 14 15 16]]
Reshaped matrix (4x4):
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]
 [13 14 15 16]]

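You can also use tf.reshape to change the number of dimensions (the "rank") of the tensor.
For example, you could reshape that 8x2 tensor into a 3-D 2x2x4 tensor or a 1-D 16-element tensor.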

with tf.Graph().as_default():
  # Create an 8x2 matrix (2-D tensor).
  matrix = tf.constant([[1,2], [3,4], [5,6], [7,8],
                        [9,10], [11,12], [13, 14], [15,16]], dtype=tf.int32)

  # Reshape the 8x2 matrix into a 3-D 2x2x4 tensor.
  reshaped_2x2x4_tensor = tf.reshape(matrix, [2,2,4])

  # Reshape the 8x2 matrix into a 1-D 16-element tensor.
  one_dimensional_vector = tf.reshape(matrix, [16])

  with tf.Session() as sess:
    print("Original matrix (8x2):")
    print(matrix.eval())
    print("Reshaped 3-D tensor (2x2x4):")
    print(reshaped_2x2x4_tensor.eval())
    print("1-D vector:")
    print(one_dimensional_vector.eval())
Original matrix (8x2):
[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]
 [11 12]
 [13 14]
 [15 16]]
Reshaped 3-D tensor (2x2x4):
[[[ 1  2  3  4]
  [ 5  6  7  8]]

 [[ 9 10 11 12]
  [13 14 15 16]]]
1-D vector:
[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16]
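
The outputs above suggest what reshaping does: the elements keep their row-major order and only the grouping changes. Here is a minimal sketch of that idea on nested Python lists. The helpers flatten and reshape are hypothetical and written for this illustration; they only assume, as the outputs above show, that elements are read and regrouped in row-major order.

```python
def flatten(value):
    """Return all elements of a nested list in row-major order."""
    if not isinstance(value, list):
        return [value]
    flat = []
    for item in value:
        flat.extend(flatten(item))
    return flat


def reshape(value, shape):
    """Regroup the elements of `value` into a non-empty `shape`."""
    flat = flatten(value)
    expected = 1
    for dim in shape:
        expected *= dim
    if expected != len(flat):
        raise ValueError(
            "cannot reshape {} elements into shape {}".format(len(flat), shape))
    # Group from the innermost dimension outward; the first dimension
    # is implied by whatever remains.
    for dim in reversed(shape[1:]):
        flat = [flat[i:i + dim] for i in range(0, len(flat), dim)]
    return flat


matrix = [[2 * i + 1, 2 * i + 2] for i in range(8)]   # the 8x2 matrix above

print(reshape(matrix, [2, 8]))
# [[1, 2, 3, 4, 5, 6, 7, 8], [9, 10, 11, 12, 13, 14, 15, 16]]
print(reshape(matrix, [2, 2, 4]))
# [[[1, 2, 3, 4], [5, 6, 7, 8]], [[9, 10, 11, 12], [13, 14, 15, 16]]]
```

A shape whose sizes do not multiply to 16, such as [3, 5], raises ValueError, because the number of elements must be preserved.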

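Exercise 1: Reshape two tensors in order to multiply them.

The following two vectors are incompatible for matrix multiplication:

  • a = tf.constant([5, 3, 2, 7, 1, 4])
  • b = tf.constant([4, 6, 3])

Reshape these vectors into compatible operands for matrix multiplication.
Then, invoke a matrix multiplication operation on the reshaped tensors.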

# Write your code for Task 1 here.
import tensorflow as tf

a = tf.constant([5, 3, 2, 7, 1, 4])
b = tf.constant([4, 6, 3])

# First approach: [2, 3] x [3, 1] -> [2, 1]
a_reshape1 = tf.reshape(a, [2,3])
b_reshape1 = tf.reshape(b, [3,1])

# Second approach: [6, 1] x [1, 3] -> [6, 3]
a_reshape2 = tf.reshape(a, [6,1])
b_reshape2 = tf.reshape(b, [1,3])

c = tf.matmul(a_reshape1, b_reshape1)
d = tf.matmul(a_reshape2, b_reshape2)

with tf.Session() as sess:
    print('c.eval():\n{}'.format(c.eval()))
    print('d.eval():\n{}'.format(d.eval()))
c.eval():
[[44]
 [46]]
d.eval():
[[20 30 15]
 [12 18  9]
 [ 8 12  6]
 [28 42 21]
 [ 4  6  3]
 [16 24 12]]

Solution

with tf.Graph().as_default(), tf.Session() as sess:
  # Task: Reshape two tensors in order to multiply them

  # Here are the original operands, which are incompatible
  # for matrix multiplication:
  a = tf.constant([5, 3, 2, 7, 1, 4])
  b = tf.constant([4, 6, 3])
  # We need to reshape at least one of these operands so that
  # the number of columns in the first operand equals the number
  # of rows in the second operand.

  # Reshape vector "a" into a 2-D 2x3 matrix:
  reshaped_a = tf.reshape(a, [2,3])

  # Reshape vector "b" into a 2-D 3x1 matrix:
  reshaped_b = tf.reshape(b, [3,1])

  # The number of columns in the first matrix now equals
  # the number of rows in the second matrix. Therefore, you
  # can matrix multiply the two operands.
  c = tf.matmul(reshaped_a, reshaped_b)
  print(c.eval())

  # An alternate approach: [6,1] x [1, 3] -> [6,3]
[[44]
 [46]]

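Variables, Initialization and Assignment

So far, all the operations we performed were on static values (tf.constant); calling eval() always returned the same result. TensorFlow allows you to define Variable objects, whose values can be changed.

When creating a variable, you can set an initial value explicitly, or you can use an initializer (like a distribution):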

g = tf.Graph()
with g.as_default():
  # Create a variable with the initial value 3.
  v = tf.Variable([3])

  # Create a variable of shape [1], with a random initial value,
  # sampled from a normal distribution with mean 1 and standard deviation 0.35.
  w = tf.Variable(tf.random_normal([1], mean=1.0, stddev=0.35))

One peculiarity of TensorFlow is that variable initialization is not automatic. For example, the following block will cause an error:

with g.as_default():
  with tf.Session() as sess:
    try:
      v.eval()
    except tf.errors.FailedPreconditionError as e:
      print("Caught expected error: ", e)
Caught expected error:  Attempting to use uninitialized value Variable
     [[Node: _retval_Variable_0_0 = _Retval[T=DT_INT32, index=0, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Variable)]]

The easiest way to initialize a variable is to call global_variables_initializer. Note the use of Session.run(), which is roughly equivalent to eval().

with g.as_default():
  with tf.Session() as sess:
    initialization = tf.global_variables_initializer()
    sess.run(initialization)
    # Now, variables can be accessed normally, and have values assigned to them.
    print(v.eval())
    print(w.eval())
[3]
[0.43437064]

with g.as_default():
  with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)
    print(v.eval())
    print(w.eval())
[3]
[1.1014321]

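Once initialized, variables will maintain their value within the same session (however, when starting a new session, you will need to re-initialize them):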

with g.as_default():
  with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # These three prints will print the same value.
    print(w.eval())
    print(w.eval())
    print(w.eval())
[0.78335255]
[0.78335255]
[0.78335255]

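To change the value of a variable, use the assign op. Note that simply creating the assign op will not have any effect. As with initialization, you have to run the assignment op to update the variable value: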

with g.as_default():
  with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # This should print the variable's initial value.
    print(v.eval())

    assignment = tf.assign(v, [7])
    # The variable has not been changed yet!
    print(v.eval())

    # Execute the assignment op.
    sess.run(assignment)
    # Now the variable is updated.
    print(v.eval())
[3]
[3]
[7]

with g.as_default():
  with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(v.eval())

    assign = tf.assign(v, [888])
    sess.run(assign)
    print(v.eval())
[3]
[888]
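
The three behaviors shown in this section (no automatic initialization, state that lives in the session, and assignments that do nothing until they are run) can be modeled in a few lines of plain Python. This is a toy model for intuition only, not TensorFlow's implementation; ToyVariable, ToyAssign, ToySession, and UninitializedError are all hypothetical names.

```python
class UninitializedError(Exception):
    """Hypothetical stand-in for tf.errors.FailedPreconditionError."""


class ToyVariable:
    def __init__(self, initial_value):
        self.initial_value = initial_value   # stored, but not yet "live"


class ToyAssign:
    """Describes an assignment; does nothing until a session runs it."""

    def __init__(self, variable, value):
        self.variable = variable
        self.value = value


class ToySession:
    def __init__(self):
        self._state = {}   # variable values belong to the session

    def initialize(self, variables):
        for variable in variables:
            self._state[variable] = variable.initial_value

    def run(self, op):
        if isinstance(op, ToyAssign):
            self._state[op.variable] = op.value
            return op.value
        if op not in self._state:
            raise UninitializedError("attempting to use uninitialized value")
        return self._state[op]


v = ToyVariable([3])
first = ToySession()

try:
    first.run(v)                    # not initialized yet
    failed = False
except UninitializedError:
    failed = True

first.initialize([v])
before = first.run(v)               # [3]
assignment = ToyAssign(v, [7])      # creating the op changes nothing
unchanged = first.run(v)            # still [3]
first.run(assignment)               # running the op updates the value
after = first.run(v)                # [7]

second = ToySession()               # a new session starts from scratch
second.initialize([v])
fresh = second.run(v)               # [3], not [7]

print(failed, before, unchanged, after, fresh)
# True [3] [3] [7] [3]
```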

There are many more topics about variables that we didn't cover here, such as loading and storing. To learn more, see the TensorFlow docs.

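Exercise 2: Simulate 10 rolls of two dice.

Create a dice simulation, which generates a 10x3 2-D tensor in which:

  • Columns 1 and 2 each hold one throw of one die.
  • Column 3 holds the sum of Columns 1 and 2 on the same row.

For example, the first row might have the following values:

  • Column 1 holds 4
  • Column 2 holds 3
  • Column 3 holds 7

You'll need to explore the TensorFlow documentation to solve this task.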

# Write your code for Task 2 here.

import tensorflow as tf 

with tf.Graph().as_default():
  with tf.Session() as sess:

    d1 = tf.Variable(tf.random_uniform([10, 1],minval=1, maxval=7,dtype=tf.int32))
    d2 = tf.Variable(tf.random_uniform([10, 1], minval=1, maxval=7,dtype=tf.int32))
    d3 = tf.add(d1, d2)

    result = tf.concat(values=[d1, d2, d3], axis=1)      
    sess.run(tf.global_variables_initializer())
    print(result.eval())
[[ 3  5  8]
 [ 3  6  9]
 [ 6  5 11]
 [ 4  2  6]
 [ 3  5  8]
 [ 4  2  6]
 [ 4  6 10]
 [ 1  6  7]
 [ 2  3  5]
 [ 2  4  6]]

Solution

with tf.Graph().as_default(), tf.Session() as sess:
  # Task 2: Simulate 10 throws of two dice. Store the results
  # in a 10x3 matrix.

  # We're going to place dice throws inside two separate
  # 10x1 matrices. We could have placed dice throws inside
  # a single 10x2 matrix, but adding different columns of
  # the same matrix is tricky. We also could have placed
  # dice throws inside two 1-D tensors (vectors); doing so
  # would require transposing the result.
  dice1 = tf.Variable(tf.random_uniform([10, 1],
                                        minval=1, maxval=7,
                                        dtype=tf.int32))
  dice2 = tf.Variable(tf.random_uniform([10, 1],
                                        minval=1, maxval=7,
                                        dtype=tf.int32))

  # We may add dice1 and dice2 since they share the same shape
  # and size.
  dice_sum = tf.add(dice1, dice2)

  # We've got three separate 10x1 matrices. To produce a single
  # 10x3 matrix, we'll concatenate them along dimension 1.
  resulting_matrix = tf.concat(
      values=[dice1, dice2, dice_sum], axis=1)

  # The variables haven't been initialized within the graph yet,
  # so let's remedy that.
  sess.run(tf.global_variables_initializer())

  print(resulting_matrix.eval())
[[ 6  5 11]
 [ 6  2  8]
 [ 5  6 11]
 [ 2  3  5]
 [ 4  3  7]
 [ 3  4  7]
 [ 1  1  2]
 [ 5  4  9]
 [ 5  4  9]
 [ 5  5 10]]
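
As a cross-check on the expected structure of the result, the same simulation can be sketched with only the Python standard library. This is not the exercise's intended TensorFlow solution; roll_dice is a hypothetical helper. One difference is worth noting: random.randint(1, 6) includes both endpoints, whereas the tf.random_uniform calls above pass maxval=7 because that upper bound is exclusive.

```python
import random


def roll_dice(num_throws, rng):
    """Return num_throws rows of [die1, die2, die1 + die2]."""
    rows = []
    for _ in range(num_throws):
        die1 = rng.randint(1, 6)   # both endpoints are included
        die2 = rng.randint(1, 6)
        rows.append([die1, die2, die1 + die2])
    return rows


# A seeded generator makes the run repeatable.
table = roll_dice(10, random.Random(0))
for row in table:
    print(row)
```

Every row satisfies the exercise's constraints: the first two columns lie between 1 and 6, and the third column is their sum.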