Copyright 2017 Google LLC.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
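TensorFlow Programming Concepts
Learning objectives:
* Learn the basics of the TensorFlow programming model, focusing on the following concepts:
  * tensors
  * operations
  * graphs
  * sessions
* Build a simple TensorFlow program that creates a default graph and a session that runs the graph
Note: Please read this tutorial carefully. The TensorFlow programming model is probably different from others you have encountered, so it may not be as intuitive as you expect.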
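Concept Overview
TensorFlow gets its name from tensors, which are arrays of arbitrary dimensionality. With TensorFlow, you can manipulate tensors that have a very large number of dimensions. Even so, most of the time you will work with one or more of the following low-dimensional tensors:
- A scalar is a 0-d array (a 0th-order tensor). For example, "Howdy" or 5
- A vector is a 1-d array (a 1st-order tensor). For example, [2, 3, 5, 7, 11] or [5]
- A matrix is a 2-d array (a 2nd-order tensor). For example, [[3.1, 8.2, 5.9], [4.3, -2.7, 6.5]]
TensorFlow operations create, destroy, and manipulate tensors. Most of the lines of code in a typical TensorFlow program are operations.
A TensorFlow graph (also known as a computational graph or a dataflow graph) is a graph data structure. Many TensorFlow programs consist of a single graph, but a program may optionally create multiple graphs. A graph's nodes are operations; a graph's edges are tensors. Tensors flow through the graph and are manipulated at each node by an operation. The output tensor of one operation often becomes the input tensor of a subsequent operation. TensorFlow implements a lazy execution model, meaning that a node is computed only when needed, based on the needs of the nodes that depend on it.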
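The lazy execution model can be illustrated without TensorFlow. The following is a minimal sketch in plain Python, not TensorFlow's implementation: the names Node, constant, add, and run are hypothetical and exist only for this illustration. Building a node computes nothing; asking for a result evaluates only the nodes that result depends on.

```python
# A toy dataflow graph in plain Python. This is NOT the TensorFlow API;
# Node, constant, add, and run are hypothetical names for illustration.
class Node:
  """A graph node: an operation plus the nodes that feed it."""

  def __init__(self, name, fn, inputs=()):
    self.name = name
    self.fn = fn
    self.inputs = list(inputs)


def constant(name, value):
  # A constant is an operation that always returns the same value.
  return Node(name, lambda: value)


def add(name, a, b):
  # An add node takes the outputs of two other nodes as its inputs.
  return Node(name, lambda x, y: x + y, inputs=[a, b])


def run(node, trace):
  """Evaluates `node`, computing only the nodes it depends on."""
  args = [run(dep, trace) for dep in node.inputs]
  trace.append(node.name)
  return node.fn(*args)


x = constant("x_const", 8)
y = constant("y_const", 5)
unused = constant("unused", 99)  # Defined, but nothing requested depends on it.
total = add("x_y_sum", x, y)     # Building the node computes nothing yet.

trace = []
result = run(total, trace)
print(result)
print(trace)  # Lists only the nodes that were actually evaluated.
```

In this sketch the node named "unused" never appears in the trace, because the requested result does not depend on it.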
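Tensors can be stored in the graph as constants or variables. As you might guess, constants hold tensors whose values cannot change, while variables hold tensors whose values can change. What you may not have guessed is that constants and variables are themselves just operations in the graph. A constant is an operation that always returns the same tensor value. A variable is an operation that returns whichever tensor has been assigned to it.
To define a constant, use the tf.constant operation and pass in its value. For example:
x = tf.constant([5.2])
Similarly, you can create a variable like this:
y = tf.Variable([5])
Or you can create the variable first and then assign a value as shown below (note that you always have to specify a default value):
y = tf.Variable([0])
y = y.assign([5])
Once you have defined some constants or variables, you can combine them with other operations such as tf.add. When you evaluate the tf.add operation, it calls your tf.constant or tf.Variable operations to get their values and then returns a new tensor containing the sum of those values.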
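Graphs must run within a TensorFlow session, which holds the state for the graph(s) it runs:
with tf.Session() as sess:
  initialization = tf.global_variables_initializer()
  sess.run(initialization)
  print(y.eval())
When working with tf.Variable, you must explicitly initialize the variables by calling tf.global_variables_initializer at the start of your session, as shown above.
Note: A session can distribute graph execution across multiple machines (assuming the program runs on some distributed computation framework). For more information, see Distributed TensorFlow.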
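Summary
TensorFlow programming is essentially a two-step process:
- Assemble constants, variables, and operations into a graph.
- Evaluate those constants, variables, and operations within a session.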
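Creating a Simple TensorFlow Program
Let's look at how to write a simple TensorFlow program that adds two constants.
Provide import statements
As with nearly all Python programs, you begin by specifying some import statements.
The set of import statements required to run a TensorFlow program depends, of course, on the features your program will access. At a minimum, you must provide the import tensorflow statement in all TensorFlow programs: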
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
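Don't forget to execute the preceding code block (the import statements).
Other common import statements include the following:
import matplotlib.pyplot as plt # Dataset visualization.
import numpy as np # Low-level numerical Python library.
import pandas as pd # Higher-level numerical Python library.
TensorFlow provides a default graph. However, we recommend explicitly creating your own Graph instead, to make tracking state easier (for example, you may wish to work with a different Graph in each cell).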
import tensorflow as tf
# Create a graph.
g = tf.Graph()
# Establish the graph as the "default" graph.
with g.as_default():
  # Assemble a graph consisting of the following three operations:
  # * Two tf.constant operations to create the operands.
  # * One tf.add operation to add the two operands.
  x = tf.constant(8, name="x_const")
  y = tf.constant(5, name="y_const")
  sum = tf.add(x, y, name="x_y_sum")
  # Now create a session.
  # The session will run the default graph.
  with tf.Session() as sess:
    print(sum.eval())
13
import tensorflow as tf
g = tf.Graph()
with g.as_default():
  x = tf.constant(100, name='x_const')
  y = tf.constant(566, name='y_const')
  sum = tf.add(x, y, name='x_y_sum')
  with tf.Session() as sess:
    print(sum.eval())
666
In TensorFlow, what is the difference between Session.run() and Tensor.eval()?
If you have a Tensor t, calling t.eval() is equivalent to calling tf.get_default_session().run(t).
You can make a session the default as follows:
import tensorflow as tf
t = tf.constant(42.0)
sess = tf.Session()
with sess.as_default():  # or `with sess:` to close on exit
  assert sess is tf.get_default_session()
  assert t.eval() == sess.run(t)
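Exercise: Introduce a Third Operand
Revise the code listing above to add three integers instead of two:
- Define a third scalar integer constant, z, and assign it a value of 4.
- Add z to sum to yield a new sum. Hint: See the API docs for tf.add() for more details on its function signature.
- Re-run the modified code block. Did the program generate the correct grand total?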
import tensorflow as tf
# Create a graph.
g = tf.Graph()
# Establish the graph as the "default" graph.
with g.as_default():
  # Assemble a graph that starts with the following three operations:
  # * Two tf.constant operations to create the operands.
  # * One tf.add operation to add the two operands.
  x = tf.constant(8, name="x_const")
  y = tf.constant(5, name="y_const")
  sum = tf.add(x, y, name="x_y_sum")
  # A third constant operand.
  z = tf.constant(4, name='z_const')
  # Now create a session.
  # The session will run the default graph.
  with tf.Session() as sess:
    # Add z to the earlier sum and evaluate the result.
    print(sess.run(tf.add(sum, z)))
17
Solution
Click below for the solution.
# Create a graph.
g = tf.Graph()
# Establish our graph as the "default" graph.
with g.as_default():
  # Assemble a graph consisting of three operations.
  # (Creating a tensor is an operation.)
  x = tf.constant(8, name="x_const")
  y = tf.constant(5, name="y_const")
  sum = tf.add(x, y, name="x_y_sum")
  # Task 1: Define a third scalar integer constant z.
  z = tf.constant(4, name="z_const")
  # Task 2: Add z to `sum` to yield a new sum.
  new_sum = tf.add(sum, z, name="x_y_z_sum")
  # Now create a session.
  # The session will run the default graph.
  with tf.Session() as sess:
    # Task 3: Ensure the program yields the correct grand total.
    print(new_sum.eval())
17
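Further Information
To explore basic TensorFlow graphs further, experiment with the following tutorial:
Creating and Manipulating Tensors
Learning objectives:
* Initialize and assign TensorFlow variables
* Create and manipulate tensors
* Refresh your memory about addition and multiplication in linear algebra (if these topics are new to you, consult an introduction to matrix addition and multiplication)
* Familiarize yourself with basic TensorFlow math and array operations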
import tensorflow as tf
Vector Addition
You can perform many typical mathematical operations on tensors (TF API). The following code creates and manipulates two vectors (1-D tensors), each with exactly six elements:
with tf.Graph().as_default():
  # Create a six-element vector (1-D tensor).
  primes = tf.constant([2, 3, 5, 7, 11, 13], dtype=tf.int32)
  # Create another six-element vector. Each element in the vector will be
  # initialized to 1. The first argument is the shape of the tensor (more
  # on shapes below).
  ones = tf.ones([6], dtype=tf.int32)
  # Add the two vectors. The resulting tensor is a six-element vector.
  just_beyond_primes = tf.add(primes, ones)
  # Create a session to run the default graph.
  with tf.Session() as sess:
    print(just_beyond_primes.eval())
[ 3 4 6 8 12 14]
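Tensor Shapes
Shapes characterize the size and number of dimensions of a tensor. The shape of a tensor is expressed as a list, with the i-th element representing the size along dimension i. The length of the list then indicates the rank of the tensor (that is, the number of dimensions).
For more information, see the TensorFlow documentation.
A few basic examples:
with tf.Graph().as_default():
  # A scalar (0-D tensor).
  scalar = tf.zeros([])
  # A vector with 3 elements.
  vector = tf.zeros([3])
  # A matrix with 2 rows and 3 columns.
  matrix = tf.zeros([2, 3])
  with tf.Session() as sess:
    print('scalar has shape %s and value:\n%s' % (scalar.get_shape(), scalar.eval()))
    print('vector has shape %s and value:\n%s' % (vector.get_shape(), vector.eval()))
    print('matrix has shape %s and value:\n%s' % (matrix.get_shape(), matrix.eval()))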
scalar has shape () and value:
0.0
vector has shape (3,) and value:
[0. 0. 0.]
matrix has shape (2, 3) and value:
[[0. 0. 0.]
[0. 0. 0.]]
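The relationship between a shape list and the rank can also be sketched in plain Python. The helper shape_of below is a hypothetical name introduced for this illustration, not a TensorFlow function, and it assumes the nested list is regular (every sublist at a given depth has the same length).

```python
def shape_of(value):
  """Returns the shape of a regularly nested list as a list of sizes.

  Hypothetical helper for illustration; assumes all sublists at the same
  depth have equal length.
  """
  shape = []
  while isinstance(value, list):
    shape.append(len(value))
    if not value:
      break
    value = value[0]
  return shape


scalar = 0.0
vector = [0.0, 0.0, 0.0]
matrix = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

for name, value in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
  shape = shape_of(value)
  # The length of the shape list is the rank (number of dimensions).
  print(name, "has shape", shape, "and rank", len(shape))
```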
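Broadcasting
In mathematics, you can only perform element-wise operations (for example, add and equals) on tensors of the same shape. In TensorFlow, however, you may perform operations on tensors that would traditionally have been incompatible. TensorFlow supports broadcasting (a concept borrowed from NumPy), in which the smaller array in an element-wise operation is enlarged to have the same shape as the larger array. For example, via broadcasting:
- If an operation requires a size [6] tensor, a size [1] or a size [] tensor can serve as an operand.
- If an operation requires a size [4, 6] tensor, a tensor of any of the following sizes can serve as an operand:
  - [1, 6]
  - [6]
  - []
- If an operation requires a size [3, 5, 6] tensor, a tensor of any of the following sizes can serve as an operand:
  - [1, 5, 6]
  - [3, 1, 6]
  - [3, 5, 1]
  - [1, 1, 1]
  - [5, 6]
  - [1, 6]
  - [6]
  - [1]
  - []
Note: When a tensor is broadcast, its entries are conceptually copied. (For performance reasons, they are not actually copied. Broadcasting was designed as a performance optimization.)
For the full set of broadcasting rules, see the easy-to-read NumPy broadcasting documentation.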
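The compatibility rule behind the lists above can be sketched in plain Python: align the two shapes from the right, treat missing leading dimensions as 1, and require each pair of sizes to be equal or to contain a 1. The function broadcast_shape is a hypothetical helper written for this illustration, not part of TensorFlow or NumPy.

```python
def broadcast_shape(shape_a, shape_b):
  """Returns the broadcast result shape, or raises ValueError.

  Hypothetical helper that mirrors the NumPy-style broadcasting rule.
  """
  rank = max(len(shape_a), len(shape_b))
  # Left-pad the shorter shape with 1s so both shapes have the same rank.
  a = [1] * (rank - len(shape_a)) + list(shape_a)
  b = [1] * (rank - len(shape_b)) + list(shape_b)
  result = []
  for dim_a, dim_b in zip(a, b):
    if dim_a == dim_b or dim_b == 1:
      result.append(dim_a)
    elif dim_a == 1:
      result.append(dim_b)
    else:
      raise ValueError("cannot broadcast %s with %s" % (shape_a, shape_b))
  return result


# Every operand shape listed above for a [3, 5, 6] tensor is compatible.
candidates = [[1, 5, 6], [3, 1, 6], [3, 5, 1], [1, 1, 1],
              [5, 6], [1, 6], [6], [1], []]
for shape in candidates:
  print(shape, "->", broadcast_shape([3, 5, 6], shape))
```

A shape such as [4] is rejected against [4, 6], because alignment starts from the rightmost dimension and 4 does not match 6.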
The following code performs the same tensor addition as before, but uses broadcasting:
with tf.Graph().as_default():
  # Create a six-element vector (1-D tensor).
  primes = tf.constant([2, 3, 5, 7, 11, 13], dtype=tf.int32)
  # Create a constant scalar with value 1.
  ones = tf.constant(1, dtype=tf.int32)
  # Add the two tensors. The resulting tensor is a six-element vector.
  just_beyond_primes = tf.add(primes, ones)
  with tf.Session() as sess:
    print(just_beyond_primes.eval())
[ 3 4 6 8 12 14]
Matrix Multiplication
In linear algebra, when multiplying two matrices, the number of columns in the first matrix must equal the number of rows in the second matrix.
- It is valid to multiply a 3x4 matrix by a 4x2 matrix. The result is a 3x2 matrix.
- It is invalid to multiply a 4x2 matrix by a 3x4 matrix.
with tf.Graph().as_default():
  # Create a matrix (2-d tensor) with 3 rows and 4 columns.
  x = tf.constant([[5, 2, 4, 3], [5, 1, 6, -2], [-1, 3, -1, -2]],
                  dtype=tf.int32)
  # Create a matrix with 4 rows and 2 columns.
  y = tf.constant([[2, 2], [3, 5], [4, 5], [1, 6]], dtype=tf.int32)
  # Multiply `x` by `y`.
  # The resulting matrix will have 3 rows and 2 columns.
  matrix_multiply_result = tf.matmul(x, y)
  with tf.Session() as sess:
    print(matrix_multiply_result.eval())
[[35 58]
[35 33]
[ 1 -4]]
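To see where the column/row requirement comes from, here is a minimal plain-Python sketch of matrix multiplication on nested lists. The function matmul below is a hypothetical helper for illustration only; it is not tf.matmul and makes no attempt at efficiency.

```python
def matmul(a, b):
  """Multiplies two matrices given as non-empty nested lists.

  Hypothetical helper; raises ValueError when the number of columns in `a`
  differs from the number of rows in `b`.
  """
  rows_a, cols_a = len(a), len(a[0])
  rows_b, cols_b = len(b), len(b[0])
  if cols_a != rows_b:
    raise ValueError("cannot multiply %dx%d by %dx%d"
                     % (rows_a, cols_a, rows_b, cols_b))
  # Entry (i, j) pairs row i of `a` with column j of `b`, which is why the
  # two lengths must match.
  return [[sum(a[i][k] * b[k][j] for k in range(cols_a))
           for j in range(cols_b)]
          for i in range(rows_a)]


x = [[5, 2, 4, 3], [5, 1, 6, -2], [-1, 3, -1, -2]]  # 3x4
y = [[2, 2], [3, 5], [4, 5], [1, 6]]                # 4x2

product = matmul(x, y)  # 3x2, same operands as the TensorFlow example above.
for row in product:
  print(row)

try:
  matmul(y, x)  # 4x2 by 3x4 is invalid.
except ValueError as e:
  print("Caught expected error:", e)
```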
Tensor Reshaping
Because tensor addition and matrix multiplication each impose constraints on their operands, TensorFlow programmers must frequently reshape tensors.
You can use the tf.reshape method to reshape a tensor.
For example, you can reshape an 8x2 tensor into a 2x8 tensor or a 4x4 tensor:
with tf.Graph().as_default():
  # Create an 8x2 matrix (2-D tensor).
  matrix = tf.constant([[1,2], [3,4], [5,6], [7,8],
                        [9,10], [11,12], [13, 14], [15,16]], dtype=tf.int32)
  # Reshape the 8x2 matrix into a 2x8 matrix.
  reshaped_2x8_matrix = tf.reshape(matrix, [2,8])
  # Reshape the 8x2 matrix into a 4x4 matrix
  reshaped_4x4_matrix = tf.reshape(matrix, [4,4])
  with tf.Session() as sess:
    print("Original matrix (8x2):")
    print(matrix.eval())
    print("Reshaped matrix (2x8):")
    print(reshaped_2x8_matrix.eval())
    print("Reshaped matrix (4x4):")
    print(reshaped_4x4_matrix.eval())
Original matrix (8x2):
[[ 1 2]
[ 3 4]
[ 5 6]
[ 7 8]
[ 9 10]
[11 12]
[13 14]
[15 16]]
Reshaped matrix (2x8):
[[ 1 2 3 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16]]
Reshaped matrix (4x4):
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]
[13 14 15 16]]
You can also use tf.reshape to change the number of dimensions (the "rank") of a tensor.
For example, you can reshape that 8x2 tensor into a 3-D 2x2x4 tensor or a 1-D 16-element tensor.
with tf.Graph().as_default():
  # Create an 8x2 matrix (2-D tensor).
  matrix = tf.constant([[1,2], [3,4], [5,6], [7,8],
                        [9,10], [11,12], [13, 14], [15,16]], dtype=tf.int32)
  # Reshape the 8x2 matrix into a 3-D 2x2x4 tensor.
  reshaped_2x2x4_tensor = tf.reshape(matrix, [2,2,4])
  # Reshape the 8x2 matrix into a 1-D 16-element tensor.
  one_dimensional_vector = tf.reshape(matrix, [16])
  with tf.Session() as sess:
    print("Original matrix (8x2):")
    print(matrix.eval())
    print("Reshaped 3-D tensor (2x2x4):")
    print(reshaped_2x2x4_tensor.eval())
    print("1-D vector:")
    print(one_dimensional_vector.eval())
Original matrix (8x2):
[[ 1 2]
[ 3 4]
[ 5 6]
[ 7 8]
[ 9 10]
[11 12]
[13 14]
[15 16]]
Reshaped 3-D tensor (2x2x4):
[[[ 1 2 3 4]
[ 5 6 7 8]]
[[ 9 10 11 12]
[13 14 15 16]]]
1-D vector:
[ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16]
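The outputs above suggest how reshaping behaves: the elements keep their row-major order, and only the grouping changes, so the new shape must account for exactly the same number of elements. The following plain-Python sketch illustrates that idea. The helpers flatten and reshape are hypothetical names for this illustration, not tf.reshape, and the sketch assumes every dimension is a positive integer.

```python
def flatten(value):
  """Returns the elements of a nested list in row-major order."""
  if not isinstance(value, list):
    return [value]
  flat = []
  for item in value:
    flat.extend(flatten(item))
  return flat


def _build(flat, shape):
  # Regroups a flat list according to `shape`, outermost dimension first.
  if len(shape) == 1:
    return list(flat)
  step = len(flat) // shape[0]
  return [_build(flat[i * step:(i + 1) * step], shape[1:])
          for i in range(shape[0])]


def reshape(value, shape):
  """Regroups `value` into `shape` (hypothetical helper, positive dims only)."""
  flat = flatten(value)
  size = 1
  for dim in shape:
    size *= dim
  if size != len(flat):
    raise ValueError("cannot reshape %d elements into shape %s"
                     % (len(flat), shape))
  return _build(flat, list(shape))


matrix = [[1, 2], [3, 4], [5, 6], [7, 8],
          [9, 10], [11, 12], [13, 14], [15, 16]]  # 8x2, 16 elements

print(reshape(matrix, [2, 8]))
print(reshape(matrix, [2, 2, 4]))
print(reshape(matrix, [16]))
```

Requesting a shape such as [3, 5] raises ValueError in this sketch, because 15 slots cannot hold 16 elements.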
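Exercise 1: Reshape two tensors in order to multiply them.
The following two vectors are incompatible for matrix multiplication:
a = tf.constant([5, 3, 2, 7, 1, 4])
b = tf.constant([4, 6, 3])
Reshape these vectors into compatible operands for matrix multiplication.
Then, invoke a matrix multiplication operation on the reshaped tensors.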
# Write your code for Task 1 here.
import tensorflow as tf
a = tf.constant([5, 3, 2, 7, 1, 4])
b = tf.constant([4, 6, 3])
# First approach: [2,3] x [3,1] -> [2,1]
a_reshape1 = tf.reshape(a, [2,3])
b_reshape1 = tf.reshape(b, [3,1])
# Second approach: [6,1] x [1,3] -> [6,3]
a_reshape2 = tf.reshape(a, [6,1])
b_reshape2 = tf.reshape(b, [1,3])
c = tf.matmul(a_reshape1, b_reshape1)
d = tf.matmul(a_reshape2, b_reshape2)
with tf.Session() as sess:
  print('c.eval():\n%s' % (c.eval(),))
  print('d.eval():\n%s' % (d.eval(),))
c.eval():
[[44]
[46]]
d.eval():
[[20 30 15]
[12 18 9]
[ 8 12 6]
[28 42 21]
[ 4 6 3]
[16 24 12]]
Solution
Click below for the solution.
with tf.Graph().as_default(), tf.Session() as sess:
  # Task: Reshape two tensors in order to multiply them
  # Here are the original operands, which are incompatible
  # for matrix multiplication:
  a = tf.constant([5, 3, 2, 7, 1, 4])
  b = tf.constant([4, 6, 3])
  # We need to reshape at least one of these operands so that
  # the number of columns in the first operand equals the number
  # of rows in the second operand.
  # Reshape vector "a" into a 2-D 2x3 matrix:
  reshaped_a = tf.reshape(a, [2,3])
  # Reshape vector "b" into a 2-D 3x1 matrix:
  reshaped_b = tf.reshape(b, [3,1])
  # The number of columns in the first matrix now equals
  # the number of rows in the second matrix. Therefore, you
  # can matrix multiply the two operands.
  c = tf.matmul(reshaped_a, reshaped_b)
  print(c.eval())
  # An alternate approach: [6,1] x [1, 3] -> [6,3]
[[44]
[46]]
Variables, Initialization and Assignment
So far, all the operations we have performed were on static values (tf.constant); calling eval() always returned the same result. TensorFlow also lets you define Variable objects, whose values can be changed.
When creating a variable, you can set an initial value explicitly, or you can use an initializer (such as a distribution):
g = tf.Graph()
with g.as_default():
  # Create a variable with the initial value 3.
  v = tf.Variable([3])
  # Create a variable of shape [1], with a random initial value,
  # sampled from a normal distribution with mean 1 and standard deviation 0.35.
  w = tf.Variable(tf.random_normal([1], mean=1.0, stddev=0.35))
One peculiarity of TensorFlow is that variable initialization is not automatic. For example, the following block causes an error:
with g.as_default():
  with tf.Session() as sess:
    try:
      v.eval()
    except tf.errors.FailedPreconditionError as e:
      print("Caught expected error: ", e)
Caught expected error: Attempting to use uninitialized value Variable
[[Node: _retval_Variable_0_0 = _Retval[T=DT_INT32, index=0, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Variable)]]
The easiest way to initialize variables is to call global_variables_initializer. Note the use of Session.run(), which is roughly equivalent to eval().
with g.as_default():
  with tf.Session() as sess:
    initialization = tf.global_variables_initializer()
    sess.run(initialization)
    # Now, variables can be accessed normally, and have values assigned to them.
    print(v.eval())
    print(w.eval())
[3]
[0.43437064]
with g.as_default():
  with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)
    print(v.eval())
    print(w.eval())
[3]
[1.1014321]
Once initialized, variables keep their values within the same session (however, when you start a new session, you need to initialize them again):
with g.as_default():
  with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # These three prints will print the same value.
    print(w.eval())
    print(w.eval())
    print(w.eval())
[0.78335255]
[0.78335255]
[0.78335255]
To change the value of a variable, use the assign op. Note that simply creating the assign op has no effect. As with initialization, you have to run the assignment op to update the variable's value:
with g.as_default():
  with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # This should print the variable's initial value.
    print(v.eval())
    assignment = tf.assign(v, [7])
    # The variable has not been changed yet!
    print(v.eval())
    # Execute the assignment op.
    sess.run(assignment)
    # Now the variable is updated.
    print(v.eval())
[3]
[3]
[7]
with g.as_default():
  with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(v.eval())
    assign = tf.assign(v, [888])
    sess.run(assign)
    print(v.eval())
[3]
[888]
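The distinction between creating an assignment and running it can be mimicked in plain Python with a closure. This is only an analogy, assuming a hypothetical ToyVariable class written for this illustration; it is not how tf.Variable or tf.assign is implemented.

```python
class ToyVariable:
  """A hypothetical stand-in for a variable; not the TensorFlow class."""

  def __init__(self, initial_value):
    self.value = initial_value

  def assign(self, new_value):
    # Returns a callable "op". The variable stays untouched until it is run.
    def assign_op():
      self.value = new_value
      return self.value
    return assign_op


v = ToyVariable([3])
assignment = v.assign([7])  # Creating the op changes nothing.
before = v.value
assignment()                # Running the op performs the update.
after = v.value
print(before, after)
```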
There are many more topics about variables that we did not cover here, such as loading and saving. To learn more, see the TensorFlow documentation.
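Exercise 2: Simulate 10 rolls of two dice.
Create a dice simulation that generates a 10x3 2-D tensor in which:
- Columns 1 and 2 each hold one throw of one die.
- Column 3 holds the sum of Columns 1 and 2 on the same row.
For example, the first row might have the following values:
- Column 1 holds 4
- Column 2 holds 3
- Column 3 holds 7
You will need to explore the TensorFlow documentation to solve this task.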
# Write your code for Task 2 here.
import tensorflow as tf
with tf.Graph().as_default():
  with tf.Session() as sess:
    # Two 10x1 columns of die throws (maxval is exclusive, so values are 1-6).
    d1 = tf.Variable(tf.random_uniform([10, 1], minval=1, maxval=7, dtype=tf.int32))
    d2 = tf.Variable(tf.random_uniform([10, 1], minval=1, maxval=7, dtype=tf.int32))
    # Element-wise sum of the two columns.
    d3 = tf.add(d1, d2)
    # Concatenate the three 10x1 columns into one 10x3 matrix.
    result = tf.concat(values=[d1, d2, d3], axis=1)
    sess.run(tf.global_variables_initializer())
    print(result.eval())
[[ 3 5 8]
[ 3 6 9]
[ 6 5 11]
[ 4 2 6]
[ 3 5 8]
[ 4 2 6]
[ 4 6 10]
[ 1 6 7]
[ 2 3 5]
[ 2 4 6]]
Solution
Click below for the solution.
with tf.Graph().as_default(), tf.Session() as sess:
  # Task 2: Simulate 10 throws of two dice. Store the results
  # in a 10x3 matrix.
  # We're going to place dice throws inside two separate
  # 10x1 matrices. We could have placed dice throws inside
  # a single 10x2 matrix, but adding different columns of
  # the same matrix is tricky. We also could have placed
  # dice throws inside two 1-D tensors (vectors); doing so
  # would require transposing the result.
  dice1 = tf.Variable(tf.random_uniform([10, 1],
                                        minval=1, maxval=7,
                                        dtype=tf.int32))
  dice2 = tf.Variable(tf.random_uniform([10, 1],
                                        minval=1, maxval=7,
                                        dtype=tf.int32))
  # We may add dice1 and dice2 since they share the same shape
  # and size.
  dice_sum = tf.add(dice1, dice2)
  # We've got three separate 10x1 matrices. To produce a single
  # 10x3 matrix, we'll concatenate them along dimension 1.
  resulting_matrix = tf.concat(
      values=[dice1, dice2, dice_sum], axis=1)
  # The variables haven't been initialized within the graph yet,
  # so let's remedy that.
  sess.run(tf.global_variables_initializer())
  print(resulting_matrix.eval())
[[ 6 5 11]
[ 6 2 8]
[ 5 6 11]
[ 2 3 5]
[ 4 3 7]
[ 3 4 7]
[ 1 1 2]
[ 5 4 9]
[ 5 4 9]
[ 5 5 10]]
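For comparison, the same 10x3 table can be sketched with only the Python standard library. This is an illustration rather than a TensorFlow solution: roll_dice_table is a hypothetical helper, and the seed value is an arbitrary choice made so that repeated runs give the same rolls. Note one difference in conventions: with an integer dtype, tf.random_uniform excludes maxval (which is why the solution passes maxval=7), whereas random.randint includes both endpoints.

```python
import random


def roll_dice_table(num_rolls, rng):
  """Returns `num_rolls` rows of [die1, die2, die1 + die2].

  Hypothetical helper for illustration; `rng` is a random.Random instance.
  """
  table = []
  for _ in range(num_rolls):
    die1 = rng.randint(1, 6)  # Both endpoints are inclusive here.
    die2 = rng.randint(1, 6)
    table.append([die1, die2, die1 + die2])
  return table


rng = random.Random(0)  # Arbitrary seed, chosen only for repeatability.
table = roll_dice_table(10, rng)
for row in table:
  print(row)
```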