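TensorFlow Tensor Types

Constant Types

In TensorFlow, data must be converted to a type that TensorFlow recognizes before it can be used. A constant, which is a special kind of variable whose value cannot change, is created with tf.constant, which stores or converts a value as a tensor of type tf.Tensor. Because the name "constant" does not always describe this conversion well, TensorFlow also provides tf.convert_to_tensor, which does the same job. Some common cases: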

tf.constant(1) # scalar, shape []
tf.constant(1.2) # scalar, shape []
tf.constant(True) # scalar, shape []
tf.constant("Hello World !") # scalar, shape []
tf.constant([1,2,3,1,2,3]) # vector, shape [n] (here [6])
tf.constant([[1,2,3],[1,2,3]]) # matrix, shape [n,m] (here [2,3])
tf.constant([[[1,2,3],[1,2,3]],[[1,2,3],[1,2,3]]]) # higher-rank tensor, shape [n,m,k,...] (here [2,2,3])
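As a quick check of the shapes listed above, the following sketch prints the shape and dtype of each constant and shows tf.convert_to_tensor accepting both a Python list and a NumPy array. The variable names are illustrative only.

```python
import numpy as np
import tensorflow as tf

scalar = tf.constant(1.2)
vector = tf.constant([1, 2, 3, 1, 2, 3])
matrix = tf.constant([[1, 2, 3], [1, 2, 3]])
cube = tf.constant([[[1, 2, 3], [1, 2, 3]], [[1, 2, 3], [1, 2, 3]]])

# Print the shape and dtype of each tensor.
for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, t.shape, t.dtype.name)

# tf.convert_to_tensor accepts Python values and NumPy arrays alike.
from_list = tf.convert_to_tensor([1, 2, 3])
from_numpy = tf.convert_to_tensor(np.array([[1.0, 2.0], [3.0, 4.0]]))
print(from_list.shape, from_numpy.shape, from_numpy.dtype.name)
```

Note that a NumPy float array keeps its float64 dtype after conversion, while a Python float passed to tf.constant becomes float32.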

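Variable Types

Compared with a constant, the variable type tf.Variable supports recording gradient information. It is a tensor to be optimized: on top of the ordinary tensor type it adds attributes such as name and trainable to support building the computation graph. Gradient computation consumes a lot of computing resources and automatically updates the associated parameters, so tensors that need gradients and optimization, such as the W and 𝒃 of a neural network layer, should be wrapped in tf.Variable so that TensorFlow can track their gradient information.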

var = tf.Variable(tf.constant([-1, 0, 1, 2]))
var.name,var.trainable

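Here trainable indicates whether the tensor should be optimized. The flag is enabled by default when a Variable object is created; pass trainable=False to mark a tensor as not needing optimization.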
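The effect of the trainable flag can be seen with tf.GradientTape, which by default only watches trainable variables. This is a minimal sketch; the names w and frozen and the toy loss are assumptions made for illustration.

```python
import tensorflow as tf

w = tf.Variable([1.0, 2.0], name="w")                             # trainable by default
frozen = tf.Variable([3.0, 4.0], trainable=False, name="frozen")  # excluded from optimization

with tf.GradientTape() as tape:
    # Toy loss that depends on both variables.
    loss = tf.reduce_sum(w * w) + tf.reduce_sum(frozen * frozen)

grad_w, grad_frozen = tape.gradient(loss, [w, frozen])
print(w.trainable, frozen.trainable)  # True False
print(grad_w)                         # d(loss)/dw = 2 * w
print(grad_frozen)                    # None: the tape did not watch this variable
```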

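Type Conversion

Type conversion here means changing a tensor's dtype with tf.cast, whether that is a change of numeric precision or a change between bool, integer, and floating-point types. It does not apply to converting numeric types to strings. For example, converting bool to an integer type: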

var = tf.constant([True, False])
tf.cast(var , tf.int32)

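A tensor's data type can be inspected through the .dtype attribute. TensorFlow also provides a .numpy() method on both ordinary tensors and variables (tensors to be optimized) for converting them to NumPy arrays.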
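The sketch below puts tf.cast, .dtype, and .numpy() together. The values are arbitrary examples; note that casting a float to an integer type truncates toward zero.

```python
import tensorflow as tf

flags = tf.constant([True, False])
as_int = tf.cast(flags, tf.int32)        # bool -> int32: [1, 0]

high = tf.constant(3.7, dtype=tf.float64)
low = tf.cast(high, tf.float32)          # lower precision, still a float
truncated = tf.cast(high, tf.int32)      # float -> int drops the fraction: 3

var = tf.Variable([1, 2, 3])
arr = var.numpy()                        # NumPy array copied out of the variable

print(as_int.dtype.name, low.dtype.name, truncated.numpy())
print(type(arr).__name__, arr.dtype)
```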

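Common Tensor Creation Functions

Creating uniformly filled tensors

tf.zeros(shape) # tensor of shape `shape`, filled with 0
tf.ones(shape) # tensor of shape `shape`, filled with 1
tf.fill(shape, value = -1) # tensor of shape `shape`, filled with `value`

Create a tensor of shape shape whose values follow the normal distribution $\mathcal{N}(\text{mean}, \text{stddev}^{2})$, where mean is the mean and stddev is the standard deviation: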

tf.random.normal(shape, mean=0.0, stddev=1.0)

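Create a tensor of shape shape whose values are uniformly distributed over the half-open interval [minval, maxval), so minval is included and maxval is excluded: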

tf.random.uniform(shape, minval=0, maxval=None, dtype=tf.float32)

Create a tensor holding an increasing integer sequence over [start, limit) with step delta:

tf.range(start, limit, delta=1)
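The creation functions above take a concrete shape in practice. The following sketch uses an arbitrary shape of [2, 3] and arbitrary bounds to show what each call returns.

```python
import tensorflow as tf

shape = [2, 3]                      # arbitrary example shape

zeros = tf.zeros(shape)             # float32 zeros
ones = tf.ones(shape)               # float32 ones
filled = tf.fill(shape, -1)         # every element is -1 (int32)

normal = tf.random.normal(shape, mean=0.0, stddev=1.0)
# Integer sampling needs an explicit maxval; 10 itself is never drawn.
uniform = tf.random.uniform(shape, minval=0, maxval=10, dtype=tf.int32)

seq = tf.range(0, 10, delta=2)      # [0 2 4 6 8]; the limit 10 is excluded

print(zeros.shape, filled.numpy().tolist())
print(uniform.numpy().min() >= 0, uniform.numpy().max() < 10)
print(seq.numpy())
```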

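Indexing and Slicing

TensorFlow supports extracting part of a tensor with indexing and slicing. It supports the basic chained form [𝑗][𝑘]… as well as comma-separated indices [𝑗, 𝑘, …]. Slicing with [start : end : step] extracts a range conveniently, where start is the index at which reading begins, end is the index at which it stops (the element at end is excluded), and step is the stride. To avoid writing something like [:, :, :, n], use ... to select all data along several dimensions, that is, along every dimension for which no index is given. So [:, :, :, n] can be written as [..., n].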
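The indexing forms described above can be compared side by side. This sketch assumes a small tensor of shape [2, 4, 4, 3] built from tf.range; the variable names are illustrative.

```python
import tensorflow as tf

x = tf.reshape(tf.range(96), [2, 4, 4, 3])

chained = x[0][1]          # chained indexing, shape [4, 3]
comma = x[0, 1]            # comma-separated indexing, same result

every_other = x[:, 0:4:2]  # rows 0 and 2 of axis 1, shape [2, 2, 4, 3]

explicit = x[:, :, :, 0]   # all leading axes, index 0 on the last axis
ellipsis = x[..., 0]       # same selection written with ...

print(chained.shape, every_other.shape, ellipsis.shape)
print(bool(tf.reduce_all(chained == comma)))
print(bool(tf.reduce_all(explicit == ellipsis)))
```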

Reshaping

tf.reshape(x, new_shape) changes a tensor's view to any valid shape, that is, any shape with the same total number of elements:

x=tf.range(96)
x=tf.reshape(x,[2,4,4,3])
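A short sketch of two properties that are easy to verify: a single -1 in the new shape lets TensorFlow infer that dimension, and reshaping never reorders the elements.

```python
import tensorflow as tf

x = tf.range(96)
y = tf.reshape(x, [2, 4, 4, 3])

# One dimension may be given as -1 and is inferred from the element count.
z = tf.reshape(y, [2, -1])      # 96 / 2 = 48, so shape [2, 48]

# Flattening recovers the original order: reshape only changes the view.
back = tf.reshape(z, [-1])

print(y.shape, z.shape, back.shape)
print(bool(tf.reduce_all(back == x)))
```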

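Adding and Removing Dimensions

Adding a dimension: inserting a dimension of length 1 gives the existing data a new dimension as a concept. Because the new dimension has length 1, the data itself does not change, only the way it is interpreted, so this can be seen as a special way of changing the view. tf.expand_dims(x, axis) inserts a new dimension before the specified axis: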

x = tf.random.uniform([28,28],maxval=10,dtype=tf.int32)
x = tf.expand_dims(x,axis=2)

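Removing a dimension: this is the inverse of adding one. Only dimensions of length 1 can be removed, and, as with adding a dimension, the tensor's storage does not change. For example, if x has shape [1, 28, 28, 1] and the leading dimension is the number of images, that dimension can be removed with tf.squeeze(x, axis), where axis is the index of the dimension to remove.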

x = tf.random.uniform([1,28,28,1],maxval=10,dtype=tf.int32)
tf.squeeze(x,axis=0)

When axis = None, all dimensions of length 1 are removed.
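The difference between squeezing one axis and squeezing with the default axis=None can be checked on the shapes alone. This sketch uses a zero tensor as a stand-in for image data.

```python
import tensorflow as tf

x = tf.zeros([1, 28, 28, 1])

only_first = tf.squeeze(x, axis=0)              # removes axis 0 only: [28, 28, 1]
all_unit = tf.squeeze(x)                        # axis=None removes every length-1 axis: [28, 28]
restored = tf.expand_dims(all_unit, axis=0)     # puts a length-1 axis back in front: [1, 28, 28]

print(only_first.shape, all_unit.shape, restored.shape)
```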

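Swapping Dimensions

Changing the view and adding or removing dimensions do not affect how a tensor is stored. When implementing algorithm logic, changing only the interpretation of a tensor while keeping the dimension order fixed is sometimes not enough, and the storage order itself has to be adjusted, that is, the dimensions have to be swapped (transpose). Swapping dimensions changes both the storage order and the view of the tensor. tf.transpose(x, perm) performs the swap, where perm is a list giving the new order of the dimensions.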

x = tf.random.normal([2,32,32,3])
tf.transpose(x,perm=[0,2,1,3])
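To make the difference between transposing and reshaping concrete, the sketch below applies both to the same small matrix. Both results have shape [3, 2], but only tf.transpose moves the elements.

```python
import tensorflow as tf

x = tf.reshape(tf.range(6), [2, 3])    # [[0 1 2], [3 4 5]]

transposed = tf.transpose(x, perm=[1, 0])
reshaped = tf.reshape(x, [3, 2])

print(transposed.numpy().tolist())     # [[0, 3], [1, 4], [2, 5]]
print(reshaped.numpy().tolist())       # [[0, 1], [2, 3], [4, 5]]

# The 4-D case from the text: swap the two middle axes.
images = tf.random.normal([2, 32, 16, 3])
print(tf.transpose(images, perm=[0, 2, 1, 3]).shape)   # (2, 16, 32, 3)
```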

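Replicating Data

tf.tile(x, multiples) replicates data along the given dimensions. multiples is a list of replication factors, one per dimension: 1 means no replication, and n means the length of that dimension becomes n times the original.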

x = tf.range(4)
x = tf.reshape(x,[2,2])
x = tf.tile(x,multiples=[3,2])
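Printing the result of the example above shows how the replication works: the 2 × 2 block is repeated 3 times along axis 0 and 2 times along axis 1, giving shape [6, 4]. The name tiled is illustrative.

```python
import tensorflow as tf

x = tf.reshape(tf.range(4), [2, 2])     # [[0 1], [2 3]]
tiled = tf.tile(x, multiples=[3, 2])

print(tiled.shape)                      # (6, 4)
print(tiled.numpy())
# Rows alternate between [0 1 0 1] and [2 3 2 3], three times over.
```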

Ragged Tensors

A tensor whose number of elements varies along some axis (dimension) is called ragged. TensorFlow provides the tensor type tf.RaggedTensor for storing ragged data.

ragged_list = [
    [[0, 1, 2, 3]],
    [[4], [5]],
    [[6, 7], [8]],
    [[9, 10, 11]]]
ragged_tensor = tf.ragged.constant(ragged_list)
print(ragged_tensor)
print(ragged_tensor.shape)

The result is shown below. The None entries in the shape mark dimensions whose length is not uniform:

<tf.RaggedTensor [[[0, 1, 2, 3]], [[4], [5]], [[6, 7], [8]], [[9, 10, 11]]]>
(4, None, None)

Note that tf.constant cannot store this kind of tensor, as the following snippet shows:

try:
  tensor = tf.constant(ragged_list)
except Exception as e:
  print(f"{type(e).__name__}: {e}")

It raises the following exception:

ValueError: Can't convert non-rectangular Python sequence to Tensor.
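When a regular tensor is needed after all, a ragged tensor can be padded into one. The sketch below uses a smaller two-dimensional example (an assumption made for readability, not the list from above) to show row_lengths and to_tensor.

```python
import tensorflow as tf

ragged = tf.ragged.constant([[1, 2, 3], [4], [5, 6]])

print(ragged.shape)                       # (3, None)
print(ragged.row_lengths().numpy())       # [3 1 2]

# Pad every row to the longest row so the result is rectangular.
padded = ragged.to_tensor(default_value=0)
print(padded.numpy().tolist())            # [[1, 2, 3], [4, 0, 0], [5, 6, 0]]
```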

String Tensors

This section looks at some properties of string tensors in more detail.

Declaring string tensors:

# Tensors can be strings, too; here is a scalar string.
scalar_string_tensor = tf.constant("Gray wolf")
print(scalar_string_tensor)
# If we have several string tensors of different lengths, this is OK.
tensor_of_strings = tf.constant(["Gray wolf",
                                 "Quick brown fox",
                                 "Lazy dog"])
# Note that the shape is (3,); the string lengths are not part of the shape.
print(tensor_of_strings)

Output:

tf.Tensor(b'Gray wolf', shape=(), dtype=string)
tf.Tensor([b'Gray wolf' b'Quick brown fox' b'Lazy dog'], shape=(3,), dtype=string)

Unicode characters in the input are stored using UTF-8 encoding:

tf.constant("🥳👍")

Output:

<tf.Tensor: shape=(), dtype=string, numpy=b'\xf0\x9f\xa5\xb3\xf0\x9f\x91\x8d'>
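Because the string is stored as UTF-8 bytes, its length depends on the unit being counted. As a small sketch, tf.strings.length reports 8 bytes for the two emoji above but 2 characters when asked to count UTF-8 characters.

```python
import tensorflow as tf

emoji = tf.constant("🥳👍")

n_bytes = tf.strings.length(emoji)                     # counts bytes (the default unit)
n_chars = tf.strings.length(emoji, unit="UTF8_CHAR")   # counts Unicode characters

print(n_bytes.numpy(), n_chars.numpy())                # 8 2
```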

String tensors can be processed with the string operations module tf.strings, which includes tf.strings.split.

# We can use split to split a string into a set of tensors
# ...but it turns into a `RaggedTensor` if we split up a tensor of strings,
# as each string might be split into a different number of parts.
print(tf.strings.split(scalar_string_tensor, sep=" "))
print(tf.strings.split(tensor_of_strings, sep=" "))

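Output:

tf.Tensor([b'Gray' b'wolf'], shape=(2,), dtype=string)
<tf.RaggedTensor [[b'Gray', b'wolf'], [b'Quick', b'brown', b'fox'], [b'Lazy', b'dog']]>

When the strings split into different numbers of parts, the result is stored as a ragged tensor.

Converting strings to numbers with tf.strings.to_number is also common: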

text = tf.constant("1 10 100")
print(tf.strings.to_number(tf.strings.split(text, " ")))

Output:

tf.Tensor([  1.  10. 100.], shape=(3,), dtype=float32)

Although tf.cast cannot convert a string tensor to numbers directly, the string can be converted to bytes and then decoded into a numeric tensor:

byte_strings = tf.strings.bytes_split(tf.constant("Duck"))
byte_ints = tf.io.decode_raw(tf.constant("Duck"), tf.uint8)
print("Byte strings:", byte_strings)
print("Bytes:", byte_ints)

Output:

Byte strings: tf.Tensor([b'D' b'u' b'c' b'k'], shape=(4,), dtype=string)
Bytes: tf.Tensor([ 68 117  99 107], shape=(4,), dtype=uint8)

It can also be decoded as Unicode:

# Or split it up as unicode and then decode it
unicode_bytes = tf.constant("アヒル 🦆")
unicode_char_bytes = tf.strings.unicode_split(unicode_bytes, "UTF-8")
unicode_values = tf.strings.unicode_decode(unicode_bytes, "UTF-8")

print("Unicode bytes:", unicode_bytes)
print("\nUnicode chars:", unicode_char_bytes)
print("\nUnicode values:", unicode_values)

Output:

Unicode bytes: tf.Tensor(b'\xe3\x82\xa2\xe3\x83\x92\xe3\x83\xab \xf0\x9f\xa6\x86', shape=(), dtype=string)

Unicode chars: tf.Tensor([b'\xe3\x82\xa2' b'\xe3\x83\x92' b'\xe3\x83\xab' b' ' b'\xf0\x9f\xa6\x86'], shape=(5,), dtype=string)

Unicode values: tf.Tensor([ 12450  12498  12523     32 129414], shape=(5,), dtype=int32)

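Sparse Tensors

Some two-dimensional tensors, such as the 3 × 6 matrix built below, have only a few nonzero elements. Such a tensor can be stored and used as a sparse tensor with the following statements: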

# Sparse tensors store values by index in a memory-efficient manner
sparse_tensor = tf.sparse.SparseTensor(indices=[[0, 0],[0, 5], [1, 2]],
                                       values=[1, 2, 3],
                                       dense_shape=[3, 6])
print(sparse_tensor, "\n")

# We can convert sparse tensors to dense
print(tf.sparse.to_dense(sparse_tensor))

Output:

SparseTensor(indices=tf.Tensor(
[[0 0]
 [0 5]
 [1 2]], shape=(3, 2), dtype=int64), values=tf.Tensor([1 2 3], shape=(3,), dtype=int32), dense_shape=tf.Tensor([3 6], shape=(2,), dtype=int64))

tf.Tensor(
[[1 0 0 0 0 2]
 [0 0 3 0 0 0]
 [0 0 0 0 0 0]], shape=(3, 6), dtype=int32)
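The conversion also works in the other direction. As a sketch, tf.sparse.from_dense recovers the same indices and values from the dense matrix shown above; the variable names are illustrative.

```python
import tensorflow as tf

dense = tf.constant([[1, 0, 0, 0, 0, 2],
                     [0, 0, 3, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0]])

# Keep only the nonzero entries, in row-major order.
sparse = tf.sparse.from_dense(dense)

print(sparse.indices.numpy().tolist())      # [[0, 0], [0, 5], [1, 2]]
print(sparse.values.numpy().tolist())       # [1, 2, 3]
print(sparse.dense_shape.numpy().tolist())  # [3, 6]

# Converting back gives the original matrix.
round_trip = tf.sparse.to_dense(sparse)
print(bool(tf.reduce_all(round_trip == dense)))
```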
