Let's start with a simple example to see the difference between PyTorch and TensorFlow.

```python
import torch
import tensorflow as tf

if __name__ == "__main__":
    A = torch.Tensor([0])
    B = torch.Tensor([10])
    while (A < B)[0]:
        A += 2
        B += 1
    print(A, B)
    C = tf.constant([0])
    D = tf.constant([10])
    while (C < D)[0]:
        C = tf.add(C, 2)
        D = tf.add(D, 1)
    print(C, D)
```
Output:
tensor([20.]) tensor([20.])
tf.Tensor([20], shape=(1,), dtype=int32) tf.Tensor([20], shape=(1,), dtype=int32)
Here we can see that PyTorch is more concise: it needs fewer API calls and reads much closer to ordinary Python.

Tensors

```python
import torch

if __name__ == "__main__":
    a = torch.Tensor([[1, 2], [3, 4]])
    print(a)
    print(a.shape)
    print(a.type())
    b = torch.Tensor(2, 3)
    print(b)
    print(b.type())
    c = torch.ones(2, 2)
    print(c)
    print(c.type())
    d = torch.zeros(2, 2)
    print(d)
    print(d.type())
    # Define an identity (diagonal) matrix
    e = torch.eye(2, 2)
    print(e)
    print(e.type())
    # Define an all-zeros matrix with the same shape as b
    f = torch.zeros_like(b)
    print(f)
    # Define an all-ones matrix with the same shape as b
    g = torch.ones_like(b)
    print(g)
    # Define a sequence
    h = torch.arange(0, 11, 1)
    print(h)
    print(h.type())
    # Get 4 evenly spaced values between 2 and 10
    i = torch.linspace(2, 10, 4)
    print(i)
```
Output:
tensor([[1., 2.],
[3., 4.]])
torch.Size([2, 2])
torch.FloatTensor
tensor([[7.0976e+22, 4.1828e+09, 4.2320e+21],
[1.1818e+22, 7.0976e+22, 1.8515e+28]])
torch.FloatTensor
tensor([[1., 1.],
[1., 1.]])
torch.FloatTensor
tensor([[0., 0.],
[0., 0.]])
torch.FloatTensor
tensor([[1., 0.],
[0., 1.]])
torch.FloatTensor
tensor([[0., 0., 0.],
[0., 0., 0.]])
tensor([[1., 1., 1.],
[1., 1., 1.]])
tensor([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
torch.LongTensor
tensor([ 2.0000, 4.6667, 7.3333, 10.0000])
From these results we can see that PyTorch lets us define a tensor either from concrete values or directly from a shape; defining by shape yields uninitialized memory, so the values are arbitrary. Then come the all-zeros, all-ones and identity matrices, the zeros_like / ones_like constructors, and integer ranges — TensorFlow has all of these as well. But there are also parts that differ from TensorFlow:

```python
# Generate a matrix of random values in [0, 1)
j = torch.rand(2, 2)
print(j)
# Sample from a normal distribution with the given means and standard deviations
k = torch.normal(mean=0.0, std=torch.rand(5))
print(k)
# Sample from a uniform distribution
l = torch.Tensor(2, 2).uniform_(-1, 1)
print(l)
# Generate a random permutation
m = torch.randperm(10)
print(m)
print(m.type())
```
Output:
tensor([[0.6257, 0.5132],
[0.5961, 0.8336]])
tensor([ 0.4738, 0.3879, 0.0394, -0.3446, 0.4863])
tensor([[ 0.6498, -0.8387],
[ 0.3767, -0.9012]])
tensor([9, 6, 1, 3, 0, 2, 7, 5, 8, 4])
torch.LongTensor
Note that the results of torch.arange(0, 11, 1) and torch.randperm(10) are both of type torch.LongTensor, while everything else is torch.FloatTensor.

Tensor attributes

Every Tensor has three attributes: torch.dtype, torch.device and torch.layout. torch.dtype is the data type, which is straightforward.

- torch.device identifies the device on which a torch.Tensor is stored after creation — for example CPU or GPU. GPUs are generally referred to via cuda; if the machine has multiple GPUs, they are addressed as cuda:0, cuda:1, cuda:2, and so on.

torch.layout describes the memory layout of a torch.Tensor. By default a tensor is dense — the kind we use every day, backed by a single contiguous block of memory. Alternatively, a tensor can use a sparse layout, which stores only the coordinates (and values) of the non-zero elements. The full definition of a dense tensor looks like this:
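Both factories accept an explicit dtype if a floating-point result is needed — a small sketch (not from the original text):

```python
import torch

# torch.arange and torch.randperm default to integer (Long) tensors,
# but an explicit dtype overrides that
h = torch.arange(0, 11, 1, dtype=torch.float32)
m = torch.randperm(10, dtype=torch.float32)
print(h.type())  # torch.FloatTensor
print(m.type())  # torch.FloatTensor
```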
```python
a = torch.tensor([1, 2, 3], dtype=torch.float32, device=torch.device('cpu'))
print(a)
```
Output:
tensor([1., 2., 3.])
To define it on a GPU instead:

```python
a = torch.tensor([1, 2, 3], dtype=torch.float32, device=torch.device('cuda:0'))
```
If there is only one graphics card, you can select the default one:

```python
a = torch.tensor([1, 2, 3], dtype=torch.float32, device=torch.device('cuda'))
```
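In practice, device selection is often written defensively so the same code runs with or without a GPU — a minimal sketch:

```python
import torch

# Pick cuda if available, otherwise fall back to cpu
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
a = torch.tensor([1, 2, 3], dtype=torch.float32, device=device)
# An existing tensor can also be moved with .to()
b = torch.zeros(3).to(device)
print(a.device, b.device)
```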
Sparsity describes how many of a Tensor's elements are non-zero: the fewer non-zero elements, the sparser the data; if no element is zero, the data is not sparse at all. Low rank, by contrast, describes correlation within the data itself: the rank captures the linear dependencies among a matrix's vectors — see the section on matrix rank in 線性代數整理(二). Sparsity can make a model much simpler. Machine learning distinguishes parametric and non-parametric models; for a parametric model, if many of the parameters are zero, the model can be simplified, because zero-valued terms can simply be dropped. The parameter count falls and the model becomes simpler, which is why learning sparse parameters is such an important property in machine learning. That is the machine-learning motivation for sparsity. In addition, a sparse representation saves memory. Suppose a 100*100 matrix has a single non-zero value and the other 9999 values are all zero. Stored as a dense tensor it needs 10000 memory cells; stored as a sparse tensor we only need to record the coordinate of the non-zero element. So when defining a sparse tensor we supply the coordinates and the values:

```python
# Coordinates of the sparse tensor's non-zero elements
indices = torch.tensor([[0, 1, 1], [2, 0, 2]])
# Values of the sparse tensor
values = torch.tensor([3, 4, 5], dtype=torch.float32)
# Define a sparse tensor
a = torch.sparse_coo_tensor(indices, values, [2, 4])
print(a)
# Convert to a dense tensor
print(a.to_dense())
```
Output:
tensor(indices=tensor([[0, 1, 1],
[2, 0, 2]]),
values=tensor([3., 4., 5.]),
size=(2, 4), nnz=3, layout=torch.sparse_coo)
tensor([[0., 0., 3., 0.],
[4., 0., 5., 0.]])
After converting to a dense tensor, we can see that the sparse definition placed the three non-zero values 3, 4 and 5 at positions (0,2), (1,0) and (1,2), with zeros everywhere else.

We can also define a diagonal matrix as a sparse tensor:

```python
# Coordinates of the sparse tensor's non-zero elements
indices = torch.tensor([[0, 1, 2, 3], [0, 1, 2, 3]])
# Values of the sparse tensor
values = torch.tensor([3, 4, 5, 6], dtype=torch.float32)
# Define a sparse tensor and convert it to dense
a = torch.sparse_coo_tensor(indices, values, [4, 4]).to_dense()
print(a)
```
Output:
tensor([[3., 0., 0., 0.],
[0., 4., 0., 0.],
[0., 0., 5., 0.],
[0., 0., 0., 6.]])
Why do we put some data on the CPU and some on the GPU? In image processing we should allocate the data involved sensibly: data loading and preprocessing are usually better placed on the CPU, while parameter computation — inference and backpropagation — is normally run on the GPU. Allocating resources this way maximises utilisation and keeps training and iteration fast.

Arithmetic operations on Tensors

- Addition

```python
import torch

if __name__ == "__main__":
    a = torch.tensor([1], dtype=torch.int32)
    b = torch.tensor([2], dtype=torch.int32)
    c = a + b
    print(c)
    c = torch.add(a, b)
    print(c)
    c = a.add(b)
    print(c)
    a.add_(b)
    print(a)
```
Output:
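As a rough sketch of that division of labour (the model and batch here are made up for illustration): data stays on the CPU while it is loaded and preprocessed, and only each batch is moved to the compute device for the forward/backward pass:

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# A toy model standing in for the network being trained
model = torch.nn.Linear(4, 2).to(device)

# Pretend this batch came from a CPU-side DataLoader / preprocessing step
batch = torch.rand(8, 4)              # lives on the CPU
out = model(batch.to(device))         # moved to the compute device only for the forward pass
print(out.shape)
```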
tensor([3], dtype=torch.int32)
tensor([3], dtype=torch.int32)
tensor([3], dtype=torch.int32)
tensor([3], dtype=torch.int32)
PyTorch offers four forms of addition. The first three behave identically; the fourth (add_) writes the result of a + b back into a.

A vector or matrix can also be added to a scalar, in which case the scalar is added to every component:

```python
import torch

if __name__ == "__main__":
    a = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.int32)
    b = torch.tensor([2], dtype=torch.int32)
    c = a + b
    print(c)
    c = torch.add(a, b)
    print(c)
    c = a.add(b)
    print(c)
    a.add_(b)
    print(a)
```
Output:
tensor([[3, 4, 5],
[6, 7, 8]], dtype=torch.int32)
tensor([[3, 4, 5],
[6, 7, 8]], dtype=torch.int32)
tensor([[3, 4, 5],
[6, 7, 8]], dtype=torch.int32)
tensor([[3, 4, 5],
[6, 7, 8]], dtype=torch.int32)
If both operands are vectors or matrices, their last dimensions must have the same length (or one of them must be 1), otherwise an error is raised:

```python
import torch

if __name__ == "__main__":
    a = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.int32)
    b = torch.tensor([2, 4, 6], dtype=torch.int32)
    c = a + b
    print(c)
    c = torch.add(a, b)
    print(c)
    c = a.add(b)
    print(c)
    a.add_(b)
    print(a)
```
Output:
tensor([[ 3, 6, 9],
[ 6, 9, 12]], dtype=torch.int32)
tensor([[ 3, 6, 9],
[ 6, 9, 12]], dtype=torch.int32)
tensor([[ 3, 6, 9],
[ 6, 9, 12]], dtype=torch.int32)
tensor([[ 3, 6, 9],
[ 6, 9, 12]], dtype=torch.int32)
- Subtraction

```python
import torch

if __name__ == "__main__":
    a = torch.tensor([1], dtype=torch.int32)
    b = torch.tensor([2], dtype=torch.int32)
    c = b - a
    print(c)
    c = torch.sub(b, a)
    print(c)
    c = b.sub(a)
    print(c)
    b.sub_(a)
    print(b)
```
Output:
tensor([1], dtype=torch.int32)
tensor([1], dtype=torch.int32)
tensor([1], dtype=torch.int32)
tensor([1], dtype=torch.int32)
The remaining rules for subtraction are the same as for addition.

- Multiplication

Hadamard product (element-wise multiplication)

```python
import torch

if __name__ == "__main__":
    a = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.int32)
    b = torch.tensor([2, 4, 6], dtype=torch.int32)
    c = a * b
    print(c)
    c = torch.mul(a, b)
    print(c)
    c = a.mul(b)
    print(c)
    a.mul_(b)
    print(a)
```
Output:
tensor([[ 2, 8, 18],
[ 8, 20, 36]], dtype=torch.int32)
tensor([[ 2, 8, 18],
[ 8, 20, 36]], dtype=torch.int32)
tensor([[ 2, 8, 18],
[ 8, 20, 36]], dtype=torch.int32)
tensor([[ 2, 8, 18],
[ 8, 20, 36]], dtype=torch.int32)
- Division

```python
import torch

if __name__ == "__main__":
    a = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.float32)
    b = torch.tensor([2, 4, 6], dtype=torch.int32)
    c = a / b
    print(c)
    c = torch.div(a, b)
    print(c)
    c = a.div(b)
    print(c)
    a.div_(b)
    print(a)
```
Output:
tensor([[0.5000, 0.5000, 0.5000],
[2.0000, 1.2500, 1.0000]])
tensor([[0.5000, 0.5000, 0.5000],
[2.0000, 1.2500, 1.0000]])
tensor([[0.5000, 0.5000, 0.5000],
[2.0000, 1.2500, 1.0000]])
tensor([[0.5000, 0.5000, 0.5000],
[2.0000, 1.2500, 1.0000]])
- Matrix operations

2-D matrix multiplication

```python
import torch

if __name__ == "__main__":
    a = torch.Tensor([[1, 2, 3], [4, 5, 6]])
    b = torch.Tensor([[2, 4], [11, 13], [7, 9]])
    c = a @ b
    print(c)
    c = torch.mm(a, b)
    print(c)
    c = torch.matmul(a, b)
    print(c)
    c = a.matmul(b)
    print(c)
    c = a.mm(b)
    print(c)
```
Output:
tensor([[ 45., 57.],
[105., 135.]])
tensor([[ 45., 57.],
[105., 135.]])
tensor([[ 45., 57.],
[105., 135.]])
tensor([[ 45., 57.],
[105., 135.]])
tensor([[ 45., 57.],
[105., 135.]])
For the mathematical meaning of matrix multiplication, see the matrix-matrix multiplication section in 線性代數整理.

For higher-dimensional tensors (dim > 2), matrix multiplication is defined only over the last two dimensions; the leading dimensions must agree (they act like batch indices), and the operation to use is torch.matmul().

```python
import torch

if __name__ == "__main__":
    a = torch.ones(1, 2, 3, 4)
    b = torch.ones(1, 2, 4, 3)
    c = torch.matmul(a, b)
    print(c)
    c = a.matmul(b)
    print(c)
```
Output:
tensor([[[[4., 4., 4.],
[4., 4., 4.],
[4., 4., 4.]],
[[4., 4., 4.],
[4., 4., 4.],
[4., 4., 4.]]]])
tensor([[[[4., 4., 4.],
[4., 4., 4.],
[4., 4., 4.]],
[[4., 4., 4.],
[4., 4., 4.],
[4., 4., 4.]]]])
This computation actually acts on the 3, 4 dimensions of a and the 4, 3 dimensions of b, while the leading two dimensions must agree — here both are 1, 2.

- Power operations

```python
import torch

if __name__ == "__main__":
    a = torch.Tensor([1, 2, 3])
    c = torch.pow(a, 2)
    print(c)
    c = a**2
    print(c)
    a.pow_(2)
    print(a)
    # Compute e to the power a
    a = torch.Tensor([2])
    c = torch.exp(a)
    print(c)
    c = a.exp_()
    print(c)
```
Output:
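In fact torch.matmul is slightly more permissive than strict equality of the leading dimensions: they may also broadcast, just like element-wise operations. A small check (an illustration, not from the original text):

```python
import torch

a = torch.ones(1, 2, 3, 4)
b = torch.ones(2, 4, 3)        # leading dims broadcast: (1, 2) with (2,) -> (1, 2)
c = torch.matmul(a, b)
print(c.shape)                 # torch.Size([1, 2, 3, 3])
```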
tensor([1., 4., 9.])
tensor([1., 4., 9.])
tensor([1., 4., 9.])
tensor([7.3891])
tensor([7.3891])
- Square roots

```python
import torch

if __name__ == "__main__":
    a = torch.Tensor([1, 2, 3])
    c = torch.sqrt(a)
    print(c)
    c = a.sqrt()
    print(c)
    a.sqrt_()
    print(a)
```
Output:
tensor([1.0000, 1.4142, 1.7321])
tensor([1.0000, 1.4142, 1.7321])
tensor([1.0000, 1.4142, 1.7321])
- Logarithms

```python
import torch

if __name__ == "__main__":
    a = torch.Tensor([1, 2, 3])
    c = torch.log2(a)
    print(c)
    c = torch.log10(a)
    print(c)
    c = torch.log(a)
    print(c)
    torch.log_(a)
    print(a)
```
Output:
tensor([0.0000, 1.0000, 1.5850])
tensor([0.0000, 0.3010, 0.4771])
tensor([0.0000, 0.6931, 1.0986])
tensor([0.0000, 0.6931, 1.0986])
In-place operations and broadcasting

- In-place means performing an operation without allocating a temporary for the result — the operation happens "in place", also called an in-situ operation.

```python
import torch

if __name__ == "__main__":
    a = torch.Tensor([1, 2, 3])
    b = torch.Tensor([4])
    a = a + b
    print(a)
```
Output:
tensor([5., 6., 7.])
The add_, sub_, mul_ and similar methods mentioned earlier are all in-place.

- Broadcasting: tensor arguments can be automatically expanded to the same size. Two conditions must hold:

- every tensor has at least one dimension

- the shapes are compatible when right-aligned

```python
import torch

if __name__ == "__main__":
    a = torch.rand(2, 1, 1)
    print(a)
    b = torch.rand(3)
    print(b)
    c = a + b
    print(c)
    print(c.shape)
```
Output:
tensor([[[0.9496]],
[[0.5661]]])
tensor([0.0402, 0.8962, 0.5040])
tensor([[[0.9898, 1.8458, 1.4536]],
[[0.6063, 1.4623, 1.0701]]])
torch.Size([2, 1, 3])
In this example, a's lowest dimension has length 1 but it has higher dimensions, while b is a 3-vector with no higher dimensions. From the shape of c we can see that c's higher dimensions follow a, while its last dimension follows b. In other words, along the last dimension the two tensors must either have equal lengths or one of them must be 1; otherwise an error is raised. That is the broadcasting mechanism.

```python
import torch

if __name__ == "__main__":
    a = torch.rand(2, 4, 1, 3)
    print(a)
    b = torch.rand(4, 2, 3)
    print(b)
    c = a + b
    print(c)
    print(c.shape)
```
Output:
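The failure case is easy to demonstrate — a small sketch (not from the original text): when the trailing dimensions are 2 and 3 and neither is 1, the addition raises an error:

```python
import torch

a = torch.rand(2, 1, 1)
b = torch.rand(3)
print((a + b).shape)           # torch.Size([2, 1, 3]): trailing dims 1 vs 3 broadcast

# Trailing dims 2 vs 3 (neither is 1) cannot broadcast
try:
    torch.rand(2, 1, 2) + torch.rand(3)
    broadcast_failed = False
except RuntimeError:
    broadcast_failed = True
print(broadcast_failed)        # True
```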
tensor([[[[0.1862, 0.8673, 0.2926]],
[[0.6385, 0.6885, 0.8268]],
[[0.3837, 0.3433, 0.0975]],
[[0.4689, 0.4580, 0.4023]]],
[[[0.1647, 0.5968, 0.5279]],
[[0.8252, 0.7446, 0.1916]],
[[0.9649, 0.6015, 0.5151]],
[[0.7504, 0.8202, 0.7865]]]])
tensor([[[0.7811, 0.8357, 0.2585],
[0.8866, 0.3935, 0.4450]],
[[0.2543, 0.7985, 0.1959],
[0.5357, 0.3883, 0.4426]],
[[0.8317, 0.2597, 0.9586],
[0.2829, 0.8665, 0.2853]],
[[0.7220, 0.7107, 0.9395],
[0.8345, 0.0955, 0.3690]]])
tensor([[[[0.9673, 1.7029, 0.5510],
[1.0727, 1.2608, 0.7375]],
[[0.8928, 1.4871, 1.0227],
[1.1742, 1.0768, 1.2694]],
[[1.2154, 0.6029, 1.0561],
[0.6666, 1.2098, 0.3828]],
[[1.1909, 1.1687, 1.3418],
[1.3034, 0.5535, 0.7713]]],
[[[0.9458, 1.4325, 0.7864],
[1.0512, 0.9904, 0.9729]],
[[1.0795, 1.5432, 0.3876],
[1.3609, 1.1329, 0.6342]],
[[1.7966, 0.8611, 1.4737],
[1.2478, 1.4680, 0.8004]],
[[1.4724, 1.5309, 1.7260],
[1.5849, 0.9157, 1.1555]]]])
torch.Size([2, 4, 2, 3])
Rounding and modulo operations

```python
import torch

if __name__ == "__main__":
    a = torch.rand(2, 2)
    a.mul_(10)
    print(a)
    # Round down
    print(torch.floor(a))
    # Round up
    print(torch.ceil(a))
    # Round to nearest
    print(torch.round(a))
    # Keep the integer part
    print(torch.trunc(a))
    # Keep the fractional part
    print(torch.frac(a))
    # Modulo
    print(a % 2)
```
Output:
tensor([[5.8996, 9.2745],
[1.0162, 8.2628]])
tensor([[5., 9.],
[1., 8.]])
tensor([[ 6., 10.],
[ 2., 9.]])
tensor([[6., 9.],
[1., 8.]])
tensor([[5., 9.],
[1., 8.]])
tensor([[0.8996, 0.2745],
[0.0162, 0.2628]])
tensor([[1.8996, 1.2745],
[1.0162, 0.2628]])
Comparison operations

```python
import torch
import numpy as np

if __name__ == "__main__":
    a = torch.Tensor([[1, 2, 3], [4, 5, 6]])
    b = torch.Tensor([[1, 4, 9], [6, 5, 7]])
    c = torch.rand(2, 4)
    d = a
    print(a)
    print(b)
    # Element-wise equality (shapes must match);
    # returns a boolean tensor of the same shape
    print(torch.eq(a, b))
    print(torch.eq(a, d))
    # Whether both shape and values match; the two tensors may have
    # different shapes, in which case False is returned
    print(torch.equal(a, b))
    print(torch.equal(a, c))
    # Element-wise "greater than or equal" (shapes must match);
    # returns a boolean tensor of the same shape
    print(torch.ge(a, b))
    # Element-wise "greater than"
    print(torch.gt(a, b))
    # Element-wise "less than or equal"
    print(torch.le(a, b))
    # Element-wise "less than"
    print(torch.lt(a, b))
    # Element-wise "not equal"
    print(torch.ne(a, b))
```
Output:
tensor([[1., 2., 3.],
[4., 5., 6.]])
tensor([[1., 4., 9.],
[6., 5., 7.]])
tensor([[ True, False, False],
[False, True, False]])
tensor([[True, True, True],
[True, True, True]])
False
False
tensor([[ True, False, False],
[False, True, False]])
tensor([[False, False, False],
[False, False, False]])
tensor([[True, True, True],
[True, True, True]])
tensor([[False, True, True],
[ True, False, True]])
tensor([[False, True, True],
[ True, False, True]])
Sorting

```python
import torch

a = torch.Tensor([1, 4, 4, 3, 5])
# Ascending sort
print(torch.sort(a))
# Descending sort
print(torch.sort(a, descending=True))
b = torch.Tensor([[1, 4, 4, 3, 5], [2, 3, 1, 3, 5]])
print(b.shape)
# Ascending sort (along the last dimension by default)
print(torch.sort(b))
# Ascending sort along the first dimension
print(torch.sort(b, dim=0))
# Descending sort
print(torch.sort(b, descending=True))
# Descending sort along the first dimension
print(torch.sort(b, dim=0, descending=True))
```
Output:
torch.return_types.sort(
values=tensor([1., 3., 4., 4., 5.]),
indices=tensor([0, 3, 1, 2, 4]))
torch.return_types.sort(
values=tensor([5., 4., 4., 3., 1.]),
indices=tensor([4, 1, 2, 3, 0]))
torch.Size([2, 5])
torch.return_types.sort(
values=tensor([[1., 3., 4., 4., 5.],
[1., 2., 3., 3., 5.]]),
indices=tensor([[0, 3, 1, 2, 4],
[2, 0, 1, 3, 4]]))
torch.return_types.sort(
values=tensor([[1., 3., 1., 3., 5.],
[2., 4., 4., 3., 5.]]),
indices=tensor([[0, 1, 1, 0, 0],
[1, 0, 0, 1, 1]]))
torch.return_types.sort(
values=tensor([[5., 4., 4., 3., 1.],
[5., 3., 3., 2., 1.]]),
indices=tensor([[4, 1, 2, 3, 0],
[4, 1, 3, 0, 2]]))
torch.return_types.sort(
values=tensor([[2., 4., 4., 3., 5.],
[1., 3., 1., 3., 5.]]),
indices=tensor([[1, 0, 0, 0, 0],
[0, 1, 1, 1, 1]]))
Top K

```python
import torch

a = torch.Tensor([[1, 4, 4, 3, 5], [2, 3, 1, 3, 6]])
# Find the 1 largest value along the first dimension
print(torch.topk(a, k=1, dim=0))
# Find the 2 largest values along the first dimension; a only has
# 2 rows in that dimension, so k can be at most 2
print(torch.topk(a, k=2, dim=0))
# Find the 2 largest values along the second dimension; here k can be
# up to 5, since that dimension has 5 values
print(torch.topk(a, k=2, dim=1))
```
Output:
torch.return_types.topk(
values=tensor([[2., 4., 4., 3., 6.]]),
indices=tensor([[1, 0, 0, 0, 1]]))
torch.return_types.topk(
values=tensor([[2., 4., 4., 3., 6.],
[1., 3., 1., 3., 5.]]),
indices=tensor([[1, 0, 0, 0, 1],
[0, 1, 1, 1, 0]]))
torch.return_types.topk(
values=tensor([[5., 4.],
[6., 3.]]),
indices=tensor([[4, 1],
[4, 1]]))
The k-th smallest value

```python
import torch

a = torch.Tensor([[1, 4, 4, 3, 5], [2, 3, 1, 3, 6], [4, 5, 6, 7, 8]])
# Find the 2nd smallest value along the first dimension
print(torch.kthvalue(a, k=2, dim=0))
# Find the 2nd smallest value along the second dimension
print(torch.kthvalue(a, k=2, dim=1))
```
Output:
torch.return_types.kthvalue(
values=tensor([2., 4., 4., 3., 6.]),
indices=tensor([1, 0, 0, 0, 1]))
torch.return_types.kthvalue(
values=tensor([3., 2., 5.]),
indices=tensor([3, 0, 1]))
Validity checks

```python
import torch
import numpy as np

a = torch.rand(2, 3)
b = torch.Tensor([1, 2, np.nan])
print(a)
# Finite?
print(torch.isfinite(a))
print(torch.isfinite(a / 0))
# Infinite?
print(torch.isinf(a / 0))
# NaN?
print(torch.isnan(a))
print(torch.isnan(b))
```
Output:
tensor([[0.8657, 0.4002, 0.3988],
[0.2066, 0.5564, 0.3181]])
tensor([[True, True, True],
[True, True, True]])
tensor([[False, False, False],
[False, False, False]])
tensor([[True, True, True],
[True, True, True]])
tensor([[False, False, False],
[False, False, False]])
tensor([False, False, True])
Trigonometric functions

```python
import torch

if __name__ == '__main__':
    a = torch.Tensor([0, 0, 0])
    print(torch.cos(a))
```
Output:
tensor([1., 1., 1.])
Statistical functions

```python
import torch

if __name__ == '__main__':
    a = torch.rand(2, 2)
    print(a)
    # Mean
    print(torch.mean(a))
    # Mean along the first dimension
    print(torch.mean(a, dim=0))
    # Sum
    print(torch.sum(a))
    # Sum along the first dimension
    print(torch.sum(a, dim=0))
    # Product of all elements
    print(torch.prod(a))
    # Product along the first dimension
    print(torch.prod(a, dim=0))
    # Index of the maximum along the first dimension
    print(torch.argmax(a, dim=0))
    # Index of the minimum along the first dimension
    print(torch.argmin(a, dim=0))
    # Standard deviation
    print(torch.std(a))
    # Variance
    print(torch.var(a))
    # Median (the middle value)
    print(torch.median(a))
    # Mode (the most frequent value)
    print(torch.mode(a))
    a = torch.rand(2, 2) * 10
    print(a)
    # Histogram with 6 bins; min=max=0 means use the data's own min and max
    print(torch.histc(a, 6, 0, 0))
    a = torch.randint(0, 10, [10])
    print(a)
    print(torch.bincount(a))
```
Output:
tensor([[0.3333, 0.3611],
[0.4208, 0.6395]])
tensor(0.4387)
tensor([0.3771, 0.5003])
tensor(1.7547)
tensor([0.7541, 1.0006])
tensor(0.0324)
tensor([0.1403, 0.2309])
tensor([1, 1])
tensor([0, 0])
tensor(0.1388)
tensor(0.0193)
tensor(0.3611)
torch.return_types.mode(
values=tensor([0.3333, 0.4208]),
indices=tensor([0, 0]))
tensor([[1.9862, 7.6381],
[9.2323, 7.4402]])
tensor([1., 0., 0., 0., 2., 1.])
tensor([1, 1, 4, 1, 2, 7, 9, 1, 4, 7])
tensor([0, 4, 1, 0, 2, 0, 0, 2, 0, 1])
Random sampling

```python
import torch

if __name__ == '__main__':
    # Fix the random seed
    torch.manual_seed(1)
    # Means
    mean = torch.rand(1, 2)
    # Standard deviations
    std = torch.rand(1, 2)
    # Normal distribution
    print(torch.normal(mean, std))
```
Output:
tensor([[0.7825, 0.7358]])
Norms

```python
import torch

if __name__ == '__main__':
    a = torch.rand(2, 1)
    b = torch.rand(2, 1)
    print(a)
    print(b)
    # L1, L2 and L3 distances between a and b
    print(torch.dist(a, b, p=1))
    print(torch.dist(a, b, p=2))
    print(torch.dist(a, b, p=3))
    # L2 norm of a
    print(torch.norm(a))
    # L3 norm of a
    print(torch.norm(a, p=3))
    # Frobenius norm of a (for vectors this coincides with the L2 norm)
    print(torch.norm(a, p='fro'))
```
Output:
tensor([[0.3291],
[0.8294]])
tensor([[0.0810],
[0.9734]])
tensor(0.3921)
tensor(0.2869)
tensor(0.2633)
tensor(0.8923)
tensor(0.8463)
tensor(0.8923)
For background on norms, see the Euclidean and Minkowski distances in 機器學習算法整理, and L1/L2 regularisation in 機器學習算法整理(二).

Tensor clamping

```python
import torch

if __name__ == '__main__':
    a = torch.rand(2, 2) * 10
    print(a)
    # Values below 2 become 2, values above 5 become 5,
    # and values in [2, 5] are left unchanged
    a.clamp_(2, 5)
    print(a)
```
Output:
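As a quick sanity check (an illustration, not from the referenced articles): torch.dist(a, b, p) is simply the p-norm of a - b, which we can verify by hand:

```python
import torch

a = torch.tensor([[0.3], [0.8]])
b = torch.tensor([[0.1], [0.9]])
d = a - b
# L1 distance: sum of absolute differences
assert torch.isclose(torch.dist(a, b, p=1), d.abs().sum())
# L2 distance: square root of the sum of squared differences
assert torch.isclose(torch.dist(a, b, p=2), d.pow(2).sum().sqrt())
print(torch.dist(a, b, p=1), torch.dist(a, b, p=2))
```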
tensor([[1.6498, 5.2090],
[9.7682, 2.3269]])
tensor([[2.0000, 5.0000],
[5.0000, 2.3269]])
Tensor indexing and filtering

```python
import torch

if __name__ == '__main__':
    a = torch.rand(4, 4)
    b = torch.rand(4, 4)
    print(a)
    print(b)
    # Where a > 0.5 take the value from a, otherwise from b
    out = torch.where(a > 0.5, a, b)
    print(out)
    # List the coordinates where a is greater than b
    out = torch.where(a > b)
    print(out)
    # Pick rows 0, 3 and 2 of a (first dimension) to build a new tensor
    out = torch.index_select(a, dim=0, index=torch.tensor([0, 3, 2]))
    print(out)
    # Pick columns 0, 3 and 2 of a (second dimension) to build a new tensor
    out = torch.index_select(a, dim=1, index=torch.tensor([0, 3, 2]))
    print(out)
    a = torch.linspace(1, 16, 16).view(4, 4)
    print(a)
    # Along the first dimension, each index value selects the row,
    # while its own position determines which column it is taken from
    out = torch.gather(a, dim=0, index=torch.tensor([[0, 1, 1, 1],
                                                     [0, 1, 2, 2],
                                                     [0, 1, 3, 3]]))
    print(out)
    # Select by index along the second dimension
    out = torch.gather(a, dim=1, index=torch.tensor([[0, 1, 1, 1],
                                                     [0, 1, 2, 2],
                                                     [0, 1, 3, 3]]))
    print(out)
    mask = torch.gt(a, 8)
    print(mask)
    # Select the values where mask is True
    out = torch.masked_select(a, mask)
    print(out)
    # Flatten a into a vector first, then take the values at the given
    # indices; the output is also a vector
    out = torch.take(a, index=torch.tensor([0, 15, 13, 10]))
    print(out)
    a = torch.tensor([[0, 1, 2, 0], [2, 3, 0, 1]])
    # Coordinates of the non-zero elements
    out = torch.nonzero(a)
    print(out)
```
Output:
tensor([[0.2301, 0.4003, 0.6914, 0.4822],
[0.7194, 0.8242, 0.1110, 0.6387],
[0.7997, 0.9174, 0.6136, 0.7631],
[0.9998, 0.5568, 0.9027, 0.7765]])
tensor([[0.4136, 0.4748, 0.0058, 0.3138],
[0.7306, 0.7052, 0.5451, 0.1708],
[0.0622, 0.9961, 0.7769, 0.2812],
[0.5140, 0.5198, 0.2314, 0.2854]])
tensor([[0.4136, 0.4748, 0.6914, 0.3138],
[0.7194, 0.8242, 0.5451, 0.6387],
[0.7997, 0.9174, 0.6136, 0.7631],
[0.9998, 0.5568, 0.9027, 0.7765]])
(tensor([0, 0, 1, 1, 2, 2, 3, 3, 3, 3]), tensor([2, 3, 1, 3, 0, 3, 0, 1, 2, 3]))
tensor([[0.2301, 0.4003, 0.6914, 0.4822],
[0.9998, 0.5568, 0.9027, 0.7765],
[0.7997, 0.9174, 0.6136, 0.7631]])
tensor([[0.2301, 0.4822, 0.6914],
[0.7194, 0.6387, 0.1110],
[0.7997, 0.7631, 0.6136],
[0.9998, 0.7765, 0.9027]])
tensor([[ 1., 2., 3., 4.],
[ 5., 6., 7., 8.],
[ 9., 10., 11., 12.],
[13., 14., 15., 16.]])
tensor([[ 1., 6., 7., 8.],
[ 1., 6., 11., 12.],
[ 1., 6., 15., 16.]])
tensor([[ 1., 2., 2., 2.],
[ 5., 6., 7., 7.],
[ 9., 10., 12., 12.]])
tensor([[False, False, False, False],
[False, False, False, False],
[ True, True, True, True],
[ True, True, True, True]])
tensor([ 9., 10., 11., 12., 13., 14., 15., 16.])
tensor([ 1., 16., 14., 11.])
tensor([[0, 1],
[0, 2],
[1, 0],
[1, 1],
[1, 3]])
Combining and concatenating tensors

```python
import torch

if __name__ == '__main__':
    a = torch.zeros((2, 4))
    b = torch.ones((2, 4))
    # Concatenate along the first dimension; this does not add a new
    # dimension, it only grows that dimension
    out = torch.cat((a, b), dim=0)
    print(out)
    a = torch.linspace(1, 6, 6).view(2, 3)
    b = torch.linspace(7, 12, 6).view(2, 3)
    print(a)
    print(b)
    # Stack a and b as independent elements; this adds a new dimension
    # of size 2 representing the two stacked tensors
    out = torch.stack((a, b), dim=0)
    print(out)
    print(out.shape)
    # Interleave row i of a with row i of b; the new dimension sits at position 1
    out = torch.stack((a, b), dim=1)
    print(out)
    print(out.shape)
    # This recovers the original a
    print(out[:, 0, :])
    # This recovers the original b
    print(out[:, 1, :])
    # Pair up each element of a with the corresponding element of b;
    # the new dimension sits at the last position
    out = torch.stack((a, b), dim=2)
    print(out)
    print(out.shape)
    # This recovers the original a
    print(out[:, :, 0])
    # This recovers the original b
    print(out[:, :, 1])
```
Output:
tensor([[0., 0., 0., 0.],
[0., 0., 0., 0.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]])
tensor([[1., 2., 3.],
[4., 5., 6.]])
tensor([[ 7., 8., 9.],
[10., 11., 12.]])
tensor([[[ 1., 2., 3.],
[ 4., 5., 6.]],
[[ 7., 8., 9.],
[10., 11., 12.]]])
torch.Size([2, 2, 3])
tensor([[[ 1., 2., 3.],
[ 7., 8., 9.]],
[[ 4., 5., 6.],
[10., 11., 12.]]])
torch.Size([2, 2, 3])
tensor([[1., 2., 3.],
[4., 5., 6.]])
tensor([[ 7., 8., 9.],
[10., 11., 12.]])
tensor([[[ 1., 7.],
[ 2., 8.],
[ 3., 9.]],
[[ 4., 10.],
[ 5., 11.],
[ 6., 12.]]])
torch.Size([2, 3, 2])
tensor([[1., 2., 3.],
[4., 5., 6.]])
tensor([[ 7., 8., 9.],
[10., 11., 12.]])
Tensor slicing

```python
import torch

if __name__ == '__main__':
    a = torch.rand(3, 4)
    print(a)
    # Chunk along the first dimension; 3 rows cannot be split evenly in
    # two, so the first chunk gets 2 rows and the second gets 1
    out = torch.chunk(a, 2, dim=0)
    print(out)
    # Chunk along the second dimension; 4 columns split evenly into two
    out = torch.chunk(a, 2, dim=1)
    print(out)
    out = torch.split(a, 2, dim=0)
    print(out)
    out = torch.split(a, 2, dim=1)
    print(out)
    # Split into pieces with the lengths given by the list
    out = torch.split(a, [1, 1, 1], dim=0)
    print(out)
```
Output:
tensor([[0.5683, 0.0800, 0.2068, 0.8908],
[0.8924, 0.8733, 0.6078, 0.8697],
[0.0428, 0.0265, 0.3515, 0.3164]])
(tensor([[0.5683, 0.0800, 0.2068, 0.8908],
[0.8924, 0.8733, 0.6078, 0.8697]]), tensor([[0.0428, 0.0265, 0.3515, 0.3164]]))
(tensor([[0.5683, 0.0800],
[0.8924, 0.8733],
[0.0428, 0.0265]]), tensor([[0.2068, 0.8908],
[0.6078, 0.8697],
[0.3515, 0.3164]]))
(tensor([[0.5683, 0.0800, 0.2068, 0.8908],
[0.8924, 0.8733, 0.6078, 0.8697]]), tensor([[0.0428, 0.0265, 0.3515, 0.3164]]))
(tensor([[0.5683, 0.0800],
[0.8924, 0.8733],
[0.0428, 0.0265]]), tensor([[0.2068, 0.8908],
[0.6078, 0.8697],
[0.3515, 0.3164]]))
(tensor([[0.5683, 0.0800, 0.2068, 0.8908]]), tensor([[0.8924, 0.8733, 0.6078, 0.8697]]), tensor([[0.0428, 0.0265, 0.3515, 0.3164]]))
Tensor reshaping

```python
import torch

if __name__ == '__main__':
    a = torch.rand(2, 3)
    print(a)
    # Flatten a first, then reshape it into the requested shape
    out = torch.reshape(a, (3, 2))
    print(out)
    # Transpose
    print(torch.t(out))
    a = torch.rand(1, 2, 3)
    print(a)
    # Swap two dimensions
    out = torch.transpose(a, 0, 1)
    print(out)
    print(out.shape)
    # Remove dimensions of size 1
    out = torch.squeeze(a)
    print(out)
    print(out.shape)
    # Insert a dimension of size 1 at the given position (here the last)
    out = torch.unsqueeze(a, -1)
    print(out)
    print(out.shape)
    # Split along the second dimension
    out = torch.unbind(a, dim=1)
    print(out)
    # Flip along the second dimension
    print(torch.flip(a, dims=[1]))
    # Flip along the third dimension
    print(torch.flip(a, dims=[2]))
    # Flip along both the second and third dimensions
    print(torch.flip(a, dims=[1, 2]))
    # Rotate 90 degrees counter-clockwise
    out = torch.rot90(a)
    print(out)
    print(out.shape)
    # Rotate 180 degrees counter-clockwise
    out = torch.rot90(a, 2)
    print(out)
    print(out.shape)
    # Rotate 360 degrees counter-clockwise
    out = torch.rot90(a, 4)
    print(out)
    print(out.shape)
    # Rotate 90 degrees clockwise
    out = torch.rot90(a, -1)
    print(out)
    print(out.shape)
```
Output:
tensor([[7.7492e-01, 8.1314e-01, 6.4422e-01],
[9.8577e-01, 5.3938e-01, 2.3049e-04]])
tensor([[7.7492e-01, 8.1314e-01],
[6.4422e-01, 9.8577e-01],
[5.3938e-01, 2.3049e-04]])
tensor([[7.7492e-01, 6.4422e-01, 5.3938e-01],
[8.1314e-01, 9.8577e-01, 2.3049e-04]])
tensor([[[0.4003, 0.3144, 0.7292],
[0.0459, 0.5821, 0.7332]]])
tensor([[[0.4003, 0.3144, 0.7292]],
[[0.0459, 0.5821, 0.7332]]])
torch.Size([2, 1, 3])
tensor([[0.4003, 0.3144, 0.7292],
[0.0459, 0.5821, 0.7332]])
torch.Size([2, 3])
tensor([[[[0.4003],
[0.3144],
[0.7292]],
[[0.0459],
[0.5821],
[0.7332]]]])
torch.Size([1, 2, 3, 1])
(tensor([[0.4003, 0.3144, 0.7292]]), tensor([[0.0459, 0.5821, 0.7332]]))
tensor([[[0.0459, 0.5821, 0.7332],
[0.4003, 0.3144, 0.7292]]])
tensor([[[0.7292, 0.3144, 0.4003],
[0.7332, 0.5821, 0.0459]]])
tensor([[[0.7332, 0.5821, 0.0459],
[0.7292, 0.3144, 0.4003]]])
tensor([[[0.0459, 0.5821, 0.7332]],
[[0.4003, 0.3144, 0.7292]]])
torch.Size([2, 1, 3])
tensor([[[0.0459, 0.5821, 0.7332],
[0.4003, 0.3144, 0.7292]]])
torch.Size([1, 2, 3])
tensor([[[0.4003, 0.3144, 0.7292],
[0.0459, 0.5821, 0.7332]]])
torch.Size([1, 2, 3])
tensor([[[0.4003, 0.3144, 0.7292]],
[[0.0459, 0.5821, 0.7332]]])
torch.Size([2, 1, 3])
Tensor filling

```python
import torch

if __name__ == '__main__':
    # Define a 2x3 matrix filled with the value 10
    a = torch.full((2, 3), 10)
    print(a)
```
Output:
tensor([[10, 10, 10],
[10, 10, 10]])
Computing derivatives

```python
import torch
from torch.autograd import Variable

if __name__ == '__main__':
    # Equivalent to x = Variable(torch.ones(2, 2), requires_grad=True)
    # requires_grad=True adds x to the backward graph so its gradient is computed
    x = torch.ones(2, 2, requires_grad=True)
    y = x + 2
    z = y**2 * 3
    # Differentiate with respect to x
    # Equivalent to torch.autograd.backward(z, grad_tensors=torch.ones(2, 2))
    z.backward(torch.ones(2, 2))
    print(x.grad)
    print(y.grad)
    print(x.grad_fn)
    print(y.grad_fn)
    print(z.grad_fn)
```
Output:
tensor([[18., 18.],
[18., 18.]])
None
None
<AddBackward0 object at 0x7fad8f1b0c50>
<MulBackward0 object at 0x7fad8f1b0c50>
This is the derivative of a composite function: z'(x) = 6y·(x+2)' = 6y·1 = 6(1+2) = 18, and since there are four 1s, there are four 18s. For how derivatives are computed, see the section on derivatives and differentials of single-variable functions in 高等數學整理.

Variable has since been merged into Tensor. Each tensor's requires_grad flag controls whether gradients are computed for it. This is used to freeze the parameters of certain layers — for example, keeping a pretrained network's parameters fixed and training only the layers after it, or, in a multi-task network, computing gradients for one branch while freezing the parameters of the others.
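A minimal sketch of that freezing idea (the two-layer model here is invented for illustration):

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(4, 8),   # pretend this is the pretrained part
    torch.nn.Linear(8, 2),   # only this head should be trained
)
# Freeze the first layer: its parameters receive no gradients
for p in model[0].parameters():
    p.requires_grad = False

out = model(torch.rand(3, 4)).sum()
out.backward()
print(model[0].weight.grad)              # None: frozen
print(model[1].weight.grad is not None)  # True: still trainable
```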
- A few Autograd concepts

1、Leaf tensors (leaf)

The figure above shows a set of computations we want PyTorch to perform; in this graph X is a leaf tensor, i.e. a leaf node. To have its gradient computed in PyTorch, a node must be a leaf node; only a leaf tensor gets its gradient populated. If we print Y's gradient, None is returned.

2、grad vs grad_fn

- grad: the tensor's gradient value. The previous gradient must be zeroed before each backward computation, otherwise gradient values keep accumulating.

- grad_fn: usually None for leaf nodes; only result nodes have a meaningful grad_fn, which indicates what kind of gradient function produced them.
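The accumulation behaviour is easy to observe directly — a small sketch:

```python
import torch

x = torch.ones(2, requires_grad=True)
(x * 3).sum().backward()
g1 = x.grad.clone()            # tensor([3., 3.])

# Without zeroing, a second backward pass accumulates into .grad
(x * 3).sum().backward()
g2 = x.grad.clone()            # tensor([6., 6.])
print(g1, g2)

x.grad.zero_()                 # what optimizer.zero_grad() does for you
```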
3、The backward function

```python
torch.autograd.backward(tensors, grad_tensors=None, retain_graph=None, create_graph=False)
```

- tensors: the tensor(s) to compute gradients for. torch.autograd.backward(z) == z.backward() when z is a scalar; for non-scalar tensors, grad_tensors must be specified.

- grad_tensors: used when computing the gradient of a matrix. It is itself a tensor whose shape generally has to match the tensor above.

- retain_graph: normally PyTorch destroys the computation graph automatically after one backward call, so to call backward repeatedly on some variable, set this parameter to True.

- create_graph: if True, builds a graph of the derivative itself, which makes computing higher-order derivatives convenient.
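For example, calling backward twice on the same graph only works if the first call passes retain_graph=True — a sketch:

```python
import torch

x = torch.ones(2, requires_grad=True)
z = (x ** 2).sum()
z.backward(retain_graph=True)   # keep the graph alive for a second pass
z.backward()                    # would raise a RuntimeError without retain_graph above
print(x.grad)                   # gradients from both passes accumulate: tensor([4., 4.])
```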
The signature of torch.autograd.grad is:

```python
def grad(
    outputs: _TensorOrTensors,
    inputs: _TensorOrTensors,
    grad_outputs: Optional[_TensorOrTensors] = None,
    retain_graph: Optional[bool] = None,
    create_graph: bool = False,
    only_inputs: bool = True,
    allow_unused: bool = False
)
```
It computes and returns the sum of gradients of outputs with respect to inputs. outputs: the dependent variable, i.e. the function to differentiate. inputs: the independent variables. grad_outputs: same as in backward. only_inputs: compute gradients only for inputs. allow_unused (bool, optional): if False, raise an error for inputs that are not used when computing outputs (and whose gradient would therefore always be zero).

4、Other functions in the torch.autograd package

- torch.autograd.enable_grad: a context manager that enables gradient computation

- torch.autograd.no_grad: a context manager that disables gradient computation

- torch.autograd.set_grad_enabled(mode): a context manager that switches gradient computation on or off.
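A minimal usage sketch: unlike backward, torch.autograd.grad returns the gradients instead of accumulating them into .grad:

```python
import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()
# Returns the gradient d y / d x rather than writing it into x.grad
(g,) = torch.autograd.grad(outputs=y, inputs=x)
print(g)            # tensor([4., 6.])
print(x.grad)       # None: grad() does not touch .grad
```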
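A small sketch of all three context managers (assuming the default mode is grad-enabled):

```python
import torch

x = torch.ones(2, requires_grad=True)
with torch.no_grad():
    y = x * 2                   # not tracked by autograd
print(y.requires_grad)          # False

with torch.enable_grad():
    z = x * 2
print(z.requires_grad)          # True

torch.set_grad_enabled(False)   # global switch, also usable as a context manager
w = x * 2
torch.set_grad_enabled(True)
print(w.requires_grad)          # False
```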
The Function class in Autograd

```python
import torch


class line(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w, x, b):
        '''
        Forward computation
        :param ctx: context object
        :param w:
        :param x:
        :param b:
        :return:
        '''
        # y = wx + b
        ctx.save_for_backward(w, x, b)
        return w * x + b

    @staticmethod
    def backward(ctx, grad_out):
        '''
        Backward pass
        :param ctx: context object
        :param grad_out: gradient from the next level up
        :return:
        '''
        w, x, b = ctx.saved_tensors
        # Partial derivative with respect to w
        grad_w = grad_out * x
        # Partial derivative with respect to x
        grad_x = grad_out * w
        # Partial derivative with respect to b
        grad_b = grad_out
        return grad_w, grad_x, grad_b


if __name__ == '__main__':
    w = torch.rand(2, 2, requires_grad=True)
    x = torch.rand(2, 2, requires_grad=True)
    b = torch.rand(2, 2, requires_grad=True)
    out = line.apply(w, x, b)
    out.backward(torch.ones(2, 2))
    print(w, x, b)
    print(w.grad, x.grad, b.grad)
```
Output:
tensor([[0.7784, 0.2882],
[0.7826, 0.8178]], requires_grad=True) tensor([[0.4062, 0.4722],
[0.7921, 0.9470]], requires_grad=True) tensor([[0.7012, 0.9489],
[0.2466, 0.1548]], requires_grad=True)
tensor([[0.4062, 0.4722],
[0.7921, 0.9470]]) tensor([[0.7784, 0.2882],
[0.7826, 0.8178]]) tensor([[1., 1.],
[1., 1.]])
This walks through taking the partial derivatives of a linear function; from the results we can see that the partial derivative with respect to w is x, with respect to x is w, and with respect to b is 1.

- Every primitive autograd operation is really two functions that operate on Tensors:

- forward computes the output Tensors from the input Tensors

- backward receives the gradient of the output Tensors with respect to some scalar value and computes the gradient of the input Tensors with respect to that same scalar

- Finally, the apply method runs the operation; it is defined in _FunctionBase, the parent class of Function
Non-zero padding

This is in contrast to ordinary (zero) padding.

```python
import torch

if __name__ == '__main__':
    a = torch.arange(9, dtype=torch.float).reshape((1, 3, 3))
    print(a)
    m = torch.nn.ReflectionPad2d(1)
    # Pad around a with (reflected) non-zero values
    out = m(a)
    print(out)
```
Output:
tensor([[[0., 1., 2.],
[3., 4., 5.],
[6., 7., 8.]]])
tensor([[[4., 3., 4., 5., 4.],
[1., 0., 1., 2., 1.],
[4., 3., 4., 5., 4.],
[7., 6., 7., 8., 7.],
[4., 3., 4., 5., 4.]]])
Building a neural network

For background on neural networks, see Tensorflow深度學習算法整理; it is not repeated here.

Boston housing price prediction

Let's first look at the data:

```python
import numpy as np
import torch
from sklearn import datasets

if __name__ == '__main__':
    # Note: load_boston was removed in scikit-learn 1.2;
    # this code requires an older scikit-learn version
    boston = datasets.load_boston()
    X = torch.from_numpy(boston.data)
    y = torch.from_numpy(boston.target)
    y = torch.unsqueeze(y, -1)
    data = torch.cat((X, y), dim=-1)
    print(data)
    print(data.shape)
```
Output:
tensor([[6.3200e-03, 1.8000e+01, 2.3100e+00, ..., 3.9690e+02, 4.9800e+00,
2.4000e+01],
[2.7310e-02, 0.0000e+00, 7.0700e+00, ..., 3.9690e+02, 9.1400e+00,
2.1600e+01],
[2.7290e-02, 0.0000e+00, 7.0700e+00, ..., 3.9283e+02, 4.0300e+00,
3.4700e+01],
...,
[6.0760e-02, 0.0000e+00, 1.1930e+01, ..., 3.9690e+02, 5.6400e+00,
2.3900e+01],
[1.0959e-01, 0.0000e+00, 1.1930e+01, ..., 3.9345e+02, 6.4800e+00,
2.2000e+01],
[4.7410e-02, 0.0000e+00, 1.1930e+01, ..., 3.9690e+02, 7.8800e+00,
1.1900e+01]], dtype=torch.float64)
torch.Size([506, 14])
```python
import torch
from sklearn import datasets

if __name__ == '__main__':
    boston = datasets.load_boston()
    X = torch.from_numpy(boston.data)
    y = torch.from_numpy(boston.target)
    y = torch.unsqueeze(y, -1)
    data = torch.cat((X, y), dim=-1)
    print(data)
    print(data.shape)
    y = torch.squeeze(y)
    X_train = X[:496]
    y_train = y[:496]
    X_test = X[496:]
    y_test = y[496:]

    class Net(torch.nn.Module):
        def __init__(self, n_feature, n_output):
            super(Net, self).__init__()
            self.hidden = torch.nn.Linear(n_feature, 100)
            self.predict = torch.nn.Linear(100, n_output)

        def forward(self, x):
            out = self.hidden(x)
            out = torch.relu(out)
            out = self.predict(out)
            return out

    net = Net(13, 1)
    loss_func = torch.nn.MSELoss()
    optimizer = torch.optim.Adam(net.parameters(), lr=0.01)
    for i in range(10000):
        pred = net.forward(X_train.float())
        pred = torch.squeeze(pred)
        loss = loss_func(pred, y_train.float()) * 0.001
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        print("item:{},loss:{}".format(i, loss))
        print(pred[:10])
        print(y_train[:10])
    pred = net.forward(X_test.float())
    pred = torch.squeeze(pred)
    loss_test = loss_func(pred, y_test.float()) * 0.001
    print("item:{},loss_test:{}".format(i, loss_test))
    print(pred[:10])
    print(y_test[:10])
```
Output (final training results):
item:9999,loss:0.0034487966913729906
tensor([26.7165, 22.6610, 33.0955, 34.6687, 36.8087, 29.3654, 22.9609, 20.9920,
17.1832, 20.8744], grad_fn=<SliceBackward0>)
tensor([24.0000, 21.6000, 34.7000, 33.4000, 36.2000, 28.7000, 22.9000, 27.1000,
16.5000, 18.9000], dtype=torch.float64)
item:9999,loss_test:0.007662008982151747
tensor([14.5801, 18.2911, 21.3332, 16.9826, 19.6432, 21.8298, 18.5557, 23.6807,
22.3610, 18.0118], grad_fn=<SliceBackward0>)
tensor([19.7000, 18.3000, 21.2000, 17.5000, 16.8000, 22.4000, 20.6000, 23.9000,
22.0000, 11.9000], dtype=torch.float64)
手寫數字識別
import torch import torchvision.datasets as dataset import torchvision.transforms as transforms import torch.utils.data as data_utils if __name__ == '__main__': train_data = dataset.MNIST(root='mnist', train=True, transform=transforms.ToTensor(), download=True) test_data = dataset.MNIST(root='mnist', train=False, transform=transforms.ToTensor(), download=False) train_loader = data_utils.DataLoader(dataset=train_data, batch_size=64, shuffle=True) test_loader = data_utils.DataLoader(dataset=test_data, batch_size=64, shuffle=True) class CNN(torch.nn.Module): def __init__(self): super(CNN, self).__init__() self.conv = torch.nn.Sequential( torch.nn.Conv2d(1, 32, kernel_size=5, padding=2), torch.nn.BatchNorm2d(32), torch.nn.ReLU(), torch.nn.MaxPool2d(2) ) self.fc = torch.nn.Linear(14 * 14 * 32, 10) def forward(self, x): out = self.conv(x) out = out.view(out.size()[0], -1) out = self.fc(out) return out cnn = CNN() loss_func = torch.nn.CrossEntropyLoss() optimizer = torch.optim.Adam(cnn.parameters(), lr=0.01) for epoch in range(10): for i, (images, labels) in enumerate(train_loader): outputs = cnn(images) loss = loss_func(outputs, labels) optimizer.zero_grad() loss.backward() optimizer.step() print("epoch is {}, ite is {}/{}, loss is {}".format(epoch + 1, i, len(train_data) // 64, loss.item())) loss_test = 0 accuracy = 0 for i, (images, labels) in enumerate(test_loader): outputs = cnn(images) loss_test += loss_func(outputs, labels) _, pred = outputs.max(1) accuracy += (pred == labels).sum().item() accuracy = accuracy / len(test_data) loss_test = loss_test / (len(test_data) // 64) print("epoch is {}, accuracy is {}, loss test is {}".format(epoch + 1, accuracy, loss_test.item()))
Output
epoch is 1, ite is 937/937, loss is 0.08334837108850479
epoch is 1, accuracy is 0.9814, loss test is 0.06306721270084381
epoch is 2, ite is 937/937, loss is 0.08257070928812027
epoch is 2, accuracy is 0.9824, loss test is 0.05769834667444229
epoch is 3, ite is 937/937, loss is 0.02539072372019291
epoch is 3, accuracy is 0.9823, loss test is 0.05558949336409569
epoch is 4, ite is 937/937, loss is 0.014101949520409107
epoch is 4, accuracy is 0.982, loss test is 0.05912528932094574
epoch is 5, ite is 937/937, loss is 0.0016860843170434237
epoch is 5, accuracy is 0.9835, loss test is 0.05862809345126152
epoch is 6, ite is 937/937, loss is 0.04285441339015961
epoch is 6, accuracy is 0.9817, loss test is 0.06716518104076385
epoch is 7, ite is 937/937, loss is 0.0026565147563815117
epoch is 7, accuracy is 0.9831, loss test is 0.05950026586651802
epoch is 8, ite is 937/937, loss is 0.02730828896164894
epoch is 8, accuracy is 0.9824, loss test is 0.058563172817230225
epoch is 9, ite is 937/937, loss is 0.00010762683814391494
epoch is 9, accuracy is 0.9828, loss test is 0.0673145055770874
epoch is 10, ite is 937/937, loss is 0.0021532117389142513
epoch is 10, accuracy is 0.9852, loss test is 0.0562417209148407
CIFAR-10 Image Classification
- The VGGNet architecture
VGGNet is a standard purely serial (sequential) network. Such a plain stacked network should not be made too deep, otherwise vanishing gradients set in.
```python
import torch
import torch.nn.functional as F
import torchvision.datasets as dataset
import torchvision.transforms as transforms
import torch.utils.data as data_utils

if __name__ == '__main__':
    train_data = dataset.CIFAR10(root='cifa', train=True, transform=transforms.ToTensor(), download=True)
    test_data = dataset.CIFAR10(root='cifa', train=False, transform=transforms.ToTensor(), download=False)
    train_loader = data_utils.DataLoader(dataset=train_data, batch_size=64, shuffle=True)
    test_loader = data_utils.DataLoader(dataset=test_data, batch_size=64, shuffle=True)

    class VGGbase(torch.nn.Module):
        def __init__(self):
            super(VGGbase, self).__init__()
            self.conv1 = torch.nn.Sequential(
                torch.nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(64),
                torch.nn.ReLU()
            )
            self.max_pooling1 = torch.nn.MaxPool2d(kernel_size=2, stride=2)
            self.conv2_1 = torch.nn.Sequential(
                torch.nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(128),
                torch.nn.ReLU()
            )
            self.conv2_2 = torch.nn.Sequential(
                torch.nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(128),
                torch.nn.ReLU()
            )
            self.max_pooling2 = torch.nn.MaxPool2d(kernel_size=2, stride=2)
            self.conv3_1 = torch.nn.Sequential(
                torch.nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(256),
                torch.nn.ReLU()
            )
            self.conv3_2 = torch.nn.Sequential(
                torch.nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(256),
                torch.nn.ReLU()
            )
            self.max_pooling3 = torch.nn.MaxPool2d(kernel_size=2, stride=2, padding=1)
            self.conv4_1 = torch.nn.Sequential(
                torch.nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(512),
                torch.nn.ReLU()
            )
            self.conv4_2 = torch.nn.Sequential(
                torch.nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(512),
                torch.nn.ReLU()
            )
            self.max_pooling4 = torch.nn.MaxPool2d(kernel_size=2, stride=2)
            self.fc = torch.nn.Linear(512 * 4, 10)

        def forward(self, x):
            batchsize = x.size()[0]
            out = self.conv1(x)
            out = self.max_pooling1(out)
            out = self.conv2_1(out)
            out = self.conv2_2(out)
            out = self.max_pooling2(out)
            out = self.conv3_1(out)
            out = self.conv3_2(out)
            out = self.max_pooling3(out)
            out = self.conv4_1(out)
            out = self.conv4_2(out)
            out = self.max_pooling4(out)
            out = out.view(batchsize, -1)
            out = self.fc(out)
            out = F.log_softmax(out, dim=1)
            return out

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    epoch_num = 200
    lr = 0.01
    net = VGGbase().to(device)
    loss_func = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    # every 5 epochs, decay the learning rate to 0.9x its previous value
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.9)
    for epoch in range(epoch_num):
        net.train()
        for i, (images, labels) in enumerate(train_loader):
            images, labels = images.to(device), labels.to(device)
            outputs = net(images)
            loss = loss_func(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            _, pred = torch.max(outputs, dim=1)
            correct = pred.eq(labels.data).cpu().sum()
            print("epoch is ", epoch, "step ", i, "loss is: ", loss.item(),
                  "mini-batch correct is: ", (100.0 * correct / 64).item())
```
Output (partial)
epoch is 32 step 605 loss is: 0.010804953053593636 mini-batch correct is: 100.0
epoch is 32 step 606 loss is: 0.02166593447327614 mini-batch correct is: 98.4375
epoch is 32 step 607 loss is: 0.1924218237400055 mini-batch correct is: 95.3125
epoch is 32 step 608 loss is: 0.04531310871243477 mini-batch correct is: 96.875
epoch is 32 step 609 loss is: 0.03866473212838173 mini-batch correct is: 98.4375
epoch is 32 step 610 loss is: 0.0039138575084507465 mini-batch correct is: 100.0
epoch is 32 step 611 loss is: 0.009379544295370579 mini-batch correct is: 100.0
epoch is 32 step 612 loss is: 0.2707091271877289 mini-batch correct is: 93.75
epoch is 32 step 613 loss is: 0.016424348577857018 mini-batch correct is: 100.0
epoch is 32 step 614 loss is: 0.001230329042300582 mini-batch correct is: 100.0
epoch is 32 step 615 loss is: 0.013688713312149048 mini-batch correct is: 100.0
epoch is 32 step 616 loss is: 0.0062867505475878716 mini-batch correct is: 100.0
epoch is 32 step 617 loss is: 0.005267560016363859 mini-batch correct is: 100.0
- The ResNet architecture
ResNet combines shortcut (skip) connections with serial stacking. Its networks can be built very deep without running into vanishing gradients.
```python
import torch
import torch.nn.functional as F
import torchvision.datasets as dataset
import torchvision.transforms as transforms
import torch.utils.data as data_utils

if __name__ == '__main__':
    train_data = dataset.CIFAR10(root='cifa', train=True, transform=transforms.ToTensor(), download=True)
    test_data = dataset.CIFAR10(root='cifa', train=False, transform=transforms.ToTensor(), download=False)
    train_loader = data_utils.DataLoader(dataset=train_data, batch_size=64, shuffle=True)
    test_loader = data_utils.DataLoader(dataset=test_data, batch_size=64, shuffle=True)

    class ResBlock(torch.nn.Module):
        def __init__(self, in_channel, out_channel, stride=1):
            super(ResBlock, self).__init__()
            self.layer = torch.nn.Sequential(
                torch.nn.Conv2d(in_channel, out_channel, kernel_size=3, stride=stride, padding=1),
                torch.nn.BatchNorm2d(out_channel),
                torch.nn.ReLU(),
                torch.nn.Conv2d(out_channel, out_channel, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(out_channel)
            )
            self.shortcut = torch.nn.Sequential()
            if in_channel != out_channel or stride > 1:
                self.shortcut = torch.nn.Sequential(
                    torch.nn.Conv2d(in_channel, out_channel, kernel_size=3, stride=stride, padding=1),
                    torch.nn.BatchNorm2d(out_channel)
                )

        def forward(self, x):
            out1 = self.layer(x)
            out2 = self.shortcut(x)
            out = out1 + out2
            out = F.relu(out)
            return out

    class ResNet(torch.nn.Module):
        def make_layer(self, block, out_channel, stride, num_block):
            layers_list = []
            for i in range(num_block):
                if i == 0:
                    in_stride = stride
                else:
                    in_stride = 1
                layers_list.append(block(self.in_channel, out_channel, in_stride))
                self.in_channel = out_channel
            return torch.nn.Sequential(*layers_list)

        def __init__(self):
            super(ResNet, self).__init__()
            self.in_channel = 32
            self.conv1 = torch.nn.Sequential(
                torch.nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(32),
                torch.nn.ReLU()
            )
            self.layer1 = self.make_layer(ResBlock, 64, 2, 2)
            self.layer2 = self.make_layer(ResBlock, 128, 2, 2)
            self.layer3 = self.make_layer(ResBlock, 256, 2, 2)
            self.layer4 = self.make_layer(ResBlock, 512, 2, 2)
            self.fc = torch.nn.Linear(512, 10)

        def forward(self, x):
            out = self.conv1(x)
            out = self.layer1(out)
            out = self.layer2(out)
            out = self.layer3(out)
            out = self.layer4(out)
            out = F.avg_pool2d(out, 2)
            out = out.view(out.size()[0], -1)
            out = self.fc(out)
            out = F.log_softmax(out, dim=1)
            return out

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    epoch_num = 200
    lr = 0.01
    net = ResNet().to(device)
    loss_func = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    # every 5 epochs, decay the learning rate to 0.9x its previous value
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.9)
    for epoch in range(epoch_num):
        net.train()
        for i, (images, labels) in enumerate(train_loader):
            images, labels = images.to(device), labels.to(device)
            outputs = net(images)
            loss = loss_func(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            _, pred = torch.max(outputs, dim=1)
            correct = pred.eq(labels.data).cpu().sum()
            print("epoch is ", epoch, "step ", i, "loss is: ", loss.item(),
                  "mini-batch correct is: ", (100.0 * correct / 64).item())
```
- The MobileNet architecture
For a detailed discussion of MobileNet, see the MobileNet section of Tensorflow深度學習算法整理. MobileNet replaces a standard convolution unit with a combination of grouped (depthwise) convolution and 1×1 pointwise convolution, compressing both the computation and the parameter count and yielding a much lighter network.
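The parameter savings can be checked directly in PyTorch. This sketch (not part of the original code) compares one standard 3×3 convolution against the depthwise 3×3 + pointwise 1×1 pair that replaces it, for an assumed 128-to-256-channel layer:

```python
import torch

# Parameter count of one standard 3x3 conv vs. the depthwise-separable
# replacement MobileNet uses (depthwise 3x3 + pointwise 1x1); BatchNorm
# layers are ignored for simplicity.
in_c, out_c, k = 128, 256, 3

standard = torch.nn.Conv2d(in_c, out_c, kernel_size=k, padding=1, bias=False)
depthwise = torch.nn.Conv2d(in_c, in_c, kernel_size=k, padding=1, groups=in_c, bias=False)
pointwise = torch.nn.Conv2d(in_c, out_c, kernel_size=1, bias=False)

def n_params(*mods):
    return sum(p.numel() for m in mods for p in m.parameters())

print(n_params(standard))              # 128*256*3*3 = 294912
print(n_params(depthwise, pointwise))  # 128*3*3 + 128*256 = 33920
```

Here the separable version uses roughly 8.7× fewer parameters, and the gap widens as the channel count grows.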
```python
import torch
import torch.nn.functional as F
import torchvision.datasets as dataset
import torchvision.transforms as transforms
import torch.utils.data as data_utils

if __name__ == '__main__':
    train_data = dataset.CIFAR10(root='cifa', train=True, transform=transforms.ToTensor(), download=True)
    test_data = dataset.CIFAR10(root='cifa', train=False, transform=transforms.ToTensor(), download=False)
    train_loader = data_utils.DataLoader(dataset=train_data, batch_size=64, shuffle=True)
    test_loader = data_utils.DataLoader(dataset=test_data, batch_size=64, shuffle=True)

    class MobileNet(torch.nn.Module):
        def conv_dw(self, in_channel, out_channel, stride):
            return torch.nn.Sequential(
                # depthwise separable convolution
                torch.nn.Conv2d(in_channel, in_channel, kernel_size=3, stride=stride,
                                padding=1, groups=in_channel, bias=False),
                torch.nn.BatchNorm2d(in_channel),
                torch.nn.ReLU(),
                torch.nn.Conv2d(in_channel, out_channel, kernel_size=1, stride=1,
                                padding=0, bias=False),
                torch.nn.BatchNorm2d(out_channel),
                torch.nn.ReLU()
            )

        def __init__(self):
            super(MobileNet, self).__init__()
            self.conv1 = torch.nn.Sequential(
                torch.nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(32),
                torch.nn.ReLU()
            )
            self.conv_dw2 = self.conv_dw(32, 32, 1)
            self.conv_dw3 = self.conv_dw(32, 64, 2)
            self.conv_dw4 = self.conv_dw(64, 64, 1)
            self.conv_dw5 = self.conv_dw(64, 128, 2)
            self.conv_dw6 = self.conv_dw(128, 128, 1)
            self.conv_dw7 = self.conv_dw(128, 256, 2)
            self.conv_dw8 = self.conv_dw(256, 256, 1)
            self.conv_dw9 = self.conv_dw(256, 512, 2)
            self.fc = torch.nn.Linear(512, 10)

        def forward(self, x):
            out = self.conv1(x)
            out = self.conv_dw2(out)
            out = self.conv_dw3(out)
            out = self.conv_dw4(out)
            out = self.conv_dw5(out)
            out = self.conv_dw6(out)
            out = self.conv_dw7(out)
            out = self.conv_dw8(out)
            out = self.conv_dw9(out)
            out = F.avg_pool2d(out, 2)
            out = out.view(out.size()[0], -1)
            out = self.fc(out)
            out = F.log_softmax(out, dim=1)
            return out

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    epoch_num = 200
    lr = 0.01
    net = MobileNet().to(device)
    loss_func = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    # every 5 epochs, decay the learning rate to 0.9x its previous value
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.9)
    for epoch in range(epoch_num):
        net.train()
        for i, (images, labels) in enumerate(train_loader):
            images, labels = images.to(device), labels.to(device)
            outputs = net(images)
            loss = loss_func(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            _, pred = torch.max(outputs, dim=1)
            correct = pred.eq(labels.data).cpu().sum()
            print("epoch is ", epoch, "step ", i, "loss is: ", loss.item(),
                  "mini-batch correct is: ", (100.0 * correct / 64).item())
```
- The InceptionNet architecture
For details on InceptionNet, see the InceptionNet section of Tensorflow深度學習算法整理. InceptionNet combines parallel branches with serial stacking; by widening the network it improves performance.
```python
import torch
import torch.nn.functional as F
import torchvision.datasets as dataset
import torchvision.transforms as transforms
import torch.utils.data as data_utils

if __name__ == '__main__':
    train_data = dataset.CIFAR10(root='cifa', train=True, transform=transforms.ToTensor(), download=True)
    test_data = dataset.CIFAR10(root='cifa', train=False, transform=transforms.ToTensor(), download=False)
    train_loader = data_utils.DataLoader(dataset=train_data, batch_size=64, shuffle=True)
    test_loader = data_utils.DataLoader(dataset=test_data, batch_size=64, shuffle=True)

    class BaseInception(torch.nn.Module):
        def ConvBNRelu(self, in_channel, out_channel, kernel_size):
            return torch.nn.Sequential(
                torch.nn.Conv2d(in_channel, out_channel, kernel_size=kernel_size,
                                stride=1, padding=kernel_size // 2),
                torch.nn.BatchNorm2d(out_channel),
                torch.nn.ReLU()
            )

        def __init__(self, in_channel, out_channel_list, reduce_channel_list):
            super(BaseInception, self).__init__()
            self.branch1_conv = self.ConvBNRelu(in_channel, out_channel_list[0], 1)
            self.branch2_conv1 = self.ConvBNRelu(in_channel, reduce_channel_list[0], 1)
            self.branch2_conv2 = self.ConvBNRelu(reduce_channel_list[0], out_channel_list[1], 3)
            self.branch3_conv1 = self.ConvBNRelu(in_channel, reduce_channel_list[1], 1)
            self.branch3_conv2 = self.ConvBNRelu(reduce_channel_list[1], out_channel_list[2], 5)
            self.branch4_pool = torch.nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
            self.branch4_conv = self.ConvBNRelu(in_channel, out_channel_list[3], 3)

        def forward(self, x):
            out1 = self.branch1_conv(x)
            out2 = self.branch2_conv1(x)
            out2 = self.branch2_conv2(out2)
            out3 = self.branch3_conv1(x)
            out3 = self.branch3_conv2(out3)
            out4 = self.branch4_pool(x)
            out4 = self.branch4_conv(out4)
            out = torch.cat([out1, out2, out3, out4], dim=1)
            return out

    class InceptionNet(torch.nn.Module):
        def __init__(self):
            super(InceptionNet, self).__init__()
            self.block1 = torch.nn.Sequential(
                torch.nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(64),
                torch.nn.ReLU()
            )
            self.block2 = torch.nn.Sequential(
                torch.nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),
                torch.nn.BatchNorm2d(128),
                torch.nn.ReLU()
            )
            self.block3 = torch.nn.Sequential(
                BaseInception(in_channel=128, out_channel_list=[64, 64, 64, 64],
                              reduce_channel_list=[16, 16]),
                torch.nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
            )
            self.block4 = torch.nn.Sequential(
                BaseInception(in_channel=256, out_channel_list=[96, 96, 96, 96],
                              reduce_channel_list=[32, 32]),
                torch.nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
            )
            self.fc = torch.nn.Linear(1536, 10)

        def forward(self, x):
            out = self.block1(x)
            out = self.block2(out)
            out = self.block3(out)
            out = self.block4(out)
            out = F.avg_pool2d(out, 2)
            out = out.view(out.size()[0], -1)
            out = self.fc(out)
            out = F.log_softmax(out, dim=1)
            return out

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    epoch_num = 200
    lr = 0.01
    net = InceptionNet().to(device)
    loss_func = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    # every 5 epochs, decay the learning rate to 0.9x its previous value
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.9)
    for epoch in range(epoch_num):
        net.train()
        for i, (images, labels) in enumerate(train_loader):
            images, labels = images.to(device), labels.to(device)
            outputs = net(images)
            loss = loss_func(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            _, pred = torch.max(outputs, dim=1)
            correct = pred.eq(labels.data).cpu().sum()
            print("epoch is ", epoch, "step ", i, "loss is: ", loss.item(),
                  "mini-batch correct is: ", (100.0 * correct / 64).item())
```
- Using a pretrained model
Here we take ResNet18 as an example. With a pretrained model we do not have to build the network ourselves, and as long as the model does not need pruning, starting from pretrained weights speeds up development.
```python
import torch
import torchvision.datasets as dataset
import torchvision.transforms as transforms
import torch.utils.data as data_utils
from torchvision import models

if __name__ == '__main__':
    train_data = dataset.CIFAR10(root='cifa', train=True, transform=transforms.ToTensor(), download=True)
    test_data = dataset.CIFAR10(root='cifa', train=False, transform=transforms.ToTensor(), download=False)
    train_loader = data_utils.DataLoader(dataset=train_data, batch_size=64, shuffle=True)
    test_loader = data_utils.DataLoader(dataset=test_data, batch_size=64, shuffle=True)

    class ResNet18(torch.nn.Module):
        def __init__(self):
            super(ResNet18, self).__init__()
            # load the pretrained model
            self.model = models.resnet18(pretrained=True)
            self.num_features = self.model.fc.in_features
            self.model.fc = torch.nn.Linear(self.num_features, 10)

        def forward(self, x):
            out = self.model(x)
            return out

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    epoch_num = 200
    lr = 0.01
    net = ResNet18().to(device)
    loss_func = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    # every 5 epochs, decay the learning rate to 0.9x its previous value
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.9)
    for epoch in range(epoch_num):
        net.train()
        for i, (images, labels) in enumerate(train_loader):
            images, labels = images.to(device), labels.to(device)
            outputs = net(images)
            loss = loss_func(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            _, pred = torch.max(outputs, dim=1)
            correct = pred.eq(labels.data).cpu().sum()
            print("epoch is ", epoch, "step ", i, "loss is: ", loss.item(),
                  "mini-batch correct is: ", (100.0 * correct / 64).item())
```
Image Augmentation
Image augmentation is the preprocessing applied to images before they enter the network. In the figure from the original post (omitted here), the leftmost image is the original and the four images to its right are generated from it. This serves two purposes: it enlarges the dataset (here four-fold), and it covers variations that may occur in real scenes but are absent from the original data. A model trained on these transformed images becomes more robust to the varied images it will meet in practice.
The figure above showed some common methods; in practice there are many more. The leftmost image is the original, and the rest are:
- Rotation: random rotation. A photo may be taken with the camera tilted; training on rotated copies lets the model still recognize the content of tilted images.
- Blur: the lens may be fogged up when a photo is taken; blurring keeps the model robust to such images.
- Contrast: random contrast adjustment. Different people prefer different contrast settings, some more vivid, some darker.
- Scaling: simulates images taken at different distances, so the model can handle subjects both near and far.
- Illumination: different exposure, so the model can recognize objects under different lighting conditions.
- Projective: perspective transform, which warps the image to simulate shooting from different angles, so the model can cope with those views as well.
```python
import torchvision.transforms as transforms
from PIL import Image

if __name__ == '__main__':
    trans = transforms.Compose([
        transforms.ToTensor(),  # converts to float32 and scales to [0, 1]
        transforms.RandomRotation(45),  # random rotation
        transforms.RandomAffine(45),  # random affine transform
        # standardization: the first tuple holds the per-channel (r, g, b) means,
        # the second the per-channel (r, g, b) standard deviations
        transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
    ])
    unloader = transforms.ToPILImage()
    image = Image.open("/Users/admin/Documents/444.jpeg")
    print(image)
    image.show()
    image_out = trans(image)
    image = unloader(image_out)
    print(image_out.size())
    image.show()
```
Output
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1896x1279 at 0x7F801385F750>
torch.Size([3, 1279, 1896])
GANs
For the fundamentals of GANs, see Tensorflow深度學習算法整理(三).
Here we implement a CycleGAN. The dataset can be downloaded from https://people.eecs.berkeley.edu/~taesung_park/CycleGAN/datasets/ ; we train on the apple2orange.zip dataset.
First, the dataset loader:
```python
import glob
import random
import os
from torch.utils.data import Dataset, DataLoader
from PIL import Image
import torchvision.transforms as transforms

class ImageDataset(Dataset):
    def __init__(self, root='', transform=None, model='train'):
        self.transform = transforms.Compose(transform)
        self.pathA = os.path.join(root, model + "A/*")
        self.pathB = os.path.join(root, model + "B/*")
        self.list_A = glob.glob(self.pathA)
        self.list_B = glob.glob(self.pathB)

    def __getitem__(self, index):
        im_pathA = self.list_A[index % len(self.list_A)]
        im_pathB = random.choice(self.list_B)
        im_A = Image.open(im_pathA)
        im_B = Image.open(im_pathB)
        item_A = self.transform(im_A)
        item_B = self.transform(im_B)
        return {"A": item_A, "B": item_B}

    def __len__(self):
        return max(len(self.list_A), len(self.list_B))

if __name__ == '__main__':
    root = "/Users/admin/Downloads/apple2orange"
    transform_ = [transforms.Resize(256, Image.BILINEAR), transforms.ToTensor()]
    dataloader = DataLoader(dataset=ImageDataset(root, transform_, 'train'),
                            batch_size=1, shuffle=True, num_workers=1)
    for i, batch in enumerate(dataloader):
        print(i)
        print(batch)
```
The generator and discriminator models:
```python
import torch
import torch.nn.functional as F

class ResBlock(torch.nn.Module):
    def __init__(self, in_channel):
        super(ResBlock, self).__init__()
        self.conv_block = torch.nn.Sequential(
            # pad the border by reflection rather than with zeros
            torch.nn.ReflectionPad2d(1),
            torch.nn.Conv2d(in_channel, in_channel, kernel_size=3),
            # instance norm: normalize within each single channel
            torch.nn.InstanceNorm2d(in_channel),
            torch.nn.ReLU(inplace=True),
            torch.nn.ReflectionPad2d(1),
            torch.nn.Conv2d(in_channel, in_channel, kernel_size=3),
            torch.nn.InstanceNorm2d(in_channel)
        )

    def forward(self, x):
        return x + self.conv_block(x)

class Generator(torch.nn.Module):
    '''Generator'''
    def __init__(self):
        super(Generator, self).__init__()
        net = [
            torch.nn.ReflectionPad2d(3),
            torch.nn.Conv2d(3, 64, kernel_size=7),
            torch.nn.InstanceNorm2d(64),
            torch.nn.ReLU(inplace=True)
        ]
        in_channel = 64
        out_channel = in_channel * 2
        # downsample twice
        for _ in range(2):
            net += [
                torch.nn.Conv2d(in_channel, out_channel, kernel_size=3, stride=2, padding=1),
                torch.nn.InstanceNorm2d(out_channel),
                torch.nn.ReLU(inplace=True)
            ]
            in_channel = out_channel
            out_channel = in_channel * 2
        # 9 residual blocks
        for _ in range(9):
            net += [ResBlock(in_channel)]
        # upsample twice
        out_channel = in_channel // 2
        for _ in range(2):
            net += [
                torch.nn.ConvTranspose2d(in_channel, out_channel, kernel_size=3, stride=2,
                                         padding=1, output_padding=1),
                torch.nn.InstanceNorm2d(out_channel),
                torch.nn.ReLU(inplace=True)
            ]
            in_channel = out_channel
            out_channel = in_channel // 2
        # output layer
        net += [
            torch.nn.ReflectionPad2d(3),
            torch.nn.Conv2d(in_channel, 3, kernel_size=7),
            torch.nn.Tanh()
        ]
        self.model = torch.nn.Sequential(*net)

    def forward(self, x):
        return self.model(x)

class Discriminator(torch.nn.Module):
    '''Discriminator'''
    def __init__(self):
        super(Discriminator, self).__init__()
        model = [
            torch.nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1),
            torch.nn.LeakyReLU(0.2, inplace=True)
        ]
        model += [
            torch.nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),
            torch.nn.InstanceNorm2d(128),
            torch.nn.LeakyReLU(0.2, inplace=True)
        ]
        model += [
            torch.nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1),
            torch.nn.InstanceNorm2d(256),
            torch.nn.LeakyReLU(0.2, inplace=True)
        ]
        model += [
            torch.nn.Conv2d(256, 512, kernel_size=4, stride=2, padding=1),
            torch.nn.InstanceNorm2d(512),
            torch.nn.LeakyReLU(0.2, inplace=True)
        ]
        model += [torch.nn.Conv2d(512, 1, kernel_size=4, padding=1)]
        self.model = torch.nn.Sequential(*model)

    def forward(self, x):
        x = self.model(x)
        return F.avg_pool2d(x, x.size()[2:]).view(x.size()[0], -1)

if __name__ == '__main__':
    G = Generator()
    D = Discriminator()
    input_tensor = torch.ones((1, 3, 256, 256), dtype=torch.float)
    out = G(input_tensor)
    print(out.size())
    out = D(input_tensor)
    print(out.size())
```
Output
torch.Size([1, 3, 256, 256])
torch.Size([1, 1])
Utility classes and functions
```python
import random
import torch
import numpy as np

def tensor2image(tensor):
    image = 127.5 * (tensor[0].cpu().float().numpy() + 1.0)
    if image.shape[0] == 1:
        image = np.tile(image, (3, 1, 1))
    return image.astype(np.uint8)

class ReplayBuffer():
    def __init__(self, max_size=50):
        assert (max_size > 0), "Empty buffer or trying to create a black hole. Be careful."
        self.max_size = max_size
        self.data = []

    def push_and_pop(self, data):
        to_return = []
        for element in data.data:
            element = torch.unsqueeze(element, 0)
            if len(self.data) < self.max_size:
                self.data.append(element)
                to_return.append(element)
            else:
                if random.uniform(0, 1) > 0.5:
                    i = random.randint(0, self.max_size - 1)
                    to_return.append(self.data[i].clone())
                    self.data[i] = element
                else:
                    to_return.append(element)
        return torch.cat(to_return)

class LambdaLR():
    def __init__(self, n_epochs, offset, decay_start_epoch):
        assert ((n_epochs - decay_start_epoch) > 0), "Decay must start before the training session ends!"
        self.n_epochs = n_epochs
        self.offset = offset
        self.decay_start_epoch = decay_start_epoch

    def step(self, epoch):
        return 1.0 - max(0, epoch + self.offset - self.decay_start_epoch) / (self.n_epochs - self.decay_start_epoch)

def weights_init_normal(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        torch.nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find('BatchNorm2d') != -1:
        torch.nn.init.normal_(m.weight.data, 1.0, 0.02)
        torch.nn.init.constant_(m.bias.data, 0.0)
```
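To make push_and_pop concrete, here is a small standalone demo; the ReplayBuffer class is repeated from above so the snippet runs on its own. The buffer keeps up to max_size previously generated fakes and, once full, randomly swaps incoming fakes with stored ones, which is the image-history trick CycleGAN uses to stabilize discriminator training.

```python
import random
import torch

class ReplayBuffer:
    # same logic as the ReplayBuffer above, copied for self-containment
    def __init__(self, max_size=50):
        self.max_size = max_size
        self.data = []

    def push_and_pop(self, data):
        to_return = []
        for element in data.data:
            element = torch.unsqueeze(element, 0)
            if len(self.data) < self.max_size:
                self.data.append(element)       # buffer not full: store and pass through
                to_return.append(element)
            elif random.uniform(0, 1) > 0.5:
                i = random.randint(0, self.max_size - 1)
                to_return.append(self.data[i].clone())  # emit an old fake
                self.data[i] = element                  # store the new one
            else:
                to_return.append(element)       # pass through unchanged
        return torch.cat(to_return)

buffer = ReplayBuffer(max_size=2)
batch = torch.ones(4, 3, 8, 8)  # stand-in for a batch of generated images
out = buffer.push_and_pop(batch)
print(out.shape)         # the returned batch keeps the input shape
print(len(buffer.data))  # the buffer never exceeds max_size
```

So the discriminator sometimes sees fakes from earlier generator states rather than only the freshest ones, which damps oscillation between the two networks.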
Model training
```python
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from PIL import Image
import torch
from pytorch.gan.models import Generator, Discriminator
from pytorch.gan.utils import ReplayBuffer, LambdaLR, weights_init_normal
from pytorch.gan.dataset import ImageDataset
import itertools
import tensorboardX
import os

if __name__ == '__main__':
    os.environ["OMP_NUM_THREADS"] = "1"
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    batchsize = 1
    size = 256
    lr = 0.0002
    n_epoch = 200
    epoch = 0
    decay_epoch = 100
    netG_A2B = Generator().to(device)
    netG_B2A = Generator().to(device)
    netD_A = Discriminator().to(device)
    netD_B = Discriminator().to(device)
    loss_GAN = torch.nn.MSELoss()
    loss_Cycle = torch.nn.L1Loss()
    loss_identity = torch.nn.L1Loss()
    opt_G = torch.optim.Adam(itertools.chain(netG_A2B.parameters(), netG_B2A.parameters()),
                             lr=lr, betas=(0.5, 0.9999))
    opt_DA = torch.optim.Adam(netD_A.parameters(), lr=lr, betas=(0.5, 0.9999))
    opt_DB = torch.optim.Adam(netD_B.parameters(), lr=lr, betas=(0.5, 0.9999))
    lr_scheduler_G = torch.optim.lr_scheduler.LambdaLR(opt_G, lr_lambda=LambdaLR(n_epoch, epoch, decay_epoch).step)
    lr_scheduler_DA = torch.optim.lr_scheduler.LambdaLR(opt_DA, lr_lambda=LambdaLR(n_epoch, epoch, decay_epoch).step)
    lr_scheduler_DB = torch.optim.lr_scheduler.LambdaLR(opt_DB, lr_lambda=LambdaLR(n_epoch, epoch, decay_epoch).step)
    data_root = "/Users/admin/Downloads/apple2orange"
    input_A = torch.ones([1, 3, size, size], dtype=torch.float).to(device)
    input_B = torch.ones([1, 3, size, size], dtype=torch.float).to(device)
    label_real = torch.ones([1], dtype=torch.float, requires_grad=False).to(device)
    label_fake = torch.zeros([1], dtype=torch.float, requires_grad=False).to(device)
    fake_A_buffer = ReplayBuffer()
    fake_B_buffer = ReplayBuffer()
    log_path = "logs"
    writer_log = tensorboardX.SummaryWriter(log_path)
    transforms_ = [
        transforms.Resize(int(256 * 1.12), Image.BICUBIC),
        transforms.RandomCrop(256),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ]
    dataloader = DataLoader(dataset=ImageDataset(data_root, transforms_),
                            batch_size=batchsize, shuffle=True, num_workers=1)
    step = 0
    for epoch in range(n_epoch):
        for i, batch in enumerate(dataloader):
            real_A = torch.tensor(input_A.copy_(batch['A']), dtype=torch.float).to(device)
            real_B = torch.tensor(input_B.copy_(batch['B']), dtype=torch.float).to(device)

            # generator update
            opt_G.zero_grad()
            same_B = netG_A2B(real_B)
            loss_identity_B = loss_identity(same_B, real_B) * 5.0
            same_A = netG_B2A(real_A)
            loss_identity_A = loss_identity(same_A, real_A) * 5.0
            fake_B = netG_A2B(real_A)
            pred_fake = netD_B(fake_B)
            loss_GAN_A2B = loss_GAN(pred_fake, label_real)
            fake_A = netG_B2A(real_B)
            pred_fake = netD_A(fake_A)
            loss_GAN_B2A = loss_GAN(pred_fake, label_real)
            recovered_A = netG_B2A(fake_B)
            loss_cycle_ABA = loss_Cycle(recovered_A, real_A) * 10.0
            recovered_B = netG_A2B(fake_A)
            loss_cycle_BAB = loss_Cycle(recovered_B, real_B) * 10.0
            loss_G = loss_identity_A + loss_identity_B + loss_GAN_A2B + loss_GAN_B2A + \
                     loss_cycle_ABA + loss_cycle_BAB
            loss_G.backward()
            opt_G.step()

            # discriminator A update
            opt_DA.zero_grad()
            pred_real = netD_A(real_A)
            loss_D_real = loss_GAN(pred_real, label_real)
            fake_A = fake_A_buffer.push_and_pop(fake_A)
            pred_fake = netD_A(fake_A.detach())
            loss_D_fake = loss_GAN(pred_fake, label_fake)
            loss_D_A = (loss_D_real + loss_D_fake) * 0.5
            loss_D_A.backward()
            opt_DA.step()

            # discriminator B update
            opt_DB.zero_grad()
            pred_real = netD_B(real_B)
            loss_D_real = loss_GAN(pred_real, label_real)
            fake_B = fake_B_buffer.push_and_pop(fake_B)
            pred_fake = netD_B(fake_B.detach())
            loss_D_fake = loss_GAN(pred_fake, label_fake)
            loss_D_B = (loss_D_real + loss_D_fake) * 0.5
            loss_D_B.backward()
            opt_DB.step()

            print("loss_G:{}, loss_G_identity:{}, loss_G_GAN:{}, "
                  "loss_G_cycle:{}, loss_D_A:{}, loss_D_B:{}".format(
                      loss_G, loss_identity_A + loss_identity_B,
                      loss_GAN_A2B + loss_GAN_B2A,
                      loss_cycle_ABA + loss_cycle_BAB, loss_D_A, loss_D_B))
            writer_log.add_scalar("loss_G", loss_G, global_step=step + 1)
            writer_log.add_scalar("loss_G_identity", loss_identity_A + loss_identity_B, global_step=step + 1)
            writer_log.add_scalar("loss_G_GAN", loss_GAN_A2B + loss_GAN_B2A, global_step=step + 1)
            writer_log.add_scalar("loss_G_cycle", loss_cycle_ABA + loss_cycle_BAB, global_step=step + 1)
            writer_log.add_scalar("loss_D_A", loss_D_A, global_step=step + 1)
            writer_log.add_scalar("loss_D_B", loss_D_B, global_step=step + 1)
            step += 1
        lr_scheduler_G.step()
        lr_scheduler_DA.step()
        lr_scheduler_DB.step()
        torch.save(netG_A2B.state_dict(), "models/netG_A2B.pth")
        torch.save(netG_B2A.state_dict(), "models/netG_B2A.pth")
        torch.save(netD_A.state_dict(), "models/netD_A.pth")
        torch.save(netD_B.state_dict(), "models/netD_B.pth")
```
Model Development and Deployment
AI development and deployment generally split into a training platform and a deployment platform. Training mostly runs on NVIDIA CUDA; deployment targets either servers or end devices.
- Edge AI inference chips power computer-like devices such as phones, cars, cameras, IoT and many other embedded devices
- NVIDIA: CUDA GPUs, plus the embedded JETSON line
- Intel: Movidius VPU (NCS2)
- Apple: the NPU in the A12 processor (and later)
- Qualcomm: Snapdragon processors
- Huawei: Kirin processors (Da Vinci architecture)
- Edge-side AI inference frameworks
- Desktop: PyTorch, TensorFlow
- iOS: Apple's CoreML, the PyTorch library, etc.
- Android: the TFLite framework, the PyTorch library, the NCNN library, etc.
- Intel NCS: Intel's NCSDK
- NVIDIA embedded devices: TensorRT
- Deploying PyTorch models on end devices
- PyTorch's official C++ package is called LibTorch
- iOS: PyTorch->ONNX->CoreML->iOS
- Android: PyTorch->ONNX->ncnn->Android, or PyTorch->ONNX->tensorflow->Android
ONNX
ONNX (Open Neural Network Exchange) is a standard format for representing deep learning models that allows models to be transferred between frameworks. Deep learning frameworks that can load ONNX models and run inference on them include Caffe2, PyTorch, MXNet, ML.NET, TensorRT and Microsoft CNTK; TensorFlow supports it unofficially.
Visualization tool: netron
pip install netron
- Converting PyTorch to ONNX
pip install cython protobuf numpy
sudo apt-get install libprotobuf-dev protobuf-compiler
pip install onnx
- How to export ONNX correctly
- Wherever a value returned by shape or size is used, e.g. tensor.view(tensor.size(0), -1), avoid using the return value of tensor.size directly; wrap it in int(): tensor.view(int(tensor.size(0)), -1)
- For nn.Upsample or the nn.functional.interpolate function, specify the ratio with scale_factor rather than the size parameter
- In reshape and view operations, put the -1 on the batch dimension and compute the other dimensions explicitly; never hard-code the batch dimension to a number greater than -1
- Pass the dynamic_axes argument to torch.onnx.export and mark only the batch dimension as dynamic, not the others; we only need a dynamic batch, and there are other solutions for dynamic width and height
These practices matter because they simplify the exported graph and eliminate Gather/Shape nodes. Skipping them often appears to work at first, but once the requirements become more complex, all sorts of problems surface.
Deploying YOLOv5
YOLOv5 on GitHub: https://github.com/ultralytics/YOLOV5
After pulling the code, enter the yolov5-master folder and run
python export.py --include=onnx
Two new files now appear in the folder: yolov5s.onnx and yolov5s.pt
Next, run
(base) admindeMBP:yolov5-master admin$ netron
Serving at http://localhost:8080
Open http://127.0.0.1:8080/ in a browser and load yolov5s.onnx to get a visualization of the model
Now look at yolo.py in the models folder. Since, as noted above, any use of a shape/size return value should be wrapped in int() rather than used directly, we make the following changes.
Change line 53 to
bs, _, ny, nx = map(int, x[i].shape)
Since the -1 in reshape and view operations should go on the batch dimension while the other dimensions are computed explicitly, change line 68 to
z.append(y.view(-1, int(y.size(1) * y.size(2) * y.size(3)), self.no))
Run the export again
python export.py --include=onnx
Open yolov5s.onnx in the visualizer again and click the Reshape node to inspect it.
TensorRT
TensorRT is NVIDIA's inference framework for end-side deployment. It optimizes and accelerates models specifically for NVIDIA hardware, making maximal use of the GPU to boost inference performance.
Framework download: https://github.com/shouxieai/tensorRT_Pro
Let's first write a simple PyTorch model and export it to ONNX.
```python
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv = nn.Conv2d(1, 1, 3, stride=1, padding=1, bias=True)
        self.conv.weight.data.fill_(0.3)
        self.conv.bias.data.fill_(0.2)

    def forward(self, x):
        x = self.conv(x)
        return x.view(x.size(0), -1)

if __name__ == '__main__':
    model = Model().eval()
    x = torch.full((1, 1, 3, 3), 1.0)
    y = model(x)
    torch.onnx.export(model, (x,), "onnx1.onnx", verbose=True)
```
After the program runs, it produces a file named onnx1.onnx, which we can open with netron to inspect.