『PyTorch Notes 3』 PyTorch broadcasting, merging and splitting, math operations, statistics, and advanced operations

1. Broadcasting

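Note: broadcasting expands dimensions the same way expand does, but it happens automatically and copies no data, so it saves memory. The key idea:

Why broadcasting exists: (1) it meets a real need for expansion; (2) it saves memory. When a tensor has fewer dimensions than the other operand, dimensions of size 1 are first inserted at the front, and then every size-1 dimension is expanded to the matching size.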

import torch

a = torch.rand(4, 32, 14, 14)
b = torch.rand(1, 32, 1, 1)
c = torch.rand(32, 1, 1)

# b: [1, 32, 1, 1] => [4, 32, 14, 14]
print((a + b).shape)
# c: [32, 1, 1] => [1, 32, 1, 1] => [4, 32, 14, 14]
print((a + c).shape)

Output:

torch.Size([4, 32, 14, 14])
torch.Size([4, 32, 14, 14])
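
The two steps described above (pad with leading size-1 dimensions, then expand every size-1 dimension) can be sketched by hand. `broadcast_shape` is a hypothetical helper written for this note, not a PyTorch function. The last lines check that `expand` returns a view on the same storage, on the assumption that a stride of 0 marks an expanded dimension.

```python
import torch


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the broadcasting rule by hand."""
    ndim = max(len(shape_a), len(shape_b))
    # Step 1: pad the shorter shape with leading size-1 dimensions.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    out = []
    for da, db in zip(a, b):
        # Step 2: sizes must be equal, or one of them must be 1.
        if da != db and da != 1 and db != 1:
            raise ValueError(f"cannot broadcast {tuple(shape_a)} with {tuple(shape_b)}")
        out.append(db if da == 1 else da)
    return tuple(out)


a = torch.rand(4, 32, 14, 14)
c = torch.rand(32, 1, 1)

manual = broadcast_shape(a.shape, c.shape)
print(manual)                  # (4, 32, 14, 14)
print(tuple((a + c).shape))    # (4, 32, 14, 14)

# expand returns a view: same storage, stride 0 on the expanded dimensions.
c_expanded = c.expand(4, 32, 14, 14)
print(c_expanded.data_ptr() == c.data_ptr())   # True
print(c_expanded.stride())
```

Shapes that fail step 2, such as (3,) against (4,), raise a ValueError in this sketch, just as `a + b` would raise an error in PyTorch.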

2. Merging and Splitting (merge or split)

2.1. cat: concatenation

import torch

# a holds 4 classes and b holds 5 classes; each class has 32 students with 8 scores.
a = torch.rand(4, 32, 8)
b = torch.rand(5, 32, 8)
# Merge them along the class dimension.
print(torch.cat([a, b], dim=0).shape)

Output:

torch.Size([9, 32, 8])

import torch

a1 = torch.rand(4, 3, 32, 32)
a2 = torch.rand(5, 3, 32, 32)
print(torch.cat([a1, a2], dim=0).shape)
print('====================================')
a3 = torch.rand(4, 1, 32, 32)
# print(torch.cat([a1, a3], dim=0))      # Raises an error: dim 1 differs (3 vs 1).
print(torch.cat([a1, a3], dim=1).shape)

Output:

torch.Size([9, 3, 32, 32])
====================================
torch.Size([4, 4, 32, 32])
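
The rule behind the failing call is that cat requires every dimension except the one being concatenated to match. A minimal sketch of that check follows; `can_cat` is a hypothetical helper for illustration, not part of PyTorch.

```python
import torch


def can_cat(x, y, dim):
    """Hypothetical helper: True if x and y agree on every dimension except `dim`."""
    if x.dim() != y.dim():
        return False
    return all(sx == sy
               for i, (sx, sy) in enumerate(zip(x.shape, y.shape))
               if i != dim)


a1 = torch.rand(4, 3, 32, 32)
a3 = torch.rand(4, 1, 32, 32)

print(can_cat(a1, a3, dim=0))   # False: dim 1 differs (3 vs 1)
print(can_cat(a1, a3, dim=1))   # True: only dim 1 differs, and that is the cat dim

# The real call fails in the same case.
try:
    torch.cat([a1, a3], dim=0)
    cat_failed = False
except RuntimeError as err:
    cat_failed = True
    print("cat failed:", type(err).__name__)
```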

2.2. stack: creating a new dimension

import torch

a1 = torch.rand(4, 3, 32, 32)
a2 = torch.rand(4, 3, 32, 32)
print(torch.cat([a1, a2], dim=1).shape)
print('====================================')
print(torch.stack([a1, a2], dim=1).shape)    # insert a new dimension in each tensor, then concatenate along it

a = torch.rand(32, 8)
b = torch.rand(32, 8)
print(torch.stack([a, b], dim=0).shape)

Output:

torch.Size([4, 6, 32, 32])
====================================
torch.Size([4, 2, 3, 32, 32])
torch.Size([2, 32, 8])

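Note: as a concrete use case, suppose classes a and b each have 60 students with 8 scores, so each is shaped [60, 8], and we want to combine the two score tables. cat gives [120, 8], which loses the boundary between the classes and is clearly not what we want. stack gives [2, 60, 8], which keeps one slice per class. This also shows that stack requires all input tensors to have exactly the same shape.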
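
One way to read stack is as "unsqueeze each tensor, then cat". The sketch below checks that equivalence on the two-class example; the variable names are made up for illustration.

```python
import torch

a = torch.rand(60, 8)   # class a: 60 students, 8 scores
b = torch.rand(60, 8)   # class b: same shape, as stack requires

stacked = torch.stack([a, b], dim=0)
manual = torch.cat([a.unsqueeze(0), b.unsqueeze(0)], dim=0)

print(stacked.shape)                   # torch.Size([2, 60, 8])
print(torch.equal(stacked, manual))    # True
print(torch.equal(stacked[0], a))      # True: index 0 on the new dimension is class a
print(torch.cat([a, b], dim=0).shape)  # torch.Size([120, 8]): class boundary is lost
```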

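2.3. split by length and chunk by count

Note: .split(length, dim): the first argument is the length of each piece after splitting (or a list with one length per piece), and the second argument is the dimension to split along.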

import torch

c = torch.rand(2, 32, 8)
aa, bb = c.split([1, 1], dim=0)
print(aa.shape)
print(bb.shape)
print('====================================')
aa, bb = c.split(1, dim=0)
print(aa.shape)
print(bb.shape)

Output:

torch.Size([1, 32, 8])
torch.Size([1, 32, 8])
====================================
torch.Size([1, 32, 8])
torch.Size([1, 32, 8])

Note: .chunk(count, dim): the first argument is the number of pieces to produce, and the second argument is the dimension to split along.

import torch

c = torch.rand(8, 32, 8)
aa, bb = c.chunk(2, dim=0)  # the first argument is the number of pieces
print(aa.shape)
print(bb.shape)

Output:

torch.Size([4, 32, 8])
torch.Size([4, 32, 8])
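
The difference between the two calls is easiest to see when the dimension does not divide evenly. The following sketch uses a made-up tensor with 5 rows and prints only the piece sizes along dim 0.

```python
import torch

c = torch.rand(5, 32, 8)

# split with an int: every piece has length 2, and the last piece takes the remainder.
split_int = [t.shape[0] for t in c.split(2, dim=0)]
print(split_int)     # [2, 2, 1]

# split with a list: the lengths must add up to the size of the dimension.
split_list = [t.shape[0] for t in c.split([1, 4], dim=0)]
print(split_list)    # [1, 4]

# chunk with a count: 5 rows in 2 chunks gives sizes 3 and 2.
chunk_two = [t.shape[0] for t in c.chunk(2, dim=0)]
print(chunk_two)     # [3, 2]

# chunk can return fewer pieces than requested: 4 requested, 3 returned.
chunk_four = [t.shape[0] for t in c.chunk(4, dim=0)]
print(chunk_four)    # [2, 2, 1]
```

So split fixes the piece length and lets the count follow, while chunk fixes the target count and derives the length by rounding up.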

3. Math Operations

3.1. add/sub/mul/div: addition, subtraction, multiplication, division

import torch

a = torch.rand(3, 4)
b = torch.rand(4)

print(a+b)
print(torch.add(a, b))

print(torch.all(torch.eq(a-b, torch.sub(a, b))))
print(torch.all(torch.eq(a*b, torch.mul(a, b))))
print(torch.all(torch.eq(a/b, torch.div(a, b))))

Output:

tensor([[0.5039, 1.2329, 1.4820, 0.7634],
        [1.1962, 1.2740, 1.1871, 0.6491],
        [1.0346, 1.0578, 0.9915, 0.7993]])
tensor([[0.5039, 1.2329, 1.4820, 0.7634],
        [1.1962, 1.2740, 1.1871, 0.6491],
        [1.0346, 1.0578, 0.9915, 0.7993]])
tensor(True)
tensor(True)
tensor(True)

3.2. matmul: matrix multiplication

import torch

a = torch.tensor([[3., 3.], [3., 3.]])
b = torch.ones(2, 2)
print(torch.mm(a, b))
print(torch.matmul(a, b))
print(a@b)

Output:

tensor([[6., 6.],
        [6., 6.]])
tensor([[6., 6.],
        [6., 6.]])
tensor([[6., 6.],
        [6., 6.]])

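Note: the multiplication above is for 2D tensors, so how does matmul work on 3D and 4D tensors? In practice a batch of images is usually 4D ([batch, channel, height, width]) and a batch of text is usually 3D ([batch, sequence, feature]), so matrix multiplication has to be defined for these shapes too. The example below shows how.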

import torch

a = torch.rand(4, 3, 28, 64)
b = torch.rand(4, 3, 64, 32)

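# This is matrix multiplication on 4D tensors, and the rule matches how it is used in practice:
# matmul multiplies many pairs of matrices in parallel.
# Only the last two dimensions (the rightmost ones) take part in the product, here [28, 64] @ [64, 32].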
print(torch.matmul(a, b).shape)

print('================================')
c = torch.rand(4, 1, 64, 32) # broadcasting expands the size-1 dimension to match a
print(torch.matmul(a, c).shape)

Output:

torch.Size([4, 3, 28, 32])
================================
torch.Size([4, 3, 28, 32])
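
To make "many matrix pairs multiplied in parallel" concrete, the sketch below rebuilds the 4D result with an explicit loop over the two leading dimensions and compares it with matmul. The loop is only for illustration; the tolerance is an assumption chosen to absorb float32 rounding differences.

```python
import torch

a = torch.rand(4, 3, 28, 64)
b = torch.rand(4, 3, 64, 32)
c = torch.rand(4, 1, 64, 32)

out = torch.matmul(a, b)

# Same result with one 2D product per (i, j) pair of leading indices.
manual = torch.empty(4, 3, 28, 32)
for i in range(4):
    for j in range(3):
        manual[i, j] = torch.mm(a[i, j], b[i, j])
print(torch.allclose(out, manual, atol=1e-4))   # True

# Broadcasting acts on the leading dimensions only: [4, 1] is expanded to [4, 3].
out_bc = torch.matmul(a, c)
out_expanded = torch.matmul(a, c.expand(4, 3, 64, 32))
print(torch.allclose(out_bc, out_expanded, atol=1e-4))   # True
```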

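3.3. pow (element-wise power) and sqrt/rsqrt/exp/log

Note: pow(tensor, exponent): the first argument is the tensor and the second is the exponent, for example 2 for squaring, 3 for cubing, and so on.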

import torch

a = torch.full([2, 2], 3.)  # shape [2, 2], every element 3.0 (a float fill value gives a float tensor)
print(a.pow(2))
print(torch.pow(a, 2))
print(a**2)
print('=============================')
b = a**2
print(b.sqrt())
print(b.rsqrt())         # reciprocal of the square root

Output:

tensor([[9., 9.],
        [9., 9.]])
tensor([[9., 9.],
        [9., 9.]])
tensor([[9., 9.],
        [9., 9.]])
=============================
tensor([[3., 3.],
        [3., 3.]])
tensor([[0.3333, 0.3333],
        [0.3333, 0.3333]])
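
Two points from this example are worth checking directly: rsqrt is 1 / sqrt(x), and pow works element by element, which is not the same as multiplying the matrix by itself. A short sketch with the same values:

```python
import torch

a = torch.full([2, 2], 3.)
b = a ** 2

# rsqrt(x) equals 1 / sqrt(x).
rsqrt_matches = torch.allclose(b.rsqrt(), 1 / b.sqrt())
print(rsqrt_matches)   # True

# Element-wise square: every entry is 3 * 3 = 9.
elementwise = a.pow(2)
print(elementwise)

# Matrix product a @ a: every entry is 3*3 + 3*3 = 18.
matrix_product = a @ a
print(matrix_product)
```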

exp/log

import torch

a = torch.exp(torch.ones(2, 2))
print(a)
print(torch.log(a))     # natural log (base e); use torch.log2 or torch.log10 for base 2 or 10

Output:

tensor([[2.7183, 2.7183],
        [2.7183, 2.7183]])
tensor([[1., 1.],
        [1., 1.]])
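
For bases other than e, PyTorch provides torch.log2 and torch.log10, and any other base can be obtained with the change-of-base formula log_k(x) = ln(x) / ln(k). The sketch below uses made-up inputs and base 3 as an arbitrary example.

```python
import math

import torch

x = torch.tensor([1., 8., 100.])

log2_x = torch.log2(x)     # base 2: log2(8) = 3
log10_x = torch.log10(x)   # base 10: log10(100) = 2
print(log2_x)
print(log10_x)

# Arbitrary base via change of base, here base 3.
y = torch.tensor([1., 3., 9., 27.])
log3_y = torch.log(y) / math.log(3)
print(log3_y)              # approximately 0, 1, 2, 3
```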

3.5. round and other rounding operations

import torch

a = torch.tensor(3.14)
# .floor() rounds down, .ceil() rounds up, .trunc() keeps the integer part, .frac() keeps the fractional part.
print(a.floor(), a.ceil(), a.trunc(), a.frac())

print(a.round())
b = torch.tensor(3.5)
print(b.round())

Output:

tensor(3.) tensor(4.) tensor(3.) tensor(0.1400)
tensor(3.)
tensor(4.)
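
Two details are easy to miss with these functions: round sends exact halves to the nearest even integer, and floor and trunc differ for negative numbers. A small sketch with made-up values:

```python
import torch

# Halves go to the nearest even integer ("round half to even").
halves = torch.tensor([0.5, 1.5, 2.5, 3.5])
rounded = halves.round()
print(rounded)          # tensor([0., 2., 2., 4.])

# For negative numbers floor goes down, trunc goes toward zero.
n = torch.tensor(-3.14)
print(n.floor())        # tensor(-4.)
print(n.trunc())        # tensor(-3.)
print(n.frac())         # about -0.14: frac keeps the sign of the input
```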

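3.6. clamp (clipping), used often

Note: clamp is mainly used for gradient clipping. Training can suffer from vanishing gradients (the gradient is very small, close to 0, so there is nothing to clip) and from exploding gradients (the gradient is very large; 100 already counts as large). When training is unstable, it helps to print the gradient norm: w.grad.norm(2) gives the L2 norm of the gradient (as a rule of thumb, 100 or 1000 is large, and anything under 10 is reasonable).

a.clamp(min): every element of tensor a below min is set to min. For example, a.clamp(10) raises every element below 10 to 10, so the minimum becomes 10.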

import torch

grad = torch.rand(2, 3)*15
print(grad)
print(grad.max(), grad.median(), grad.min())
print('============================================')
print(grad.clamp(10))     # lower bound of 10: every element below 10 becomes 10

print(grad.clamp(8, 15))
print(torch.clamp(grad, 8, 15))

Output:

tensor([[11.0328,  4.9081,  2.3248],
        [11.3747,  3.9017, 11.5049]])
tensor(11.5049) tensor(4.9081) tensor(2.3248)
============================================
tensor([[11.0328, 10.0000, 10.0000],
        [11.3747, 10.0000, 11.5049]])
tensor([[11.0328,  8.0000,  8.0000],
        [11.3747,  8.0000, 11.5049]])
tensor([[11.0328,  8.0000,  8.0000],
        [11.3747,  8.0000, 11.5049]])
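
As a sketch of the gradient-clipping workflow mentioned above, the following prints the gradient norm and then clips it. The weight `w`, the input `x`, and the loss are all made up for illustration and scaled to produce a large gradient. torch.nn.utils.clip_grad_norm_ rescales the whole gradient by its norm, while clamp_ clips each entry separately; which one suits a given model is a judgment call.

```python
import torch

torch.manual_seed(0)
w = torch.randn(10, 5, requires_grad=True)   # hypothetical weight
x = torch.randn(32, 10)                      # made-up input batch
loss = 100 * (x @ w).pow(2).sum()            # made-up loss that yields a large gradient
loss.backward()

norm_before = w.grad.norm(2).item()
print(norm_before)        # far above 10

# Norm-based clipping: rescale the gradient so its L2 norm is at most 10.
torch.nn.utils.clip_grad_norm_([w], max_norm=10.0)
norm_after = w.grad.norm(2).item()
print(norm_after)         # about 10

# Value-based clipping with clamp: force every entry into [-1, 1].
w.grad.clamp_(-1.0, 1.0)
grad_min, grad_max = w.grad.min().item(), w.grad.max().item()
print(grad_min, grad_max)
```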

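4. Statistics

4.1. norm and prod (product of all elements)

Note: for the definitions of vector norms and matrix norms, see my earlier blog post: 2.2. Vector norms and matrix norms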

import torch

a = torch.full([8], 1.)  # float fill value: norm needs a floating-point tensor
b = a.view(2, 4)
c = a.view(2, 2, 2)
print(a, '\n', b,'\n', c)
print('=============================================')
print(a.norm(1), b.norm(1), c.norm(1))
print(a.norm(2), b.norm(2), c.norm(2))
print('=============================================')
print(b.norm(1, dim=1))
print(b.norm(2, dim=1))
print('=============================================')
print(c.norm(1, dim=0))
print(c.norm(2, dim=0))
print(torch.norm(c, p=2, dim=0)) # equivalent call; p=2 may be omitted, since the default norm gives the same result here

Output:

tensor([1., 1., 1., 1., 1., 1., 1., 1.]) 
 tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.]]) 
 tensor([[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]])
=============================================
tensor(8.) tensor(8.) tensor(8.)
tensor(2.8284) tensor(2.8284) tensor(2.8284)
=============================================
tensor([4., 4.])
tensor([2., 2.])
=============================================
tensor([[2., 2.],
        [2., 2.]])
tensor([[1.4142, 1.4142],
        [1.4142, 1.4142]])
tensor([[1.4142, 1.4142],
        [1.4142, 1.4142]])
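
The numbers above follow directly from the definitions: the 1-norm is the sum of absolute values and the 2-norm is the square root of the sum of squares. A minimal sketch that recomputes them by hand on the same tensor:

```python
import torch

a = torch.full([8], 1.)
b = a.view(2, 4)

# 1-norm: sum of absolute values. 2-norm: square root of the sum of squares.
l1_manual = b.abs().sum()
l2_manual = b.pow(2).sum().sqrt()
print(l1_manual, b.norm(1))   # both 8
print(l2_manual, b.norm(2))   # both about 2.8284

# With dim=1 the reduction runs over each row separately.
row_l2_manual = b.pow(2).sum(dim=1).sqrt()
print(row_l2_manual, b.norm(2, dim=1))   # both tensor([2., 2.])
```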
        
4.2. mean/sum/max/min/argmin/argmax

import torch

a = torch.rand(2, 4)
print(a)
print(a.max(), a.min(), a.mean())  # max, min, mean
print(a.prod()) # prod: product of all elements (not a factorial)
print(a.sum())  # sum of all elements

print(a.argmax(), a.argmin())

Output:

tensor([[0.4677, 0.8331, 0.4240, 0.9348],
        [0.0192, 0.2354, 0.9979, 0.0077]])
tensor(0.9979) tensor(0.0077) tensor(0.4900)
tensor(5.3340e-06)
tensor(3.9197)
tensor(6) tensor(7)

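Note: the result shows that, when no dim is given, min/max/argmin/argmax treat the tensor as if it were flattened to 1D. That is why argmax and argmin above return 6 and 7, which are positions in the flattened tensor.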
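
A flat index can be turned back into a (row, column) pair with integer division by the number of columns. The sketch below assumes a contiguous 2D tensor; the variable names are made up.

```python
import torch

a = torch.rand(2, 4)

flat_idx = a.argmax().item()              # position in the flattened tensor
row, col = divmod(flat_idx, a.shape[1])   # row = idx // 4, col = idx % 4
print(flat_idx, (row, col))

# The recovered position holds the maximum value.
print(a[row, col] == a.max())             # tensor(True)
print(a.view(-1)[flat_idx] == a.max())    # tensor(True)
```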

import torch

a = torch.rand(4, 5)
print(a)
print(a.max(dim=1))                 # values and indices both have shape [4]
print('==============================')
# keepdim=True keeps the reduced dimension with size 1.
print(a.max(dim=1, keepdim=True))   # values and indices both have shape [4, 1]

Output:

tensor([[0.0956, 0.1968, 0.2054, 0.3631, 0.5661],
        [0.8228, 0.9709, 0.1276, 0.2207, 0.5825],
        [0.7764, 0.2675, 0.1439, 0.3109, 0.6960],
        [0.7047, 0.5668, 0.3775, 0.6214, 0.0674]])
torch.return_types.max(
values=tensor([0.5661, 0.9709, 0.7764, 0.7047]),
indices=tensor([4, 1, 0, 0]))
==============================
torch.return_types.max(
values=tensor([[0.5661],
        [0.9709],
        [0.7764],
        [0.7047]]),
indices=tensor([[4],
        [1],
        [0],
        [0]]))
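
One reason to ask for keepdim=True is that the [4, 1] result broadcasts cleanly against the original [4, 5] tensor. The sketch below divides each row by its own maximum, which is a made-up use case for illustration.

```python
import torch

a = torch.rand(4, 5)

row_max = a.max(dim=1, keepdim=True).values   # shape [4, 1]
normalized = a / row_max                      # [4, 5] / [4, 1] broadcasts row by row
print(normalized.max(dim=1).values)           # every row now has maximum 1

# Without keepdim the shape is [4], which does not line up with the 5 columns.
try:
    a / a.max(dim=1).values
    failed = False
except RuntimeError:
    failed = True
print(failed)                                 # True
```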

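4.3. kthvalue() and topk()

Note: topk(3, dim=1) returns the 3 largest values along dim 1 together with their indices; setting largest=False returns the smallest ones instead.
Note: kthvalue(k, dim=1) returns the k-th smallest value (it always counts from the smallest). With 10 candidates per row, the 8th smallest is the 3rd largest.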

import torch

a = torch.rand(5, 10)
print(a.topk(3, dim=1))       # the 3 largest elements per row and their indices
print('==========================================================')
print(a.topk(3, dim=1, largest=False))  # the 3 smallest elements per row and their indices
print('==========================================================')
print(a.kthvalue(3))          # 3rd smallest along the last dimension (the default)
print(a.kthvalue(3,dim=1))    # same result with dim given explicitly

Output:

torch.return_types.topk(
values=tensor([[0.9644, 0.8750, 0.8059],
        [0.9445, 0.9039, 0.8314],
        [0.9025, 0.8567, 0.8550],
        [0.9710, 0.8377, 0.8066],
        [0.8984, 0.8439, 0.8386]]),
indices=tensor([[5, 7, 2],
        [3, 6, 4],
        [1, 5, 0],
        [3, 9, 6],
        [3, 8, 2]]))
==========================================================
torch.return_types.topk(
values=tensor([[0.0790, 0.2262, 0.3413],
        [0.1071, 0.1207, 0.1217],
        [0.2904, 0.3274, 0.3424],
        [0.1910, 0.2919, 0.5602],
        [0.2474, 0.2730, 0.6032]]),
indices=tensor([[6, 8, 3],
        [2, 5, 0],
        [2, 4, 3],
        [4, 2, 7],
        [4, 0, 7]]))
==========================================================
torch.return_types.kthvalue(
values=tensor([0.3413, 0.1217, 0.3424, 0.5602, 0.6032]),
indices=tensor([3, 0, 3, 7, 7]))
torch.return_types.kthvalue(
values=tensor([0.3413, 0.1217, 0.3424, 0.5602, 0.6032]),
indices=tensor([3, 0, 3, 7, 7]))
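
The relation between the two functions (with n candidates, the k-th smallest is the (n - k + 1)-th largest) can be checked directly. This sketch uses a fixed seed and assumes there are no ties among the random values.

```python
import torch

torch.manual_seed(0)
a = torch.rand(5, 10)

# With 10 values per row, the 8th smallest is the (10 - 8 + 1) = 3rd largest.
third_largest = a.topk(3, dim=1).values[:, 2]
eighth_smallest = a.kthvalue(8, dim=1).values
print(torch.equal(third_largest, eighth_smallest))   # True

# Both agree with position 7 (0-based) of each row sorted in ascending order.
sorted_rows = a.sort(dim=1).values
print(torch.equal(sorted_rows[:, 7], eighth_smallest))   # True
```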

4.4. Comparison operators >, >=, <, <=, !=, ==

gt (greater than) means strictly greater; ge means greater than or equal. eq means equal.

import torch

a = torch.rand(5, 5)
print(a>0.2)
print(torch.gt(a, 0.2))
print(a!=0)

Output:

tensor([[False,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True],
        [ True,  True,  True,  True, False],
        [ True,  True, False,  True,  True],
        [ True,  True,  True,  True,  True]])
tensor([[False,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True],
        [ True,  True,  True,  True, False],
        [ True,  True, False,  True,  True],
        [ True,  True,  True,  True,  True]])
tensor([[True, True, True, True, True],
        [True, True, True, True, True],
        [True, True, True, True, True],
        [True, True, True, True, True],
        [True, True, True, True, True]])
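
The difference between gt and ge only shows up on the boundary value, so a tiny made-up tensor makes it visible. The same sketch shows two common uses of the boolean result: selecting elements and counting them. Note that torch.eq compares element by element, while torch.equal returns a single Python bool.

```python
import torch

a = torch.tensor([1., 2., 3.])

gt_mask = torch.gt(a, 2)   # strictly greater than 2
ge_mask = torch.ge(a, 2)   # greater than or equal to 2
print(gt_mask)             # tensor([False, False,  True])
print(ge_mask)             # tensor([False,  True,  True])

# A boolean mask can select elements or be summed to count matches.
selected = a[ge_mask]
count = ge_mask.sum().item()
print(selected, count)     # tensor([2., 3.]) 2

# eq is element-wise; equal compares whole tensors.
print(torch.eq(a, a))      # tensor([True, True, True])
print(torch.equal(a, a))   # True
```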
