『PyTorch Notes 3』Broadcasting, Merging and Splitting, Math Operations, Statistics, and Advanced Operations in PyTorch!

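I. Broadcasting

Note: Broadcasting expands dimensions the same way expand does, but it happens automatically and does not copy any data, so it saves memory. The key idea:

Why broadcasting exists: (1) expansion is needed in practice, for example adding a per-channel value to a whole batch; (2) it saves memory. If the smaller tensor is missing leading dimensions, dimensions of size 1 are first inserted at the front, and then every size-1 dimension is expanded to match the other tensor.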

import torch

a = torch.rand(4, 32, 14, 14)
b = torch.rand(1, 32, 1, 1)
c = torch.rand(32, 1, 1)

# b [1, 32, 1, 1]=>[4, 32, 14, 14]
print((a + b).shape)
print((a+c).shape)

Output:

torch.Size([4, 32, 14, 14])
torch.Size([4, 32, 14, 14])

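The two steps described above (insert size-1 dimensions, then expand them) can be reproduced by hand. The following is only a minimal sketch; the variable names c4, c_exp, manual, and auto are chosen here for illustration and do not come from the original notes.

```python
import torch

a = torch.rand(4, 32, 14, 14)
c = torch.rand(32, 1, 1)

# Step 1: prepend a size-1 dimension so c has as many dimensions as a.
c4 = c.unsqueeze(0)                  # shape [1, 32, 1, 1]
# Step 2: expand every size-1 dimension to the size found in a.
c_exp = c4.expand(4, 32, 14, 14)     # a view of c; no data is copied

manual = a + c_exp                   # explicit expansion
auto = a + c                         # automatic broadcasting

print(torch.equal(manual, auto))     # both paths give the same result
print(c_exp.stride())                # expanded dimensions have stride 0
print(c_exp.data_ptr() == c.data_ptr())  # same underlying storage as c
```

A stride of 0 on the expanded dimensions is how the same memory is reused for every position, which is where the memory saving comes from.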

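II. Merging and Splitting (merge or split)

2.1. cat: concatenation

import torch

# a holds 4 classes and b holds 5 classes; each class has 32 students with 8 scores.
a = torch.rand(4, 32, 8)
b = torch.rand(5, 32, 8)
# Concatenate along the class dimension.
print(torch.cat([a, b], dim=0).shape)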

Output:

torch.Size([9, 32, 8])

import torch

a1 = torch.rand(4, 3, 32, 32)
a2 = torch.rand(5, 3, 32, 32)
print(torch.cat([a1, a2], dim=0).shape)
print('====================================')
a3 = torch.rand(4, 1, 32, 32)
# print(torch.cat([a1, a3], dim=0))      # raises an error: the sizes on dim=1 differ (3 vs 1)
print(torch.cat([a1, a3], dim=1).shape)

Output:

torch.Size([9, 3, 32, 32])
====================================
torch.Size([4, 4, 32, 32])

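The rule behind the commented-out line is that cat only allows the sizes to differ on the dimension being concatenated; all other dimensions must match. A small sketch of that rule (the variable names failed and merged are illustrative only):

```python
import torch

a1 = torch.rand(4, 3, 32, 32)
a3 = torch.rand(4, 1, 32, 32)

# Concatenating on dim=0 fails because dim=1 differs (3 vs 1).
try:
    torch.cat([a1, a3], dim=0)
    failed = False
except RuntimeError as err:
    failed = True
    print("cat on dim=0 failed:", err)

# Concatenating on dim=1 works: only dim=1 differs, and it is the cat dimension.
merged = torch.cat([a1, a3], dim=1)
print(merged.shape)
```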

2.2. stack: creating a new dimension

import torch

a1 = torch.rand(4, 3, 32, 32)
a2 = torch.rand(4, 3, 32, 32)
print(torch.cat([a1, a2], dim=1).shape)
print('====================================')
print(torch.stack([a1, a2], dim=1).shape)    # inserts a new dimension in each tensor, then concatenates along it

a = torch.rand(32, 8)
b = torch.rand(32, 8)
print(torch.stack([a, b], dim=0).shape)

Output:

torch.Size([4, 6, 32, 32])
====================================
torch.Size([4, 2, 3, 32, 32])
torch.Size([2, 32, 8])

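Note: As a concrete use case, suppose classes a and b each have 60 students with 8 scores, so each is a tensor of shape [60, 8], and we want to combine the two classes into one table. cat would give [120, 8], which loses the boundary between the classes; stack gives [2, 60, 8], which keeps it. This also shows that stack requires all input tensors to have exactly the same shape.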
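The two-class example can be written out directly. This is a minimal sketch; it also checks that stack behaves like unsqueeze followed by cat, which is one way to think about it rather than a statement about how PyTorch implements it.

```python
import torch

a = torch.rand(60, 8)   # class a: 60 students, 8 scores
b = torch.rand(60, 8)   # class b: 60 students, 8 scores

stacked = torch.stack([a, b], dim=0)   # [2, 60, 8]: class boundary preserved
catted = torch.cat([a, b], dim=0)      # [120, 8]: one long list of students

# stack is equivalent to adding a new dimension to each tensor and then cat-ing.
manual = torch.cat([a.unsqueeze(0), b.unsqueeze(0)], dim=0)

print(stacked.shape, catted.shape)
print(torch.equal(stacked, manual))
print(torch.equal(stacked[1], b))      # index 1 on the new dimension is class b
```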

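2.3. split (by length) and chunk (by count)

Note: .split(length, dim): the first argument is the length of each piece (an int, or a list with one length per piece), and the second is the dimension to split along.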

import torch

c = torch.rand(2, 32, 8)
aa, bb = c.split([1, 1], dim=0)
print(aa.shape)
print(bb.shape)
print('====================================')
aa, bb = c.split(1, dim=0)
print(aa.shape)
print(bb.shape)

Output:

torch.Size([1, 32, 8])
torch.Size([1, 32, 8])
====================================
torch.Size([1, 32, 8])
torch.Size([1, 32, 8])

Note: .chunk(count, dim): the first argument is the number of pieces to produce, and the second is the dimension to split along.

import torch

c = torch.rand(8, 32, 8)
aa, bb = c.chunk(2, dim=0)  # the first argument is the number of pieces
print(aa.shape)
print(bb.shape)

Output:

torch.Size([4, 32, 8])
torch.Size([4, 32, 8])

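The examples above divide evenly. When the size is not an exact multiple, the last piece is simply smaller. A short sketch with a dimension of size 8 (the names by_len, by_list, and by_count are illustrative):

```python
import torch

c = torch.arange(16).reshape(8, 2)

# split by length: pieces of 3 rows each, the last one holds the remainder.
by_len = c.split(3, dim=0)
print([p.shape[0] for p in by_len])      # piece sizes along dim 0

# split by a list of lengths: the lengths must add up to the size of the dimension.
by_list = c.split([1, 2, 5], dim=0)
print([p.shape[0] for p in by_list])

# chunk by count: ask for 3 pieces, sizes are decided automatically.
by_count = c.chunk(3, dim=0)
print([p.shape[0] for p in by_count])

# Concatenating the pieces restores the original tensor.
print(torch.equal(torch.cat(by_count, dim=0), c))
```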

III. Math Operations

3.1. add/sub/mul/div: basic arithmetic

import torch

a = torch.rand(3, 4)
b = torch.rand(4)

print(a+b)
print(torch.add(a, b))

print(torch.all(torch.eq(a-b, torch.sub(a, b))))
print(torch.all(torch.eq(a*b, torch.mul(a, b))))
print(torch.all(torch.eq(a/b, torch.div(a, b))))

Output:

tensor([[0.5039, 1.2329, 1.4820, 0.7634],
        [1.1962, 1.2740, 1.1871, 0.6491],
        [1.0346, 1.0578, 0.9915, 0.7993]])
tensor([[0.5039, 1.2329, 1.4820, 0.7634],
        [1.1962, 1.2740, 1.1871, 0.6491],
        [1.0346, 1.0578, 0.9915, 0.7993]])
tensor(True)
tensor(True)
tensor(True)

3.2. matmul: matrix multiplication

import torch

a = torch.tensor([[3., 3.], [3., 3.]])
b = torch.ones(2, 2)
print(torch.mm(a, b))
print(torch.matmul(a, b))
print(a@b)

Output:

tensor([[6., 6.],
        [6., 6.]])
tensor([[6., 6.],
        [6., 6.]])
tensor([[6., 6.],
        [6., 6.]])

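Note: The products above are for 2D tensors (torch.mm only accepts 2D inputs). In deep learning the data is usually 3D or 4D, for example a batch of images has shape [batch, channel, height, width]. How is matrix multiplication defined for such tensors? The example below shows it.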

import torch

a = torch.rand(4, 3, 28, 64)
b = torch.rand(4, 3, 64, 32)

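# Matrix multiplication of 4D tensors: the leading dimensions act as batch dimensions.
# In effect, many pairs of matrices are multiplied in parallel.
# Only the last two dimensions take part in the product: [28, 64] @ [64, 32]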
print(torch.matmul(a, b).shape)

print('================================')
c = torch.rand(4, 1, 64, 32) # broadcasting expands the size-1 batch dimension to match a
print(torch.matmul(a, c).shape)

Output:

torch.Size([4, 3, 28, 32])
================================
torch.Size([4, 3, 28, 32])

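To make the "many matrix pairs in parallel" reading concrete, the sketch below compares one slice of the batched result with an ordinary 2D product, and the broadcast case with an explicit expand. The names ref and out_c_explicit and the tolerance of 1e-4 are choices made here for illustration.

```python
import torch

a = torch.rand(4, 3, 28, 64)
b = torch.rand(4, 3, 64, 32)

out = torch.matmul(a, b)                 # [4, 3, 28, 32]

# Each [i, j] slice is an ordinary 2D matrix product.
ref = a[1, 2] @ b[1, 2]                  # [28, 64] @ [64, 32]
print(torch.allclose(out[1, 2], ref, atol=1e-4))

# With a size-1 batch dimension, matmul broadcasts it before multiplying.
c = torch.rand(4, 1, 64, 32)
out_c = torch.matmul(a, c)
out_c_explicit = torch.matmul(a, c.expand(4, 3, 64, 32))
print(torch.allclose(out_c, out_c_explicit, atol=1e-4))
```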

3.3. pow (powers), and sqrt/rsqrt/exp/log

Note: pow(tensor, exponent): the first argument is the tensor and the second is the exponent, for example 2, 3, or 4.

import torch

a = torch.full([2, 2], 3.) # float fill value: a tensor of shape [2, 2] with every element 3.0
print(a.pow(2))
print(torch.pow(a, 2))
print(a**2)
print('=============================')
b = a**2
print(b.sqrt())
print(b.rsqrt())         # reciprocal of the square root

Output:

tensor([[9., 9.],
        [9., 9.]])
tensor([[9., 9.],
        [9., 9.]])
tensor([[9., 9.],
        [9., 9.]])
=============================
tensor([[3., 3.],
        [3., 3.]])
tensor([[0.3333, 0.3333],
        [0.3333, 0.3333]])


exp/log

import torch

a = torch.exp(torch.ones(2, 2))
print(a)
print(torch.log(a))     # natural logarithm (base e); use torch.log2 or torch.log10 for base 2 or 10

Output:

tensor([[2.7183, 2.7183],
        [2.7183, 2.7183]])
tensor([[1., 1.],
        [1., 1.]])

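A few identities tie the functions in this section together. The sketch below checks them numerically; the sample values in x are picked here so that the logarithms come out as whole numbers.

```python
import math
import torch

b = torch.full([2, 2], 9.)

# rsqrt is the reciprocal of sqrt, and pow(0.5) is the same as sqrt.
print(torch.allclose(b.rsqrt(), 1 / b.sqrt()))
print(torch.allclose(b.pow(0.5), b.sqrt()))

x = torch.tensor([1., 2., 8., 100.])
log2_x = torch.log2(x)                      # base-2 logarithm
log10_x = torch.log10(x)                    # base-10 logarithm

# Change of base: log2(x) = ln(x) / ln(2).
log2_via_ln = torch.log(x) / math.log(2)
print(torch.allclose(log2_x, log2_via_ln))
```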

3.5. round: rounding operations

import torch

a = torch.tensor(3.14)
# .floor() rounds down, .ceil() rounds up, .trunc() keeps the integer part, .frac() keeps the fractional part.
print(a.floor(), a.ceil(), a.trunc(), a.frac())

print(a.round())
b = torch.tensor(3.5)
print(b.round())

Output:

tensor(3.) tensor(4.) tensor(3.) tensor(0.1400)
tensor(3.)
tensor(4.)

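Two details are easy to miss with these functions: round sends exact halves to the nearest even integer, and floor and trunc differ for negative numbers. A short sketch with values chosen here to show both cases:

```python
import torch

# Exact halves are rounded to the nearest even integer.
x = torch.tensor([0.5, 1.5, 2.5, 3.5])
print(x.round())

# For negative numbers floor and trunc give different results.
n = torch.tensor(-3.14)
print(n.floor(), n.ceil(), n.trunc(), n.frac())
```

This is why b.round() above gives 4 for 3.5, while the same call on 2.5 would give 2.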

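3.6. clamp (clipping), used frequently

Note: clamp is mainly used for gradient clipping. Training can suffer from vanishing gradients (the gradient is very small, close to 0, which clamp does not address) and exploding gradients (the gradient is very large). When training is unstable, print the gradient norm: w.grad.norm(2) is the L2 norm of the gradient. As a rule of thumb, a norm around 100 or 1000 is already large, and a norm below about 10 is usually reasonable.

a.clamp(10): every element of tensor a that is smaller than 10 is set to 10, so the minimum value becomes 10. a.clamp(min, max) bounds the values on both sides.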

import torch

grad = torch.rand(2, 3)*15
print(grad)
print(grad.max(), grad.median(), grad.min())
print('============================================')
print(grad.clamp(10))     # lower bound of 10: values below 10 become 10

print(grad.clamp(8, 15))
print(torch.clamp(grad, 8, 15))

Output:

tensor([[11.0328,  4.9081,  2.3248],
        [11.3747,  3.9017, 11.5049]])
tensor(11.5049) tensor(4.9081) tensor(2.3248)
============================================
tensor([[11.0328, 10.0000, 10.0000],
        [11.3747, 10.0000, 11.5049]])
tensor([[11.0328,  8.0000,  8.0000],
        [11.3747,  8.0000, 11.5049]])
tensor([[11.0328,  8.0000,  8.0000],
        [11.3747,  8.0000, 11.5049]])

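The gradient check described above can be sketched on a toy model. Everything in this example (the Linear layer, the random data, the scale factor of 100, and max_norm = 10.0) is made up for illustration. Note that clamp clips each element separately, while torch.nn.utils.clip_grad_norm_ rescales all gradients together so that their overall norm stays under the limit.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(8, 1)
x = torch.randn(16, 8)
y = torch.randn(16, 1) * 100        # large targets, to provoke large gradients

loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()

# Inspect the L2 norm of each parameter's gradient.
for name, p in model.named_parameters():
    print(name, p.grad.norm(2))

# Element-wise clipping with clamp (on a copy, so the real gradients stay untouched).
clamped = model.weight.grad.clone().clamp(-1.0, 1.0)

# Norm-based clipping: rescales the gradients in place, returns the norm before clipping.
max_norm = 10.0
total_before = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
total_after = torch.stack([p.grad.norm(2) for p in model.parameters()]).norm(2)
print(total_before, total_after)
```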

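IV. Statistics

4.1. norm, and prod (product of tensor elements)

Note: For the definitions of vector norms and matrix norms, see my earlier blog post (section 2.2, vector norms and matrix norms).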

import torch

a = torch.full([8], 1.)  # float fill value; norm expects a floating-point tensor
b = a.view(2, 4)
c = a.view(2, 2, 2)
print(a, '\n', b,'\n', c)
print('=============================================')
print(a.norm(1), b.norm(1), c.norm(1))
print(a.norm(2), b.norm(2), c.norm(2))
print('=============================================')
print(b.norm(1, dim=1))
print(b.norm(2, dim=1))
print('=============================================')
print(c.norm(1, dim=0))
print(c.norm(2, dim=0))
print(torch.norm(c, p=2, dim=0)) # same as c.norm(2, dim=0); omitting p gives the same result here

Output:

tensor([1., 1., 1., 1., 1., 1., 1., 1.]) 
 tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.]]) 
 tensor([[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]])
=============================================
tensor(8.) tensor(8.) tensor(8.)
tensor(2.8284) tensor(2.8284) tensor(2.8284)
=============================================
tensor([4., 4.])
tensor([2., 2.])
=============================================
tensor([[2., 2.],
        [2., 2.]])
tensor([[1.4142, 1.4142],
        [1.4142, 1.4142]])
tensor([[1.4142, 1.4142],
        [1.4142, 1.4142]])
        
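With a tensor of all ones it is hard to see what each norm computes. The sketch below uses the values 1 to 8 (chosen here for illustration) and compares norm against the formulas written out by hand; it also shows prod, which multiplies all elements together.

```python
import torch

b = torch.arange(1., 9.).view(2, 4)      # [[1, 2, 3, 4], [5, 6, 7, 8]]

# L1 norm: sum of absolute values. L2 norm: square root of the sum of squares.
l1 = b.norm(1)
l2 = b.norm(2)
print(torch.allclose(l1, b.abs().sum()))
print(torch.allclose(l2, b.pow(2).sum().sqrt()))

# With dim=1 the norm is taken along each row, and that dimension disappears.
row_l1 = b.norm(1, dim=1)                # shape [2]
print(row_l1)

# prod is the product of all elements.
p = torch.tensor([2., 3., 4.]).prod()
print(p)
```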

4.2. mean/sum/max/min/argmin/argmax

import torch

a = torch.rand(2, 4)
print(a)
print(a.max(), a.min(), a.mean())  # maximum, minimum, mean
print(a.prod()) # product of all elements
print(a.sum())  # sum of all elements

print(a.argmax(), a.argmin())

Output:

tensor([[0.4677, 0.8331, 0.4240, 0.9348],
        [0.0192, 0.2354, 0.9979, 0.0077]])
tensor(0.9979) tensor(0.0077) tensor(0.4900)
tensor(5.3340e-06)
tensor(3.9197)
tensor(6) tensor(7)

Note: The results show that when no dim is given, max/min/argmax/argmin operate on the tensor flattened to 1D, which is why argmax/argmin above return 6 and 7, indices into the flattened tensor.

import torch

a = torch.rand(4, 5)
print(a)
print(a.max(dim=1))                 # values and indices both have shape [4]
print('==============================')
# keepdim=True keeps the reduced dimension with size 1.
print(a.max(dim=1, keepdim=True))   # values and indices both have shape [4, 1]

Output:

tensor([[0.0956, 0.1968, 0.2054, 0.3631, 0.5661],
        [0.8228, 0.9709, 0.1276, 0.2207, 0.5825],
        [0.7764, 0.2675, 0.1439, 0.3109, 0.6960],
        [0.7047, 0.5668, 0.3775, 0.6214, 0.0674]])
torch.return_types.max(
values=tensor([0.5661, 0.9709, 0.7764, 0.7047]),
indices=tensor([4, 1, 0, 0]))
==============================
torch.return_types.max(
values=tensor([[0.5661],
        [0.9709],
        [0.7764],
        [0.7047]]),
indices=tensor([[4],
        [1],
        [0],
        [0]]))

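A flat index from argmax can be converted back to a row and column, and max(dim=...) can be unpacked into values and indices. A minimal sketch with a small hand-written tensor (the values are made up so that there are no ties):

```python
import torch

a = torch.tensor([[0.1, 0.9, 0.3],
                  [0.7, 0.2, 0.8]])

# Without dim, argmax returns an index into the flattened tensor.
flat_idx = a.argmax().item()
row, col = divmod(flat_idx, a.shape[1])   # recover the 2D position
print(flat_idx, (row, col))

# With dim=1, max returns a (values, indices) pair, one entry per row.
values, indices = a.max(dim=1)
print(values, indices)
print(torch.equal(indices, a.argmax(dim=1)))

# keepdim=True keeps the reduced dimension, giving shape [2, 1].
print(a.max(dim=1, keepdim=True).values.shape)
```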

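4.3. kthvalue() and topk()

Note: topk(3, dim=1) returns the 3 largest values along dim=1 together with their indices; setting largest=False returns the 3 smallest instead.
Note: kthvalue(k, dim=1) returns the k-th smallest value (it only counts from the smallest). With 10 elements per row, the 8th smallest is the 3rd largest.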

import torch

a = torch.rand(5, 10)
print(a.topk(3, dim=1))       # the 3 largest elements and their indices
print('==========================================================')
print(a.topk(3, dim=1, largest=False))  # the 3 smallest elements and their indices
print('==========================================================')
print(a.kthvalue(3))
print(a.kthvalue(3,dim=1))

Output:

torch.return_types.topk(
values=tensor([[0.9644, 0.8750, 0.8059],
        [0.9445, 0.9039, 0.8314],
        [0.9025, 0.8567, 0.8550],
        [0.9710, 0.8377, 0.8066],
        [0.8984, 0.8439, 0.8386]]),
indices=tensor([[5, 7, 2],
        [3, 6, 4],
        [1, 5, 0],
        [3, 9, 6],
        [3, 8, 2]]))
==========================================================
torch.return_types.topk(
values=tensor([[0.0790, 0.2262, 0.3413],
        [0.1071, 0.1207, 0.1217],
        [0.2904, 0.3274, 0.3424],
        [0.1910, 0.2919, 0.5602],
        [0.2474, 0.2730, 0.6032]]),
indices=tensor([[6, 8, 3],
        [2, 5, 0],
        [2, 4, 3],
        [4, 2, 7],
        [4, 0, 7]]))
==========================================================
torch.return_types.kthvalue(
values=tensor([0.3413, 0.1217, 0.3424, 0.5602, 0.6032]),
indices=tensor([3, 0, 3, 7, 7]))
torch.return_types.kthvalue(
values=tensor([0.3413, 0.1217, 0.3424, 0.5602, 0.6032]),
indices=tensor([3, 0, 3, 7, 7]))

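The relationship "8th smallest of 10 equals 3rd largest" can be checked directly. In this sketch each row is a random permutation of 0 to 9, an assumption made here so that all values in a row are distinct and the comparison is exact.

```python
import torch

torch.manual_seed(0)
# Each row is a random permutation of 0..9, so there are no ties within a row.
a = torch.stack([torch.randperm(10) for _ in range(5)]).float()

# topk returns values sorted from largest to smallest, so the last column is the 3rd largest.
third_largest = a.topk(3, dim=1).values[:, -1]
eighth_smallest = a.kthvalue(8, dim=1).values
print(torch.equal(third_largest, eighth_smallest))

# With largest=False the last column of topk(3) is the 3rd smallest, i.e. kthvalue(3).
third_smallest = a.topk(3, dim=1, largest=False).values[:, -1]
print(torch.equal(third_smallest, a.kthvalue(3, dim=1).values))
```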

4.4. Comparison operators >, >=, <, <=, !=, ==

gt means greater than (>), ge means greater than or equal to (>=), and eq means equal to (==).

import torch

a = torch.rand(5, 5)
print(a>0.2)
print(torch.gt(a, 0.2))
print(a!=0)

Output:

tensor([[False,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True],
        [ True,  True,  True,  True, False],
        [ True,  True, False,  True,  True],
        [ True,  True,  True,  True,  True]])
tensor([[False,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True],
        [ True,  True,  True,  True, False],
        [ True,  True, False,  True,  True],
        [ True,  True,  True,  True,  True]])
tensor([[True, True, True, True, True],
        [True, True, True, True, True],
        [True, True, True, True, True],
        [True, True, True, True, True],
        [True, True, True, True, True]])

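The boolean tensors returned by comparisons are most often used as masks. The sketch below shows the difference between gt and ge on a boundary value, mask-based selection and counting, and the difference between the element-wise torch.eq and the whole-tensor torch.equal. The sample values are chosen here for illustration.

```python
import torch

a = torch.tensor([0.2, 0.5, 0.8])

gt_mask = torch.gt(a, 0.5)      # strictly greater than 0.5
ge_mask = torch.ge(a, 0.5)      # greater than or equal to 0.5
print(gt_mask, ge_mask)

selected = a[ge_mask]           # keep only the elements where the mask is True
count = ge_mask.sum().item()    # True counts as 1, so sum() counts the matches
print(selected, count)

same_elementwise = torch.eq(a, a)   # boolean tensor, one entry per element
same_whole = torch.equal(a, a)      # a single Python bool for the whole tensor
print(same_elementwise, same_whole)
```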
