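5. nn.L1Loss

Function: computes the absolute value of the difference between inputs and target.
Main parameter: reduction, the reduction mode, one of none/sum/mean. none returns the loss for each element; sum adds up all element-wise losses and returns a scalar; mean averages the element-wise losses and returns a scalar.
Formula: $l_{n}=\left|x_{n}-y_{n}\right|$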

nn.L1Loss(size_average=None,reduce=None,reduction='mean')

The following code shows what it does:

import torch
import torch.nn as nn
import torch.nn.functional as F
import matplotlib.pyplot as plt
import numpy as np
from toolss.common_tools import set_seed  # helper from the author's own toolss package

set_seed(1)  # set the random seed

inputs = torch.ones((2, 2))
target = torch.ones((2, 2)) * 3  # [[3,3],[3,3]]

loss_f = nn.L1Loss(reduction='none')
loss = loss_f(inputs, target)

print("input:{}\ntarget:{}\nL1 loss:{}".format(inputs, target, loss))

The output is:

input:tensor([[1., 1.],
        [1., 1.]])
target:tensor([[3., 3.],
        [3., 3.]])
L1 loss:tensor([[2., 2.],
        [2., 2.]])
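
As a small sketch of how the three reduction modes relate, the snippet below runs nn.L1Loss on the same tensors with each mode. It reuses the values from the example above; the variable names are chosen only for illustration.

```python
import torch
import torch.nn as nn

inputs = torch.ones((2, 2))
target = torch.ones((2, 2)) * 3

# 'none' keeps one loss value per element, same shape as the inputs
loss_none = nn.L1Loss(reduction='none')(inputs, target)
# 'sum' adds all element-wise losses into a scalar
loss_sum = nn.L1Loss(reduction='sum')(inputs, target)
# 'mean' divides that sum by the number of elements
loss_mean = nn.L1Loss(reduction='mean')(inputs, target)

print(loss_sum.item(), loss_mean.item())  # 8.0 2.0
```

Here mean equals sum divided by the 4 elements, which is why both none and mean show the value 2 for this input.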

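6. nn.MSELoss

Function: computes the square of the difference between inputs and target.
Main parameter: reduction, the reduction mode, one of none/sum/mean. none returns the loss for each element; sum adds up all element-wise losses and returns a scalar; mean averages the element-wise losses and returns a scalar.
Formula: $l_{n}=\left(x_{n}-y_{n}\right)^{2}$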

nn.MSELoss(size_average=None,reduce=None,reduction='mean')

The following code shows what it does:

import torch
import torch.nn as nn
import torch.nn.functional as F
import matplotlib.pyplot as plt
import numpy as np
from toolss.common_tools import set_seed  # helper from the author's own toolss package

set_seed(1)  # set the random seed

inputs = torch.ones((2, 2))
target = torch.ones((2, 2)) * 3  # [[3,3],[3,3]]

loss_f_mse = nn.MSELoss(reduction='none')
loss_mse = loss_f_mse(inputs, target)

print("MSE loss:{}".format(loss_mse))

The output is:

MSE loss:tensor([[4., 4.],
        [4., 4.]])
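
To connect the formula with the result, here is a minimal check that squares the difference by hand and compares it with nn.MSELoss. The names manual and builtin are illustrative only.

```python
import torch
import torch.nn as nn

inputs = torch.ones((2, 2))
target = torch.ones((2, 2)) * 3

# l_n = (x_n - y_n)^2, computed element by element
manual = (inputs - target) ** 2
builtin = nn.MSELoss(reduction='none')(inputs, target)

# With the default reduction='mean', the element-wise values are averaged
mse_mean = nn.MSELoss(reduction='mean')(inputs, target)

print(torch.equal(manual, builtin), mse_mean.item())  # True 4.0
```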

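7. nn.SmoothL1Loss

Function: a smoothed version of L1Loss.
Main parameter: reduction, the reduction mode, one of none/sum/mean. none returns the loss for each element; sum adds up all element-wise losses and returns a scalar; mean averages the element-wise losses and returns a scalar.
Formula: $\operatorname{loss}(x, y)=\frac{1}{n} \sum_{i} z_{i}$, where
$z_{i}=\left\{\begin{array}{ll} 0.5\left(x_{i}-y_{i}\right)^{2}, & \text { if }\left|x_{i}-y_{i}\right|<1 \\ \left|x_{i}-y_{i}\right|-0.5, & \text { otherwise } \end{array}\right.$
Here $x_i$ is the model output and $y_i$ is the label. Plotting Smooth L1 loss and L1 loss against $x_i-y_i$ shows how the two differ.

On such a plot the X axis is $x_i-y_i$. When $-1<x_i-y_i<1$, the loss takes the squared (L2) form: the absolute value is replaced by $0.5(x_i-y_i)^2$, which makes the curve smooth around zero. Outside that interval the loss grows linearly like L1 loss, which limits the influence of outliers.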

nn.SmoothL1Loss(size_average=None,reduce=None,reduction='mean')

The code below shows how SmoothL1Loss behaves, plotting it alongside L1 loss for comparison:

inputs = torch.linspace(-3, 3, steps=500)  # 500 evenly spaced points from -3 to 3
target = torch.zeros_like(inputs)  # same shape as inputs, all zeros

loss_f = nn.SmoothL1Loss(reduction='none')  # Smooth L1 loss

loss_smooth = loss_f(inputs, target)

loss_l1 = np.abs(inputs.numpy())  # L1 loss

plt.plot(inputs.numpy(), loss_smooth.numpy(), label='Smooth L1 Loss')
plt.plot(inputs.numpy(), loss_l1, label='L1 loss')
plt.xlabel('x_i - y_i')
plt.ylabel('loss value')
plt.legend()
plt.grid()
plt.show()

The code above plots both curves, and the result matches the shape described earlier.
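
The piecewise formula can also be checked numerically. The sketch below writes it out with torch.where and compares it with nn.SmoothL1Loss at its default settings; the variable names are illustrative.

```python
import torch
import torch.nn as nn

diff = torch.linspace(-3, 3, steps=7)  # values of x_i - y_i: -3, -2, ..., 3
target = torch.zeros_like(diff)

abs_diff = diff.abs()
# Quadratic branch where |x_i - y_i| < 1, linear branch otherwise
manual = torch.where(abs_diff < 1, 0.5 * diff ** 2, abs_diff - 0.5)
builtin = nn.SmoothL1Loss(reduction='none')(diff, target)

print(torch.allclose(manual, builtin))  # True
```

At $|x_i-y_i|=1$ both branches give 0.5, so the two pieces join without a jump.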

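8. nn.PoissonNLLLoss

Function: negative log-likelihood loss for a Poisson-distributed target.
Main parameters:

log_input: whether the input is given in log form; this decides which formula is used.
full: whether to compute the full loss by adding the Stirling approximation term; defaults to False.
eps: a small correction term that keeps log(input) from producing inf or nan when the input is 0.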

nn.PoissonNLLLoss(log_input=True,full=False,size_average=None,eps=1e-08,reduce=None,reduction='mean')

log_input = True: loss(input, target) = exp(input) - target * input
log_input = False: loss(input, target) = input - target * log(input + eps)

Example usage:

inputs = torch.randn((2, 2))
target = torch.randn((2, 2))

loss_f = nn.PoissonNLLLoss(log_input=True, full=False, reduction='none')
loss = loss_f(inputs, target)
print("input:{}\ntarget:{}\nPoisson NLL loss:{}".format(inputs, target, loss))

The output is:

input:tensor([[0.6614, 0.2669],
        [0.0617, 0.6213]])
target:tensor([[-0.4519, -0.1661],
        [-1.5228,  0.3817]])
Poisson NLL loss:tensor([[2.2363, 1.3503],
        [1.1575, 1.6242]])
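
Both formulas can be reproduced by hand. The sketch below uses small made-up values (non-negative counts as targets, which is an assumption of this example rather than something the original code does) and compares each formula with nn.PoissonNLLLoss.

```python
import torch
import torch.nn as nn

# Hypothetical values: inputs are log-rates, targets are observed counts
inputs = torch.tensor([[0.5, -0.2], [1.0, 0.3]])
target = torch.tensor([[1., 0.], [3., 2.]])
eps = 1e-8

# log_input=True: the input is interpreted as log(rate)
loss_log = nn.PoissonNLLLoss(log_input=True, full=False, reduction='none')(inputs, target)
manual_log = torch.exp(inputs) - target * inputs

# log_input=False: the input is the rate itself, so it has to be positive
rate = torch.exp(inputs)
loss_rate = nn.PoissonNLLLoss(log_input=False, full=False, eps=eps, reduction='none')(rate, target)
manual_rate = rate - target * torch.log(rate + eps)

print(torch.allclose(loss_log, manual_log), torch.allclose(loss_rate, manual_rate))
```

Because rate is exp(inputs) here, the two modes describe the same model and give nearly identical losses.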

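9. nn.KLDivLoss

Function: computes the KL divergence (Kullback-Leibler divergence), also called relative entropy.
Note: the input must already be log-probabilities, for example the output of nn.LogSoftmax().
Main parameter: reduction, one of none/sum/mean/batchmean; batchmean averages over the batch dimension.
Formula: $D_{K L}(P \| Q)=E_{x \sim P}\left[\log \frac{P(x)}{Q(x)}\right]=E_{x \sim P}[\log P(x)-\log Q(x)]=\sum_{i=1}^{N} P\left(x_{i}\right)\left(\log P\left(x_{i}\right)-\log Q\left(x_{i}\right)\right)$
Here P is the true distribution and Q is the distribution produced by the model; the goal is to make Q approach P. The loss PyTorch computes for each element is
$l_{n}=y_{n} \cdot\left(\log y_{n}-x_{n}\right)$
This is the loss for a single element, so there is no summation sign. $P(x_i)$ in the KL formula corresponds to $y_n$ in the loss. The loss subtracts $x_n$ where the KL formula subtracts $\log Q(x_i)$, which is why the input has to be converted to log-probabilities beforehand: after a log-softmax, $x_n$ can be subtracted directly. In PyTorch this is done with nn.LogSoftmax().

nn.KLDivLoss also has a special reduction value, batchmean, which averages over the batch dimension: the divisor is the batch size rather than the number of elements.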

nn.KLDivLoss(size_average=None, reduce=None,reduction='mean')

The following code demonstrates nn.KLDivLoss:

inputs = torch.tensor([[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]])
inputs_log = torch.log(inputs)  # log-probabilities; computed here but not passed to the loss below
target = torch.tensor([[0.9, 0.05, 0.05], [0.1, 0.7, 0.2]], dtype=torch.float)

loss_f_none = nn.KLDivLoss(reduction='none')
loss_f_mean = nn.KLDivLoss(reduction='mean')
loss_f_bs_mean = nn.KLDivLoss(reduction='batchmean')

# Note: the raw probabilities `inputs` are passed here rather than `inputs_log`,
# so the printed values are y * (log y - x) with x taken as a plain probability
loss_none = loss_f_none(inputs, target)
loss_mean = loss_f_mean(inputs, target)
loss_bs_mean = loss_f_bs_mean(inputs, target)

print("loss_none:\n{}\nloss_mean:\n{}\nloss_bs_mean:\n{}".format(loss_none, loss_mean, loss_bs_mean))

The output is:

loss_none:
tensor([[-0.5448, -0.1648, -0.1598],
        [-0.2503, -0.4597, -0.4219]])
loss_mean:
-0.3335360586643219
loss_bs_mean:
-1.000608205795288
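
For comparison, here is a minimal sketch of the usage the note above calls for, where the input is passed as log-probabilities. It reuses the same two distributions and checks the element-wise formula and the batchmean divisor by hand; it is an illustration, not a replacement for the example above.

```python
import torch
import torch.nn as nn

inputs = torch.tensor([[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]])    # model distribution Q
target = torch.tensor([[0.9, 0.05, 0.05], [0.1, 0.7, 0.2]])  # true distribution P
inputs_log = torch.log(inputs)  # KLDivLoss expects log-probabilities as input

loss_none = nn.KLDivLoss(reduction='none')(inputs_log, target)
# l_n = y_n * (log y_n - x_n), with x_n = log Q
manual = target * (torch.log(target) - inputs_log)

loss_sum = nn.KLDivLoss(reduction='sum')(inputs_log, target)
# 'batchmean' divides the summed loss by the batch size (2 here), not by the 6 elements
loss_bs_mean = nn.KLDivLoss(reduction='batchmean')(inputs_log, target)

print(torch.allclose(loss_none, manual), torch.allclose(loss_bs_mean, loss_sum / 2))
```

With log-probabilities as input, the summed loss is a true KL divergence and is therefore non-negative.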

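10. nn.MarginRankingLoss

Function: compares two vectors element by element; used for ranking tasks.
Special note: in the example below the loss comes back as an n*n matrix. This is a result of broadcasting, explained after the formula.
Main parameters:

margin: the margin, i.e. the required difference between x1 and x2.
reduction: the reduction mode, one of none/sum/mean.
Loss formula: $\operatorname{loss}(x_1, x_2, y)=\max (0,-y *(x_1-x_2)+\operatorname{margin})$ Here y is the label and can only be 1 or -1; x1 and x2 are the two sets of data, and the difference is taken between corresponding elements.

Consider the role of -y in the formula (with margin 0):

When y=1, x1 is expected to be larger than x2; if x1 is greater than x2, no loss is produced.
When y=-1, x2 is expected to be larger than x1; if x2 is greater than x1, no loss is produced.

As for the n*n matrix mentioned in the special note: in the example below x1 and x2 have shape (3, 1), so x1 - x2 has shape (3, 1), and the 3 target labels broadcast along the other axis, giving a 3*3 loss matrix. Entry (i, j) is the loss for the pair (x1[i], x2[i]) under label target[j]. Because every element of x2 is 2 in that example, row i can equally be read as comparing x1[i] with each of the 3 elements of x2. When x1, x2 and the target all share the same 1-D shape, the loss is simply element-wise.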

nn.MarginRankingLoss(margin=0.0,size_average=None,reduce=None,reduction='mean')

The following code shows how nn.MarginRankingLoss is used:

x1 = torch.tensor([[1], [2], [3]], dtype=torch.float)
x2 = torch.tensor([[2], [2], [2]], dtype=torch.float)

target = torch.tensor([[1, 1, -1]], dtype=torch.float)  # shape (1, 3): broadcasts against the (3, 1) inputs

loss_f_none = nn.MarginRankingLoss(margin=0, reduction='none')

loss = loss_f_none(x1, x2, target)

print(loss)

The output is:

tensor([[1., 1., 0.],
        [0., 0., 0.],
        [0., 0., 1.]])
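
For the plain element-wise case, the sketch below passes three 1-D tensors of the same shape and reproduces the formula with torch.clamp. The margin of 0.5 is an arbitrary value picked for this illustration.

```python
import torch
import torch.nn as nn

x1 = torch.tensor([1., 2., 3.])
x2 = torch.tensor([2., 2., 2.])
y = torch.tensor([1., 1., -1.])  # 1: x1 should rank higher; -1: x2 should rank higher
margin = 0.5                     # illustrative value

builtin = nn.MarginRankingLoss(margin=margin, reduction='none')(x1, x2, y)
# max(0, -y * (x1 - x2) + margin)
manual = torch.clamp(-y * (x1 - x2) + margin, min=0)

print(builtin)  # tensor([1.5000, 0.5000, 1.5000])
```

The second pair is tied (2 vs 2), so with a positive margin it still produces a loss of 0.5: the preferred element has to win by at least the margin.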

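11. nn.MultiLabelMarginLoss

Function: multi-label margin loss.
Main parameters:

reduction: the reduction mode, one of none/sum/mean.
Example: in a four-class task where sample x belongs to classes 0 and 3, the label is written as [0,3,-1,-1], not [1,0,0,1].
Loss formula: $\operatorname{loss}(x, y)=\sum_{i j} \frac{\max (0,1-(x[y[j]]-x[i]))}{\text{x.size}(0)}$ The denominator x.size(0) is the number of neurons in the output vector. In the numerator, i ranges from 0 to x.size(0)-1 and j ranges from 0 to y.size(0)-1, with y[j] greater than or equal to 0 and i not equal to any y[j].

With the four-class label [0,3,-1,-1], x[y[j]] in the numerator can only be x[0] or x[3], and x[i] can only be x[1] or x[2].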

nn.MultiLabelMarginLoss(size_average=None,reduce=None,reduction='mean')

The following code shows MultiLabelMarginLoss in practice:

x = torch.tensor([[0.1, 0.2, 0.4, 0.8]])
y = torch.tensor([[0, 3, -1, -1]], dtype=torch.long)

loss_f = nn.MultiLabelMarginLoss(reduction='none')

loss = loss_f(x, y)

print(loss)

The output is:

tensor([0.8500])

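The same loss can be computed by hand:

x = x[0]
# Every term is positive for these values, so max(0, .) is left out
item_1 = (1-(x[0] - x[1])) + (1 - (x[0] - x[2]))    # target class 0 against non-target classes 1 and 2
item_2 = (1-(x[3] - x[1])) + (1 - (x[3] - x[2]))    # target class 3 against non-target classes 1 and 2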

loss_h = (item_1 + item_2) / x.shape[0]

print(loss_h)

The result matches the value returned by PyTorch's MultiLabelMarginLoss.
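
The hand calculation can be generalized into a small function that keeps the max(0, .) clamp and reads the target classes from the label vector. This is a sketch for a single sample; multilabel_margin_manual is a hypothetical helper name, not part of PyTorch.

```python
import torch
import torch.nn as nn

def multilabel_margin_manual(x, y):
    """Loss for one sample: x is a 1-D score vector, y a 1-D label vector padded with -1."""
    # Target classes are the entries of y before the first -1
    targets = []
    for label in y.tolist():
        if label < 0:
            break
        targets.append(label)
    # Every remaining class index is a non-target class
    others = [i for i in range(x.shape[0]) if i not in targets]
    total = torch.tensor(0.0)
    for j in targets:
        for i in others:
            total = total + torch.clamp(1 - (x[j] - x[i]), min=0)
    return total / x.shape[0]

x = torch.tensor([[0.1, 0.2, 0.4, 0.8]])
y = torch.tensor([[0, 3, -1, -1]], dtype=torch.long)

manual = multilabel_margin_manual(x[0], y[0])
builtin = nn.MultiLabelMarginLoss(reduction='none')(x, y)
print(manual, builtin)
```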

12. nn.SoftMarginLoss

Function: computes the logistic loss for binary classification.
Main parameter: reduction, the reduction mode, one of none/sum/mean.
Loss formula: $\operatorname{loss}(x, y)=\sum_{i} \frac{\log (1+\exp (-y[i] * x[i]))}{\text{x.nelement}()}$

nn.SoftMarginLoss(size_average=None,reduce=None,reduction='mean')

The following code shows nn.SoftMarginLoss in action:

inputs = torch.tensor([[0.3, 0.7], [0.5, 0.5]])
target = torch.tensor([[-1, 1], [1, -1]], dtype=torch.float)

loss_f = nn.SoftMarginLoss(reduction='none')

loss = loss_f(inputs, target)

print("SoftMargin: ", loss)

The corresponding output is:

SoftMargin:  tensor([[0.8544, 0.4032],
        [0.4741, 0.9741]])
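
As a quick check of the formula, the sketch below evaluates log(1 + exp(-y * x)) directly and compares it with nn.SoftMarginLoss, including the division by x.nelement() that the mean reduction performs.

```python
import torch
import torch.nn as nn

inputs = torch.tensor([[0.3, 0.7], [0.5, 0.5]])
target = torch.tensor([[-1., 1.], [1., -1.]])

# Element-wise logistic loss
manual = torch.log(1 + torch.exp(-target * inputs))
builtin = nn.SoftMarginLoss(reduction='none')(inputs, target)

# reduction='mean' sums the element-wise terms and divides by the element count
mean_loss = nn.SoftMarginLoss(reduction='mean')(inputs, target)
manual_mean = manual.sum() / inputs.nelement()

print(torch.allclose(manual, builtin), torch.allclose(mean_loss, manual_mean))
```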

13. nn.MultiLabelSoftMarginLoss

Function: the multi-label version of SoftMarginLoss.
Main parameters:

weight: a weight applied to the loss of each class.
reduction: the reduction mode, one of none/sum/mean.

Loss formula: $\operatorname{loss}(x, y)=-\frac{1}{C} \sum_{i}\left[y[i] * \log \left((1+\exp (-x[i]))^{-1}\right)+(1-y[i]) * \log \left(\frac{\exp (-x[i])}{1+\exp (-x[i])}\right)\right]$ where C is the number of classes.

nn.MultiLabelSoftMarginLoss(weight=None,size_average=None,reduce=None,reduction='mean')

The following code shows what it does:

inputs = torch.tensor([[0.3, 0.7, 0.8]])
target = torch.tensor([[0, 1, 1]], dtype=torch.float)

loss_f = nn.MultiLabelSoftMarginLoss(reduction='none')

loss = loss_f(inputs, target)

print("MultiLabel SoftMargin: ", loss)

The corresponding output is:

MultiLabel SoftMargin:  tensor([0.5429])
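
The formula can be rewritten with the log-sigmoid function, since $(1+\exp(-x))^{-1}$ is sigmoid(x) and $\exp(-x)/(1+\exp(-x))$ is sigmoid(-x). The sketch below uses F.logsigmoid on that basis and compares the result with the built-in loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

inputs = torch.tensor([[0.3, 0.7, 0.8]])
target = torch.tensor([[0., 1., 1.]])

# log(sigmoid(x)) for labels equal to 1, log(sigmoid(-x)) for labels equal to 0
per_class = target * F.logsigmoid(inputs) + (1 - target) * F.logsigmoid(-inputs)
# Average over the C classes and flip the sign
manual = -per_class.mean(dim=1)

builtin = nn.MultiLabelSoftMarginLoss(reduction='none')(inputs, target)
print(manual, builtin)
```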

14. nn.MultiMarginLoss

Function: computes the multi-class hinge loss.
Main parameters:

p: either 1 or 2.
weight: a weight applied to the loss of each class.
margin: the margin value.
reduction: the reduction mode, one of none/sum/mean.
Loss formula: $\operatorname{loss}(x, y)=\frac{\sum_{i} \max (0, \operatorname{margin}-x[y]+x[i])^{p}}{\text{x.size}(0)}$ where $i \in\{0, \cdots, \text{x.size}(0)-1\}$, $0 \leq y \leq \text{x.size}(0)-1$, and $i \neq y$.

nn.MultiMarginLoss(p=1,margin=1.0,weight=None,size_average=None,reduce=None,reduction='mean')

A code example:

x = torch.tensor([[0.1, 0.2, 0.7], [0.2, 0.5, 0.3]])
y = torch.tensor([1, 2], dtype=torch.long)

loss_f = nn.MultiMarginLoss(reduction='none')

loss = loss_f(x, y)

print("Multi Margin Loss: ", loss)

The output is:

Multi Margin Loss:  tensor([0.8000, 0.7000])
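
The two output values can be reproduced by applying the formula to one sample at a time. In the sketch below, multi_margin_manual is a hypothetical helper written for this illustration, not a PyTorch function.

```python
import torch
import torch.nn as nn

def multi_margin_manual(x, y, p=1, margin=1.0):
    """x: 1-D class scores for one sample; y: index of the true class (int)."""
    total = torch.tensor(0.0)
    for i in range(x.shape[0]):
        if i == y:
            continue  # the true class itself is skipped
        total = total + torch.clamp(margin - x[y] + x[i], min=0) ** p
    return total / x.shape[0]

x = torch.tensor([[0.1, 0.2, 0.7], [0.2, 0.5, 0.3]])
y = torch.tensor([1, 2], dtype=torch.long)

manual = torch.stack([multi_margin_manual(x[n], int(y[n])) for n in range(x.shape[0])])
builtin = nn.MultiMarginLoss(reduction='none')(x, y)
print(manual, builtin)
```

For the first sample the true class is 1, so the terms are (1 - 0.2 + 0.1) + (1 - 0.2 + 0.7) = 2.4, and dividing by 3 classes gives 0.8.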

15. nn.TripletMarginLoss

Function: computes the triplet loss, commonly used in face verification.
Main parameters:

p: the order of the norm; defaults to 2.
margin: the margin value.
reduction: the reduction mode, one of none/sum/mean.
Loss formula: $L(a, p, n)=\max \left\{d\left(a_{i}, p_{i}\right)-d\left(a_{i}, n_{i}\right)+\text{margin}, 0\right\}$ where $d\left(x_{i}, y_{i}\right)=\left\|\mathbf{x}_{i}-\mathbf{y}_{i}\right\|_{p}$

nn.TripletMarginLoss(margin=1.0,p=2.0,eps=1e-06,swap=False,size_average=None,reduce=None,reduction='mean')

The following code shows how it is used:

anchor = torch.tensor([[1.]])
pos = torch.tensor([[2.]])
neg = torch.tensor([[0.5]])

loss_f = nn.TripletMarginLoss(margin=1.0, p=1)

loss = loss_f(anchor, pos, neg)

print("Triplet Margin Loss", loss)

The corresponding output is:

Triplet Margin Loss tensor(1.5000)
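
The value 1.5 follows directly from the formula: the anchor-positive distance is 1, the anchor-negative distance is 0.5, and the margin is 1. The sketch below reproduces it with F.pairwise_distance, using the same p=1 norm as the example above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

anchor = torch.tensor([[1.]])
pos = torch.tensor([[2.]])
neg = torch.tensor([[0.5]])
margin, p = 1.0, 1.0

d_ap = F.pairwise_distance(anchor, pos, p=p)  # distance anchor -> positive
d_an = F.pairwise_distance(anchor, neg, p=p)  # distance anchor -> negative
# max(d(a, p) - d(a, n) + margin, 0), averaged over the batch
manual = torch.clamp(d_ap - d_an + margin, min=0).mean()

builtin = nn.TripletMarginLoss(margin=margin, p=p)(anchor, pos, neg)
print(manual, builtin)
```

The loss only drops to zero once the negative is at least margin farther from the anchor than the positive is.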

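16. nn.HingeEmbeddingLoss

Function: measures the similarity of two inputs; commonly used for nonlinear embeddings and semi-supervised learning.
Note: the input x should be the absolute value of the difference between the two inputs.
Main parameters:

margin: the margin value.
reduction: the reduction mode, one of none/sum/mean.
Loss function: $l_{n}=\left\{\begin{array}{ll} x_{n}, & \text { if } y_{n}=1 \\ \max \left\{0, \Delta-x_{n}\right\}, & \text { if } y_{n}=-1 \end{array}\right.$ where $\Delta$ is the margin.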

nn.HingeEmbeddingLoss(margin=1.0,size_average=None,reduce=None,reduction='mean')

The following code shows what it does:

inputs = torch.tensor([[1., 0.8, 0.5]])
target = torch.tensor([[1, 1, -1]])

loss_f = nn.HingeEmbeddingLoss(margin=1, reduction='none')

loss = loss_f(inputs, target)

print("Hinge Embedding Loss", loss)

The output is:

Hinge Embedding Loss tensor([[1.0000, 0.8000, 0.5000]])
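
The two branches of the loss function map naturally onto torch.where. The sketch below reproduces the output above that way; it is a minimal check, with manual as an illustrative name.

```python
import torch
import torch.nn as nn

inputs = torch.tensor([[1., 0.8, 0.5]])   # distances x_n
target = torch.tensor([[1., 1., -1.]])    # 1: similar pair, -1: dissimilar pair
margin = 1.0

# x_n where y_n = 1, max(0, margin - x_n) where y_n = -1
manual = torch.where(target == 1, inputs, torch.clamp(margin - inputs, min=0))
builtin = nn.HingeEmbeddingLoss(margin=margin, reduction='none')(inputs, target)

print(torch.allclose(manual, builtin))  # True
```

For the third element the pair is labelled dissimilar and its distance 0.5 is below the margin, so the loss is 1 - 0.5 = 0.5.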

17. nn.CosineEmbeddingLoss

Function: measures the similarity of two inputs using cosine similarity.
Main parameters:

margin: may take values in [-1,1]; values in [0,0.5] are recommended.
reduction: the reduction mode, one of none/sum/mean.
Loss formula: $\operatorname{loss}(x, y)=\left\{\begin{array}{ll} 1-\cos \left(x_{1}, x_{2}\right), & \text { if } y=1 \\ \max \left(0, \cos \left(x_{1}, x_{2}\right)-\operatorname{margin}\right), & \text { if } y=-1 \end{array}\right.$ where $\cos (\theta)=\frac{A \cdot B}{\|A\|\|B\|}=\frac{\sum_{i=1}^{n} A_{i} \times B_{i}}{\sqrt{\sum_{i=1}^{n}\left(A_{i}\right)^{2}} \times \sqrt{\sum_{i=1}^{n}\left(B_{i}\right)^{2}}}$

nn.CosineEmbeddingLoss(margin=0.0,size_average=None,reduce=None,reduction='mean')

The following code shows nn.CosineEmbeddingLoss in action:

x1 = torch.tensor([[0.3, 0.5, 0.7], [0.3, 0.5, 0.7]])
x2 = torch.tensor([[0.1, 0.3, 0.5], [0.1, 0.3, 0.5]])

target = torch.tensor([1, -1], dtype=torch.float)  # 1-D target: one label per row of x1/x2

loss_f = nn.CosineEmbeddingLoss(margin=0., reduction='none')

loss = loss_f(x1, x2, target)

print("Cosine Embedding Loss", loss)

The output is:

Cosine Embedding Loss tensor([0.0167, 0.9833])
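
The same numbers can be obtained from F.cosine_similarity and the two-branch formula. The sketch below assumes a 1-D target with one label per row, which is the shape the PyTorch documentation describes for this loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x1 = torch.tensor([[0.3, 0.5, 0.7], [0.3, 0.5, 0.7]])
x2 = torch.tensor([[0.1, 0.3, 0.5], [0.1, 0.3, 0.5]])
target = torch.tensor([1., -1.])  # one label per row
margin = 0.0

cos = F.cosine_similarity(x1, x2, dim=1)  # cosine of the angle between each row pair
# 1 - cos for y = 1, max(0, cos - margin) for y = -1
manual = torch.where(target == 1, 1 - cos, torch.clamp(cos - margin, min=0))

builtin = nn.CosineEmbeddingLoss(margin=margin, reduction='none')(x1, x2, target)
print(manual, builtin)
```

Both rows hold the same pair of vectors, so the two losses add up to 1: the pair is penalized lightly when labelled similar and heavily when labelled dissimilar.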

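18. nn.CTCLoss

Function: computes the CTC (Connectionist Temporal Classification) loss, used for classifying sequential data.
Main parameters:

blank: the index of the blank label.
zero_infinity: whether to set infinite losses and their gradients to zero.
reduction: the reduction mode, one of none/sum/mean.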

torch.nn.CTCLoss(blank=0,reduction='mean',zero_infinity=False)
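
A minimal usage sketch, modelled on the shapes that the PyTorch documentation describes for this loss. All sizes below (T, N, C, S) are made-up values for illustration, and the inputs are random rather than the output of a real model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

T, N, C = 12, 2, 5  # input length, batch size, number of classes (index 0 is the blank)
S = 4               # target sequence length (hypothetical value)

# Log-probabilities over the classes at every time step: shape (T, N, C)
log_probs = torch.randn(T, N, C).log_softmax(dim=2)
# Target label sequences drawn from 1..C-1, so they never contain the blank
targets = torch.randint(low=1, high=C, size=(N, S), dtype=torch.long)
# Actual lengths of each input and target sequence in the batch
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), S, dtype=torch.long)

loss_f = nn.CTCLoss(blank=0, reduction='mean', zero_infinity=False)
loss = loss_f(log_probs, targets, input_lengths, target_lengths)
print("CTC loss:", loss)
```

Unlike the other losses in this post, CTCLoss takes four arguments: the lengths tell it how much of each padded sequence is real data.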

PyTorch: Loss Functions (Part 2)
