Python Exercise: Distance Measures Between Vectors

Author: 凱魯嘎吉 - cnblogs (博客園) http://www.cnblogs.com/kailugaji/

This post implements three common ways of comparing two vectors in Python:

1) Manhattan distance (the L1 norm of $x - y$): $d(x,y) = \sum\limits_{i=1}^{n} \left| x_i - y_i \right|$

2) Euclidean distance (the L2 norm of $x - y$): $d(x,y) = \sqrt{\sum\limits_{i=1}^{n} (x_i - y_i)^2}$

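3) Cosine similarity: $s(x,y) = \frac{x y^T}{\left\| x \right\| \left\| y \right\|}$, where $\left\| \cdot \right\|$ is the L2 norm. Unlike the two distances above, this is a similarity: larger values mean the vectors point in more similar directions.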

Here $x, y \in \mathbb{R}^{1 \times n}$ are row vectors.
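As a quick sanity check of the three definitions, here is a minimal sketch in plain Python, without PyTorch. The function names `manhattan`, `euclidean`, and `cosine_similarity` and the sample vectors are illustrative assumptions, not part of the original script.

```python
import math

def manhattan(x, y):
    # Sum of absolute coordinate differences (L1 norm of x - y)
    return sum(abs(a - b) for a, b in zip(x, y))

def euclidean(x, y):
    # Square root of the sum of squared differences (L2 norm of x - y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def cosine_similarity(x, y):
    # Dot product divided by the product of the two L2 norms
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

# Hypothetical sample vectors chosen so the results are easy to check by hand
x = [1.0, 2.0, 3.0]
y = [4.0, 6.0, 3.0]

print(manhattan(x, y))          # 3 + 4 + 0 = 7.0
print(euclidean(x, y))          # sqrt(9 + 16 + 0) = 5.0
print(cosine_similarity(x, y))  # 25 / sqrt(14 * 61), roughly 0.8555
```

The coordinate differences are (3, 4, 0), so the Manhattan distance is 7 and the Euclidean distance is 5, which matches the formulas term by term.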

1. loss_test.py

# -*- coding: utf-8 -*-
# Author: 凱魯嘎吉 Coral Gajic
# https://www.cnblogs.com/kailugaji/
# Python exercise: distance measures between vectors
# Computes, for two vectors:
# 1) Manhattan distance (L1 norm)
# 2) Euclidean distance (L2 norm)
# 3) Cosine similarity
import torch
import torch.nn.functional as F

# Hand-written distance functions
def compute_l1_similarity(e1, e2):  # L1 distance
    return torch.abs(e1 - e2).sum(-1)

def compute_l2_similarity(e1, e2):  # L2 distance
    # Note: the square root is taken here, so this is not the squared distance
    return ((e1 - e2)**2).sum(-1).sqrt()

def compute_cosine_similarity(e1, e2):  # cosine similarity
    # Normalise each row to unit L2 length
    e1 = e1 / torch.norm(e1, dim=-1, p=2, keepdim=True)
    e2 = e2 / torch.norm(e2, dim=-1, p=2, keepdim=True)
    # mul: element-wise product, summed over the last (feature) dimension
    similarity = torch.mul(e1, e2).sum(-1)
    # The line above can also be replaced with:
    # similarity = torch.mm(e1, torch.t(e2))  # mm: matrix product, torch.t: transpose
    # similarity = torch.diag(similarity)     # keep only the diagonal entries
    return similarity

torch.manual_seed(1)
n = 3  # number of samples
m = 5  # dimension of each sample
# Only the distance between the i-th sample of e1 and the i-th sample of e2 is computed.
# Distances between the i-th sample of e1 and the j-th sample of e2 (i != j) are not considered.
e1 = torch.rand(n, m)
e2 = torch.rand(n, m)
print('Original data:\n', e1, '\n', e2)
loss_l1_1 = torch.zeros(n)
loss_l2_1 = torch.zeros(n)
# Hand-written distance functions
loss_l1 = compute_l1_similarity(e1, e2)
loss_l2 = compute_l2_similarity(e1, e2)
loss_cosine = compute_cosine_similarity(e1, e2)
# Built-in PyTorch functions
for i in range(n):
    loss_l1_1[i] = torch.dist(e1[i], e2[i], p=1)
    loss_l2_1[i] = torch.dist(e1[i], e2[i], p=2)
loss_cosine_1 = F.cosine_similarity(e1, e2)
# In each print, the first result comes from the hand-written function
# and the second from the built-in PyTorch function.
# Each result has n values, one per sample.
print('Manhattan distance:\n', loss_l1, '\n', loss_l1_1)
print('Euclidean distance:\n', loss_l2, '\n', loss_l2_1)
print('Cosine similarity:\n', loss_cosine, '\n', loss_cosine_1)
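The comment inside compute_cosine_similarity mentions an alternative based on torch.mm and torch.diag. The sketch below, assuming the same seed and shapes as the script, compares the two variants on random data. The variable names `u1`, `u2`, `sim_rowwise`, and `sim_diag` are illustrative and do not appear in the original script.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(1)
e1 = torch.rand(3, 5)
e2 = torch.rand(3, 5)

# Normalise each row to unit L2 length
u1 = e1 / torch.norm(e1, p=2, dim=-1, keepdim=True)
u2 = e2 / torch.norm(e2, p=2, dim=-1, keepdim=True)

# Variant A: element-wise product, then sum over the feature dimension
sim_rowwise = torch.mul(u1, u2).sum(-1)

# Variant B: full n x n similarity matrix, then keep only the diagonal
sim_diag = torch.diag(torch.mm(u1, torch.t(u2)))

# Both variants should agree with each other and with the built-in function
print(torch.allclose(sim_rowwise, sim_diag, atol=1e-6))
print(torch.allclose(sim_rowwise, F.cosine_similarity(e1, e2), atol=1e-6))
```

Variant B computes all n x n pairwise similarities and then discards the off-diagonal ones, so Variant A is likely the cheaper choice when only matching rows are compared.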

2. Results

D:\ProgramData\Anaconda3\python.exe "D:/Python code/2023.3 exercise/loss/loss_test.py"
Original data:
 tensor([[0.7576, 0.2793, 0.4031, 0.7347, 0.0293],
        [0.7999, 0.3971, 0.7544, 0.5695, 0.4388],
        [0.6387, 0.5247, 0.6826, 0.3051, 0.4635]]) 
 tensor([[0.4550, 0.5725, 0.4980, 0.9371, 0.6556],
        [0.3138, 0.1980, 0.4162, 0.2843, 0.3398],
        [0.5239, 0.7981, 0.7718, 0.0112, 0.8100]])
Manhattan distance:
 tensor([1.5195, 1.4075, 1.1176]) 
 tensor([1.5195, 1.4075, 1.1176])
Euclidean distance:
 tensor([0.7873, 0.6938, 0.5498]) 
 tensor([0.7873, 0.6938, 0.5498])
Cosine similarity:
 tensor([0.8395, 0.9767, 0.9345]) 
 tensor([0.8395, 0.9767, 0.9345])

Process finished with exit code 0

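Note: these are distance measures between vectors, computed row by row, not matrix norms. In each pair of outputs, the first tensor comes from the function written from the definition and the second from PyTorch's built-in function. The two agree in every case.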
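The distinction between row-wise vector distances and a single norm over the whole matrix can be checked directly. The following is a minimal sketch under the same seed and shapes as the script. The names `l1_rows`, `l2_rows`, `l1_loop`, `l2_loop`, and `whole` are illustrative assumptions.

```python
import torch

torch.manual_seed(1)
e1 = torch.rand(3, 5)
e2 = torch.rand(3, 5)

# Row-wise distances: one value per sample, no Python loop needed
l1_rows = torch.norm(e1 - e2, p=1, dim=-1)
l2_rows = torch.norm(e1 - e2, p=2, dim=-1)

# The same distances computed one row at a time with torch.dist
l1_loop = torch.stack([torch.dist(a, b, p=1) for a, b in zip(e1, e2)])
l2_loop = torch.stack([torch.dist(a, b, p=2) for a, b in zip(e1, e2)])

print(torch.allclose(l1_rows, l1_loop))
print(torch.allclose(l2_rows, l2_loop))

# Passing the whole matrices to torch.dist yields a single scalar:
# the p-norm of all entries of e1 - e2 taken together
whole = torch.dist(e1, e2, p=2)
print(whole.shape)
print(torch.allclose(whole, l2_rows.pow(2).sum().sqrt()))
```

This is presumably why the script calls torch.dist once per row: called on the full (n, m) tensors it collapses everything into one number instead of n per-sample distances.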

