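Similarity Measures: A Collection

Original

寒夏12

2020-02-21 07:04

Background:

I have been writing my thesis recently, and it touches on the principles behind several similarity measures, so I am keeping notes on them here.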

Cosine similarity

Formula:
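For two vectors a and b of equal length n (for example, two users' ratings over the same n items), the standard definition, which is also what the code further down computes, can be written as:

```latex
\cos(\mathbf{a}, \mathbf{b})
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2} \, \sqrt{\sum_{i=1}^{n} b_i^2}}
```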

Principle:
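The measure follows from the geometric form of the dot product of two vectors a and b. Writing θ for the angle between them (a symbol introduced here only for illustration):

```latex
\mathbf{a} \cdot \mathbf{b} = \lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert \cos\theta
\quad\Longrightarrow\quad
\cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
```

The value therefore depends only on the direction of the vectors, not on their length: it is 1 when they point the same way, 0 when they are orthogonal, and -1 when they point in opposite directions. When every component is non-negative, as with ratings, it stays within [0, 1].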

Example:
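As a small illustration, here is a sketch with three made-up vectors. The names `a`, `b`, `c` and the helper `cosine` are hypothetical and chosen only for this example; the helper assumes neither vector is all zeros.

```python
import numpy as np

# Hypothetical vectors, e.g. ratings over the same three items.
a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])   # same direction as a, twice as long
c = np.array([3.0, 0.0, -1.0])  # orthogonal to a: 1*3 + 2*0 + 3*(-1) = 0

def cosine(u, v):
    """Cosine of the angle between u and v (both assumed non-zero)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

sim_ab = cosine(a, b)  # approximately 1: scaling a vector does not change the angle
sim_ac = cosine(a, c)  # 0: the dot product is zero
print(sim_ab, sim_ac)
```

Scaling `a` by 2 leaves the similarity at about 1, which shows that the measure ignores magnitude and looks only at direction.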

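Pros and cons:

These come from analysing how the formula is computed and comparing that with real-world situations. (I won't write them out here: the thesis goes through a plagiarism check, you know!)

Code: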

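import numpy as np

def cos(a, b):
    # Convert to float arrays so that lists and tuples are accepted too
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    mul = np.dot(a, b)  # numerator: dot product
    m = np.linalg.norm(a) * np.linalg.norm(b)  # denominator: product of the norms
    if m == 0:
        return 0.0  # undefined for an all-zero vector; 0.0 is returned by convention
    return mul / m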

Kendall similarity coefficient

Formula:
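For n items, take every pair (i, j) with i < j. A pair is concordant when (x_i - x_j)(y_i - y_j) > 0 and discordant when that product is negative. With C and D denoting the number of concordant and discordant pairs (symbols introduced here for illustration), the standard tau-a form is:

```latex
\tau = \frac{C - D}{\tfrac{1}{2}\, n (n - 1)}
```

The code in this section divides by the number of pairs it actually compares, which equals n(n - 1)/2 when no item is skipped, and it counts tied pairs as concordant, so its result can differ from the textbook value when ties or unrated items are present.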

Example:
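A minimal sketch of the pair counting, using two made-up rating lists with no ties and no zeros so that it agrees with the textbook definition. The lists `x` and `y` and the variable names are hypothetical.

```python
from itertools import combinations

# Hypothetical ratings of two users over the same four items.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

concordant = 0
discordant = 0
for i, j in combinations(range(len(x)), 2):  # every pair with i < j, once
    mark = (x[i] - x[j]) * (y[i] - y[j])
    if mark > 0:
        concordant += 1
    elif mark < 0:
        discordant += 1

n_pairs = len(x) * (len(x) - 1) // 2  # 6 pairs for 4 items
tau = (concordant - discordant) / n_pairs
print(concordant, discordant, tau)
```

Only the pair of items 2 and 3 is ranked in opposite order by the two users, so there are 5 concordant pairs and 1 discordant pair, giving (5 - 1) / 6, roughly 0.667.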

Code:

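def Kendall(x, y=None):
    cnt = len(x)
    if y is None:
        # Default: compare x against the natural ranking 1..cnt
        y = list(range(1, cnt + 1))
    print("Items rated by user 1:", x)
    print("Items rated by user 2:", y)
    print("Length of x (total number of items):", cnt)
    coeff = 0.0
    co_num = 0  # number of pairs compared
    for i in range(cnt):
        if x[i] == 0 or y[i] == 0:  # 0 means "not rated": skip the item
            continue
        # Start at i + 1 so that each pair is compared only once
        # (avoids computing x[1]-x[2] and then x[2]-x[1] again)
        for j in range(i + 1, cnt):
            if x[j] == 0 or y[j] == 0:
                continue
            co_num += 1
            mark = (x[i] - x[j]) * (y[i] - y[j])
            if mark >= 0:  # concordant pair; ties (mark == 0) are counted here too
                coeff += 1.0
            else:  # discordant pair
                coeff -= 1.0
    print("Concordant pairs minus discordant pairs:", coeff)
    print("Number of comparisons:", co_num)
    if co_num == 0:
        return 0.0  # no comparable pair; 0.0 is returned by convention
    # (concordant - discordant) divided by the number of comparisons
    kendall_sim = coeff / co_num
    return kendall_sim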

Jaccard

Formula:
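For two sets A and B (here, the sets of items that each of the two users has rated), the standard definition is:

```latex
J(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}
```

The right-hand form is the one that matches the `i / (i_p + i_q - i)` returned by the code in this section.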

Example:
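A short sketch with two made-up rating vectors, assuming (as the code in this section does) that a 0 means the user has not rated the item. The vectors `p` and `q` and the set names are hypothetical.

```python
# Hypothetical rating vectors over five items; 0 means "not rated".
p = [5, 0, 3, 0, 1]
q = [4, 2, 0, 0, 2]

# Indices of the items each user has rated.
rated_p = {idx for idx, r in enumerate(p) if r != 0}  # {0, 2, 4}
rated_q = {idx for idx, r in enumerate(q) if r != 0}  # {0, 1, 4}

common = rated_p & rated_q  # items rated by both users: {0, 4}
either = rated_p | rated_q  # items rated by at least one user: {0, 1, 2, 4}
sim = len(common) / len(either)
print(sim)  # 0.5
```

Note that the rating values themselves play no part: only which items were rated matters, which is the main difference from the cosine and Kendall measures.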

Code:

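def jaccard(p, q):
    i = 0    # number of items rated by both users
    i_p = 0  # number of items rated by user p
    i_q = 0  # number of items rated by user q
    for j in range(len(p)):  # len(p) is the total number of items; 0 means "not rated"
        if p[j] != 0:
            i_p += 1
        if q[j] != 0:
            i_q += 1
        if p[j] != 0 and q[j] != 0:  # item rated by both p and q
            i += 1
    print("Number of items rated by user 1: %d" % i_p)
    print("Number of items rated by user 2: %d" % i_q)
    print("Number of items rated by both users: %d" % i)
    union = i_p + i_q - i  # size of the union of the two rated-item sets
    if union == 0:
        return 0.0  # neither user rated anything; 0.0 is returned by convention
    return i / union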
