Recommender Systems with Surprise: Basic Algorithms, NormalPredictor

Original

Eric_keke

2018-09-01 19:17

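NormalPredictor assumes that the ratings come from a normal distribution and predicts a random rating drawn from that distribution.

Suppose we have a set of users' ratings of movies. The data form a sparse matrix with many empty entries, and our task is to predict those missing ratings.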
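To make the sparse matrix concrete, here is a minimal sketch that stores only the observed ratings and lists the empty cells that would need a prediction. The user names, movie titles, and rating values are hypothetical toy data, not taken from ml-100k.

```python
# Hypothetical toy data: only the observed (user, movie) cells are stored.
ratings = {
    ("alice", "Toy Story"): 5.0,
    ("alice", "Heat"): 3.0,
    ("bob", "Heat"): 4.0,
    ("carol", "Casablanca"): 2.0,
}

# Collect every user and every movie that appears at least once.
users = sorted({user for user, _ in ratings})
items = sorted({item for _, item in ratings})

# The empty cells of the user x movie matrix are the ratings to predict.
missing = [(u, i) for u in users for i in items if (u, i) not in ratings]

# Fraction of the matrix that is actually filled in.
density = len(ratings) / (len(users) * len(items))

print(len(missing), "missing cells, density", round(density, 3))
```

With 3 users, 3 movies, and 4 observed ratings, 5 of the 9 cells are empty.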

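We model the data as follows:
Assume every predicted rating r̂_ui follows a normal distribution N(μ̂, σ̂²), where μ̂ and σ̂ are estimated from the training set R_train: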

\begin{aligned} \hat{\mu} &= \frac{1}{|R_{train}|} \sum_{r_{ui} \in R_{train}} r_{ui} \\ \hat{\sigma} &= \sqrt{\sum_{r_{ui} \in R_{train}} \frac{(r_{ui} - \hat{\mu})^{2}}{|R_{train}|}} \end{aligned}
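The two estimates can be sketched by hand using only the Python standard library. This is an illustration of the formulas, not Surprise's own code: the helper names `fit_normal` and `predict_random`, the toy ratings, and the clipping of the draw to a 1 to 5 rating scale are all assumptions made for the example.

```python
import random
import statistics


def fit_normal(ratings):
    """Estimate the mean and the population standard deviation (divide by N)."""
    mu_hat = statistics.fmean(ratings)
    sigma_hat = statistics.pstdev(ratings, mu=mu_hat)
    return mu_hat, sigma_hat


def predict_random(mu_hat, sigma_hat, rng, low=1.0, high=5.0):
    """Draw one rating from N(mu_hat, sigma_hat^2), clipped to [low, high]."""
    draw = rng.gauss(mu_hat, sigma_hat)
    return min(high, max(low, draw))


# Hypothetical training ratings standing in for R_train.
train_ratings = [4, 3, 5, 2, 4, 3]
mu_hat, sigma_hat = fit_normal(train_ratings)

# Seeded generator so the sketch is reproducible.
rng = random.Random(0)
prediction = predict_random(mu_hat, sigma_hat, rng)

print(mu_hat, sigma_hat, prediction)
```

For these six toy ratings the mean is 3.5 and the variance is 5.5 / 6. Note that the prediction ignores both the user and the item, so it is only useful as a baseline.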

Next we use Surprise to implement this directly.

from surprise import Dataset
from surprise import NormalPredictor
from surprise.model_selection import cross_validate

# Load the built-in MovieLens 100k dataset
data = Dataset.load_builtin('ml-100k')

# Select the algorithm
algo = NormalPredictor()

# evaluate() is deprecated in favor of cross_validate():
# evaluate(algo, data, measures=['RMSE', 'MAE'])

# Test the model with cross-validation (reports RMSE and MAE per fold)
perf = cross_validate(algo, data, verbose=True)

print(perf)

Test results:

Evaluating RMSE, MAE of algorithm NormalPredictor on 5 split(s).

                  Fold 1  Fold 2  Fold 3  Fold 4  Fold 5  Mean    Std     
RMSE (testset)    1.5054  1.5228  1.5109  1.5185  1.5133  1.5142  0.0060  
MAE (testset)     1.2070  1.2233  1.2137  1.2216  1.2161  1.2163  0.0058  
Fit time          0.08    0.13    0.09    0.10    0.07    0.09    0.02    
Test time         0.16    0.19    0.18    0.18    0.13    0.17    0.02    
{'test_rmse': array([1.50536524, 1.5228481 , 1.51093185, 1.51848558, 1.51326424]), 'test_mae': array([1.20696218, 1.22329666, 1.21374827, 1.22158879, 1.21606117]), 'fit_time': (0.07692193984985352, 0.12638354301452637, 0.09193754196166992, 0.09979796409606934, 0.0673062801361084), 'test_time': (0.1640009880065918, 0.19167232513427734, 0.18456101417541504, 0.1820223331451416, 0.13440251350402832)}
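For reference, the two error measures in the table can be reproduced by hand. The sketch below computes RMSE and MAE over a few (true rating, predicted rating) pairs; the function names and the toy pairs are hypothetical and are not the ml-100k results shown above.

```python
import math


def rmse(pairs):
    """Root mean squared error over (true, predicted) rating pairs."""
    return math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))


def mae(pairs):
    """Mean absolute error over (true, predicted) rating pairs."""
    return sum(abs(t - p) for t, p in pairs) / len(pairs)


# Hypothetical (true, predicted) pairs; the errors are 1, -2, 0, 2.
toy_pairs = [(4.0, 3.0), (2.0, 4.0), (5.0, 5.0), (3.0, 1.0)]

print(rmse(toy_pairs))  # 1.5
print(mae(toy_pairs))   # 1.25
```

RMSE penalizes large errors more heavily than MAE, which is why it is the larger of the two numbers in every fold of the table.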
