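This prediction algorithm assumes that the rating data are drawn from a normal distribution.
Suppose we have a set of users' ratings of movies. The ratings form a sparse matrix with many empty entries, and our task is to predict those missing entries.
We model the data as follows: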
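Assume every predicted rating $\hat{r}_{ui}$ follows a normal distribution $\mathcal{N}(\hat{\mu}, \hat{\sigma}^2)$, where $\hat{\mu}$ and $\hat{\sigma}$ are the maximum-likelihood estimates computed from the training ratings $R_{train}$ (worked out by hand): $\hat{\mu} = \frac{1}{|R_{train}|} \sum_{r_{ui} \in R_{train}} r_{ui}$ and $\hat{\sigma} = \sqrt{\sum_{r_{ui} \in R_{train}} \frac{(r_{ui} - \hat{\mu})^2}{|R_{train}|}}$.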
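To make the model concrete, here is a minimal NumPy sketch of the idea. It is not Surprise's own code: the `observed` ratings are made up, the helper names `fit_normal` and `predict_normal` are hypothetical, and clipping predictions to a 1 to 5 scale is an assumption that matches MovieLens-style ratings.

```python
import numpy as np


def fit_normal(ratings):
    """Return the maximum-likelihood estimates (mu_hat, sigma_hat) of the observed ratings."""
    ratings = np.asarray(ratings, dtype=float)
    mu_hat = ratings.mean()
    # ddof=0 divides by N, which gives the maximum-likelihood standard deviation.
    sigma_hat = ratings.std(ddof=0)
    return mu_hat, sigma_hat


def predict_normal(mu_hat, sigma_hat, n, rng, low=1.0, high=5.0):
    """Draw n random predictions from N(mu_hat, sigma_hat^2), clipped to the rating scale."""
    raw = rng.normal(mu_hat, sigma_hat, size=n)
    return np.clip(raw, low, high)


# Hypothetical observed ratings (the non-empty cells of the sparse matrix).
observed = [5, 3, 4, 4, 1, 2, 5, 3]
mu_hat, sigma_hat = fit_normal(observed)

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
predictions = predict_normal(mu_hat, sigma_hat, n=5, rng=rng)
print("mu_hat:", mu_hat, "sigma_hat:", sigma_hat)
print("predictions:", predictions)
```

Note that the predictions ignore both the user and the item: every missing entry is filled with an independent random draw, which is why this method is mostly useful as a baseline to compare other algorithms against.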
Next, we use Surprise to implement this directly:
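from surprise import Dataset, NormalPredictor
from surprise.model_selection import cross_validate

# Load the built-in MovieLens 100k dataset (downloaded on first use)
data = Dataset.load_builtin('ml-100k')

# Select the algorithm
algo = NormalPredictor()

# evaluate() is deprecated and no longer importable in recent Surprise releases; use cross_validate() instead
# evaluate(algo, data, measures=['RMSE', 'MAE'])

# Test the model with 5-fold cross-validation, reporting RMSE and MAE
perf = cross_validate(algo, data, measures=['RMSE', 'MAE'], cv=5, verbose=True)
print(perf)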
Test results:
Evaluating RMSE, MAE of algorithm NormalPredictor on 5 split(s).

                  Fold 1  Fold 2  Fold 3  Fold 4  Fold 5  Mean    Std
RMSE (testset)    1.5054  1.5228  1.5109  1.5185  1.5133  1.5142  0.0060
MAE (testset)     1.2070  1.2233  1.2137  1.2216  1.2161  1.2163  0.0058
Fit time          0.08    0.13    0.09    0.10    0.07    0.09    0.02
Test time         0.16    0.19    0.18    0.18    0.13    0.17    0.02
{'test_rmse': array([1.50536524, 1.5228481 , 1.51093185, 1.51848558, 1.51326424]), 'test_mae': array([1.20696218, 1.22329666, 1.21374827, 1.22158879, 1.21606117]), 'fit_time': (0.07692193984985352, 0.12638354301452637, 0.09193754196166992, 0.09979796409606934, 0.0673062801361084), 'test_time': (0.1640009880065918, 0.19167232513427734, 0.18456101417541504, 0.1820223331451416, 0.13440251350402832)}
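The dictionary returned by `cross_validate` holds one score per fold, so the Mean and Std columns of the table can be recomputed from it. The sketch below copies the per-fold values from the output above into a stand-alone dictionary (the name `results` is only illustrative) so that it runs without Surprise installed.

```python
import numpy as np

# Per-fold scores copied from the printed results above.
results = {
    "test_rmse": np.array([1.50536524, 1.5228481, 1.51093185, 1.51848558, 1.51326424]),
    "test_mae": np.array([1.20696218, 1.22329666, 1.21374827, 1.22158879, 1.21606117]),
}

# Mean and (population) standard deviation across the five folds.
summary = {name: (scores.mean(), scores.std()) for name, scores in results.items()}

for name, (mean, std) in summary.items():
    print(f"{name}: mean={mean:.4f} std={std:.4f}")
```

An RMSE of about 1.51 on a 1 to 5 scale is large, which is expected from a predictor that draws ratings at random; the value is best read as a floor that any real recommender should beat.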