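Reading notes on the paper "Factorization Machines", written only to organize my own thinking.
As I see it, the essence of FM is: prediction = bias + weight1 × single variables + weight2 × interactions between variables.
The bias is a scalar, the single-variable weights form a vector, and each interaction weight is the dot product of two factor vectors.
Below are excerpts from the paper that I consider important, together with my own translations. My ability is limited, so if this is not to your taste, feel free to skip it.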
1 Advantages of FM
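- FMs can estimate parameters under very sparse data, where SVMs fail.
(FMs allow parameter estimation under very sparse data where SVMs fail.)
- FMs have linear complexity, can be optimized directly in the primal (no dual formulation is needed), and do not rely on support vectors as SVMs do. The interactions they model are comparable to those of an SVM with a polynomial kernel.
(FMs have linear complexity, can be optimized in the primal and do not rely on support vectors like SVMs.)
- FMs are general: they work with any real-valued feature vector, and simply by choosing how the input feature vector is defined they can mimic state-of-the-art models such as biased MF, SVD++, PITF, and FPMC.
(FMs are a general predictor that can work with any real valued feature vector. In contrast to this, other state-of-the-art factorization models work only on very restricted input data. We will show that just by defining the feature vectors of the input data, FMs can mimic state-of-the-art models like biased MF, SVD++, PITF, or FPMC.)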
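To make the "mimic biased MF" claim concrete for myself, here is a minimal sketch in plain Python. It assumes the degree-2 FM equation from the paper, and the helper names (`fm_predict`, `one_hot_user_item`) and the toy sizes are my own choices, not anything from the paper. When the feature vector has exactly one active user slot and one active item slot, the FM prediction collapses to global bias + user bias + item bias + dot product of the two factor vectors, which is the biased MF form.

```python
import random


def fm_predict(x, w0, w, V):
    """Degree-2 FM: bias + linear terms + factorized pairwise interactions."""
    n = len(x)
    y = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            y += dot * x[i] * x[j]
    return y


def one_hot_user_item(user, item, n_users, n_items):
    """Feature vector with exactly two non-zeros: one user slot, one item slot."""
    x = [0.0] * (n_users + n_items)
    x[user] = 1.0
    x[n_users + item] = 1.0
    return x


# Toy parameters (random values, purely illustrative).
random.seed(0)
n_users, n_items, k = 3, 4, 2
n = n_users + n_items
w0 = 0.5
w = [random.uniform(-1, 1) for _ in range(n)]
V = [[random.uniform(-1, 1) for _ in range(k)] for _ in range(n)]

user, item = 1, 2
x = one_hot_user_item(user, item, n_users, n_items)
fm_value = fm_predict(x, w0, w, V)

# Biased MF written out by hand: global bias + user bias + item bias + <p_u, q_i>.
u, i = user, n_users + item
mf_value = w0 + w[u] + w[i] + sum(a * b for a, b in zip(V[u], V[i]))

print(abs(fm_value - mf_value) < 1e-12)
```

The other models in the list (SVD++, PITF, FPMC) would need additional feature blocks in `x`; this sketch only covers the simplest case.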
2 The FM model equation
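$$\hat{y}(x) := w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle \, x_i x_j$$
where the model parameters are $w_0 \in \mathbb{R}$, $w \in \mathbb{R}^n$, $V \in \mathbb{R}^{n \times k}$.
$\langle \cdot, \cdot \rangle$ is the dot product of two vectors of size $k$: $\langle v_i, v_j \rangle := \sum_{f=1}^{k} v_{i,f} \, v_{j,f}$.
The row vector $v_i$ of $V$ represents the $i$-th variable with $k$ factors.
$k \in \mathbb{N}_0^+$ is a hyperparameter that defines the dimensionality of the factorization.
(A row $v_i$ within $V$ describes the $i$-th variable with $k$ factors. $k \in \mathbb{N}_0^+$ is a hyperparameter that defines the dimensionality of the factorization.)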
A 2-way FM (degree d = 2) captures all single-variable effects and all pairwise interactions between variables.
(A 2-way FM (degree d = 2) captures all single and pairwise interactions between variables.)
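- $w_0$ is the global bias.
- $w_i$ models the strength of the $i$-th variable. (Personally I think this is simply a weight; the paper says it "models the strength of the i-th variable".)
- $\hat{w}_{i,j} := \langle v_i, v_j \rangle$ models the interaction between the $i$-th and $j$-th variable. (Personally I think this is also simply a weight; the paper says it "models the interaction between the i-th and j-th variable".)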
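The three terms above can be sketched in plain Python. The naive pairwise sum costs O(k n^2); the paper also gives a reformulation of the pairwise term, $\frac{1}{2} \sum_{f=1}^{k} \left[ \left( \sum_{i} v_{i,f} x_i \right)^2 - \sum_{i} v_{i,f}^2 x_i^2 \right]$, which costs O(k n) and is where the linear complexity comes from. The function names and the toy numbers below are my own; treat this as a sketch for checking the identity, not a reference implementation.

```python
def fm_pairwise_naive(x, V):
    """Pairwise term as written in the model equation: O(k * n^2)."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total


def fm_pairwise_linear(x, V):
    """Same term via 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]: O(k * n)."""
    n, k = len(x), len(V[0])
    total = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        total += 0.5 * (s * s - s_sq)
    return total


def fm_predict(x, w0, w, V):
    """Global bias + per-variable weights + factorized pairwise interactions."""
    linear = sum(wi * xi for wi, xi in zip(w, x))
    return w0 + linear + fm_pairwise_linear(x, V)


# Toy example: n = 3 variables, k = 2 factors, one variable inactive (x[1] = 0).
x = [1.0, 0.0, 2.0]
w0 = 0.1
w = [0.2, 0.3, 0.4]
V = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]

naive = fm_pairwise_naive(x, V)    # only the pair (0, 2) contributes: 17 * 1 * 2
linear = fm_pairwise_linear(x, V)
prediction = fm_predict(x, w0, w, V)
print(naive, linear, prediction)
```

With sparse input, both sums in the linear form only need to run over the non-zero entries of `x`, so the cost depends on the number of active features rather than on `n`.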
3 Expressiveness of the FM model
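If $k$ is large enough, then for any positive definite matrix $W$ there exists a matrix $V$ such that $W = V \cdot V^t$. In other words, an FM can express any interaction matrix $W$ if $k$ is chosen large enough. On sparse data, however, a small $k$ is usually chosen: there is not enough data to estimate a complex $W$, and restricting $k$ gives the model better generalization.
(It is well known that for any positive definite matrix $W$, there exists a matrix $V$ such that $W = V \cdot V^t$ provided that $k$ is large enough. Nevertheless, in sparse settings, typically a small $k$ should be chosen because there is not enough data to estimate complex interactions $W$. Restricting $k$ - and thus the expressiveness of the FM - leads to better generalization and thus improved interaction matrices under sparsity.)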
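One way to see the existence claim for myself: for a symmetric positive definite $W$, a Cholesky factorization gives a lower-triangular $L$ with $W = L \cdot L^t$, so taking $V = L$ with $k = n$ works. The sketch below is plain Python; the example matrix and the helper names (`cholesky`, `gram`) are my own assumptions, and Cholesky is only one possible construction, not something the paper prescribes.

```python
import math


def cholesky(W):
    """Lower-triangular L with W = L * L^t, for symmetric positive definite W."""
    n = len(W)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][f] * L[j][f] for f in range(j))
            if i == j:
                L[i][j] = math.sqrt(W[i][i] - s)
            else:
                L[i][j] = (W[i][j] - s) / L[j][j]
    return L


def gram(V):
    """Matrix of all row dot products <v_i, v_j>, i.e. V * V^t."""
    n = len(V)
    return [[sum(a * b for a, b in zip(V[i], V[j])) for j in range(n)]
            for i in range(n)]


# A small symmetric positive definite interaction matrix (illustrative values).
W = [[4.0, 2.0, 0.6],
     [2.0, 2.0, 0.5],
     [0.6, 0.5, 3.0]]

V = cholesky(W)          # here k = n = 3 factors per variable
W_rebuilt = gram(V)

max_error = max(abs(W[i][j] - W_rebuilt[i][j])
                for i in range(3) for j in range(3))
print(max_error < 1e-9)
```

Note that the FM equation only uses the entries with $i < j$; the diagonal of $V \cdot V^t$ never enters the prediction.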