FM (Factorization Machines)

Reading notes on the paper "Factorization Machines", written only to organize my own thoughts.

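In my view, the essence of an FM is: prediction = bias + weight 1 × single variables + weight 2 × interactions between variables.
In the paper's notation, the bias $w_0$ is a scalar, the single-variable weights form a vector $w$, and the interaction weights are built from factor vectors $v_i$.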

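Below are the passages from the paper that I consider important, each with my own rendering followed by the original text. My ability is limited, so please bear with any mistakes.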

I. Advantages of FMs

- FMs can estimate parameters on very sparse data, where SVMs cannot.
(FMs allow parameter estimation under very sparse data where SVMs fail)

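- FMs have linear complexity while still modeling interactions, much like an SVM with a polynomial kernel. They can be optimized in the primal and, unlike SVMs, do not rely on support vectors.
(FMs have linear complexity, can be optimized in the primal and do not rely on support vectors like SVMs)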

- FMs are general: they work with any real-valued feature vector and can mimic state-of-the-art models such as biased MF, SVD++, PITF and FPMC.
(FMs are a general predictor that can work with any real valued feature vector. In contrast to this, other state-of-the-art factorization models work only on very restricted input data. We will show that just by defining the feature vectors of the input data, FMs can mimic state-of-the-art models like biased MF, SVD++, PITF, or FPMC.)
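
To make the last point concrete, here is a minimal NumPy sketch, assuming the 2-way model equation given in the next section. If the feature vector is just a one-hot user indicator concatenated with a one-hot item indicator, only one pairwise term survives and the FM prediction collapses to the biased MF form: bias + user bias + item bias + dot product of the two factor vectors. The sizes, the random parameters and the helper name `fm_predict` are made up for illustration.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """2-way FM prediction; pairwise terms taken from the strict upper triangle."""
    interactions = (V @ V.T) * np.outer(x, x)      # entry (i, j) = <v_i, v_j> x_i x_j
    pairwise = np.sum(np.triu(interactions, k=1))  # keep only i < j
    return float(w0 + w @ x + pairwise)

# Hypothetical toy setup: 3 users, 4 items, k = 2 factors.
n_users, n_items, k = 3, 4, 2
rng = np.random.default_rng(0)
w0 = 0.1
w = rng.normal(size=n_users + n_items)
V = rng.normal(size=(n_users + n_items, k))

# Feature vector for (user 1, item 2): two active one-hot entries.
u, i = 1, 2
x = np.zeros(n_users + n_items)
x[u] = 1.0
x[n_users + i] = 1.0

fm = fm_predict(x, w0, w, V)
# Biased matrix factorization written out by hand with the same parameters.
mf = float(w0 + w[u] + w[n_users + i] + V[u] @ V[n_users + i])
```

Here `fm` and `mf` agree up to floating-point error, because every other product $x_i x_j$ is zero.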

II. The FM model equation

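$\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n}\sum_{j=i+1}^{n} \langle v_i, v_j \rangle x_i x_j$
$w_0 \in \mathbb{R}$ , $w \in \mathbb{R}^n$ , $V \in \mathbb{R}^{n \times k}$ ,
$\langle \cdot, \cdot \rangle$ is the dot product of two vectors of size $k$: $\langle v_i, v_j \rangle = \sum_{f=1}^{k} v_{i,f} \cdot v_{j,f}$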

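A row vector $v_i$ of $V$ describes the $i$-th variable with $k$ factors.
$k \in \mathbb{N}_0^+$ is a hyperparameter that sets the dimensionality of the factorization.

(A row $v_i$ within $V$ describes the $i$-th variable with $k$ factors. $k \in \mathbb{N}_0^+$ is a hyperparameter that defines the dimensionality of the factorization)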
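
The model equation can be transcribed almost literally into code. The following is a minimal sketch with NumPy, not an efficient implementation; the function name `fm_predict_naive` and the toy values for $x$, $w_0$, $w$ and $V$ are invented for illustration.

```python
import numpy as np

def fm_predict_naive(x, w0, w, V):
    """2-way FM prediction, written directly from the model equation."""
    n = x.shape[0]
    y = w0 + w @ x                      # bias + single-variable terms
    for i in range(n):
        for j in range(i + 1, n):       # every pair with i < j
            y += (V[i] @ V[j]) * x[i] * x[j]
    return float(y)

# Toy example with n = 3 variables and k = 2 factors.
x = np.array([1.0, 0.0, 2.0])
w0 = 0.5
w = np.array([0.1, 0.2, 0.3])
V = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

y_hat = fm_predict_naive(x, w0, w, V)
```

With these values only the pair (1, 3) contributes, since $x_2 = 0$: the result is 0.5 + 0.7 + 2.0 = 3.2.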

An FM of degree 2 captures all single variables and all pairwise interactions between variables.
(A 2-way FM (degree d = 2) captures all single and pairwise interactions between variables)

$w_0$ is the global bias.
$w_i$ models the strength of the $i$-th variable (in my view this is simply its weight).
$\hat w_{i,j} = \langle v_i, v_j \rangle$ models the interaction between the $i$-th and $j$-th variables (again, in my view this is simply a weight).
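
The pairwise term looks quadratic in $n$, but the paper shows it can be rewritten as $\frac{1}{2}\sum_{f=1}^{k}\left[\left(\sum_{i} v_{i,f} x_i\right)^2 - \sum_{i} v_{i,f}^2 x_i^2\right]$, which costs only $O(kn)$. A small sketch that checks the two forms against each other on random data; the helper names and the random inputs are assumptions made for this example.

```python
import numpy as np

def pairwise_naive(x, V):
    """Sum over i < j of <v_i, v_j> x_i x_j, O(k n^2)."""
    n = len(x)
    return float(sum((V[i] @ V[j]) * x[i] * x[j]
                     for i in range(n) for j in range(i + 1, n)))

def pairwise_linear(x, V):
    """Same quantity via the O(k n) reformulation."""
    s = V.T @ x                      # shape (k,): sum_i v_{i,f} x_i
    s_sq = (V ** 2).T @ (x ** 2)     # shape (k,): sum_i v_{i,f}^2 x_i^2
    return 0.5 * float(np.sum(s ** 2 - s_sq))

rng = np.random.default_rng(0)
x = rng.normal(size=6)               # n = 6 variables
V = rng.normal(size=(6, 3))          # k = 3 factors

naive = pairwise_naive(x, V)
fast = pairwise_linear(x, V)
```

The two values match up to floating-point error, which is the sense in which FMs have linear complexity.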

III. Expressiveness of the FM model

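Assuming $k$ is large enough, for any positive definite matrix $W$ there exists a matrix $V$ such that $W = V \cdot V^t$. In other words, if $k$ is chosen large enough, an FM can express any interaction matrix $W$. On sparse data, however, a relatively small $k$ is usually chosen so that the model generalizes better.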

(It is well known that for any positive definite matrix $W$, there exists a matrix $V$ such that $W = V \cdot V^t$ provided that $k$ is large enough. Nevertheless, in sparse settings, typically a small $k$ should be chosen because there is not enough data to estimate complex interactions $W$. Restricting $k$, and thus the expressiveness of the FM, leads to better generalization and thus improved interaction matrices under sparsity)
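
A quick numerical illustration of this trade-off, as a sketch only. The matrix $W$ is randomly generated and made positive definite by construction. With $k = n$, a Cholesky factor reproduces $W$ exactly. With a small $k$, the top-$k$ eigenpairs (my choice of approximation here, not something the paper prescribes) give only an approximation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.normal(size=(n, n))
W = A @ A.T + n * np.eye(n)          # symmetric positive definite by construction

# k = n: the Cholesky factor V satisfies W = V V^T.
V_full = np.linalg.cholesky(W)
err_full = np.max(np.abs(V_full @ V_full.T - W))

# k = 2: keep the two largest eigenpairs; V V^T is now only a rank-2 approximation.
k = 2
eigvals, eigvecs = np.linalg.eigh(W)  # eigenvalues in ascending order
V_small = eigvecs[:, -k:] * np.sqrt(eigvals[-k:])
err_small = np.max(np.abs(V_small @ V_small.T - W))
```

`err_full` is at the level of rounding error, while `err_small` is clearly nonzero: restricting $k$ restricts which interaction matrices the model can express.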
