Federated Recommendation Systems: Balancing Personalized Recommendation with Privacy and Security

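{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the era of the intelligent internet, recommender systems of every kind surround our daily lives.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"From e-commerce and online video to news feeds, recommender systems have become a key technology of the intelligent era, giving every user a personalized experience. To recommend accurately, these systems collect large volumes of user behavior data. In general, the more data they collect, the more completely they understand users and content, and the more accurate their recommendations become. However, as laws and policies on data security and privacy protection are introduced and enforced, this data is usually kept apart to protect user privacy, scattered across institutions as “data silos.” How to make full use of data to keep improving results and deliver quality service, while staying reasonable and lawful, is the major challenge and top priority that recommender systems face today.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" ","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Federated learning is an important approach to the data privacy problem. When it meets recommender systems, can it offer a new way to improve personalized recommendation while keeping personal data private and secure?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/c4/c48825d67b444b7244e4aee3f91f30f7.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here we introduce a new concept: the federated recommendation system. It is an instance of federated learning applied to recommendation, and it offers an important way to address both privacy protection and data scarcity in recommender systems. In this article we first formally define the federated recommendation system and then discuss the categories and development of existing federated recommendation methods.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Source (citation): L. Yang, B. Tan, V. W. Zheng, K. Chen, and Q. Yang, “Federated recommendation systems,” in Federated Learning. Springer, 2020, pp. 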
225–23","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Paper link: ","attrs":{}},{"type":"link","attrs":{"href":"https://link.springer.com/chapter/10.1007/978-3-030-63076-8_16","title":null,"type":null},"content":[{"type":"text","text":"https://link.springer.com/chapter/10.1007/978-3-030-63076-8_16","attrs":{}}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Authors: Liu Yang, Ben Tan, Vincent W. Zheng, Kai Chen, Qiang Yang","attrs":{}}]}],"attrs":{}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"I. Definition of the Federated Recommendation System","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/07/07e63117186534ca8a692f47f3288008.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Definition. 
The goal of a federated recommendation system is to train a recommendation model collaboratively across multiple parties without any party directly accessing another's private data:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/66/66ce72537e48cefcf871b5efad5722e1.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We want the model trained through federated recommendation to outperform the model each recommender system could train locally on its own, and to come very close to the model obtained by simply pooling all parties' data for training with no regard for data privacy or security:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/86/86a75e444e358230953b75a5169c81f3.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"II. Categories of Federated Recommendation Systems","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":" ","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The goal of a recommendation algorithm is to uncover the relationships between users and content or items. Based on their characteristics, federated recommendation systems fall into three categories: horizontal, vertical, and transfer federated recommendation systems.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"1. Horizontal Federated Recommendation Systems","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Horizontal federated recommendation addresses collaboration between recommender systems whose participants share many of the same items or services but serve different user groups, for example federated modeling across video recommendation services in different regions.","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/de/debe3792f206a198192109aa0fcb9eb7.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"A representative application scenario for horizontal federated recommendation:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/9e/9ebd8cb1eff2518c4d5dbed4feb0adf3.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"As the figure shows, users want personalized video recommendations but do not want the recommender system to collect their private data. To protect user privacy, a video recommender system can keep the training data on the device and treat each user device, which holds that user's habits and ratings, as a participant in a federated recommendation system, meeting both the personalization and the privacy requirements.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"2. Vertical Federated Recommendation Systems","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"Vertical federated recommendation addresses how participants can jointly build a recommender system when they share many of the same users but hold different items or user features.","at
trs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/40/408f7ea271a863487259cba5b632750f.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"A representative scenario for vertical federated recommendation:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/48/480b3d5bb39d80032e1308c03ba873b7.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"As the figure shows, the participants in vertical federated recommendation can be different recommender systems, or a recommender system and a data provider: for example, a federation between a news recommendation service and a video recommendation service, or between a news recommendation service and a user data provider. The parties have many users in common, and a vertical federated recommendation system lets them build a better recommendation service without exposing either party's private data.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"3. Transfer Federated Recommendation Systems","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"Transfer federated recommendation addresses how participants can share experience to build a recommender system when they have few users and few items in common.","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/a1/a1ebcbb567196bb5850e2322bf1c6a1b.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A representative application scenario for transfer federated recommendation:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/91/91926313287d7b76a1cea
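e3b4caa1181.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the figure above shows, a book recommender system in region A wants to help a video recommender system in region B improve its recommendations. The two participants offer different services, but a transfer federated recommendation system can transfer similar user features between them, improving the region B video recommendation model while preserving privacy.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"III. Challenges and Future Directions for Federated Recommendation Systems","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" ","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In industry, a federated recommendation system involves more than federated recommendation algorithms; it also requires a comprehensive system design. We therefore discuss the challenges at two levels, algorithm and system. At the algorithm level, we focus on the difficulties that may arise when designing federated recommendation algorithms on top of the mainstream models in today's recommendation field. At the system level, we analyze the characteristics of different recommender systems and several key challenges in designing a federated one.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"1. Algorithm-Level Challenges","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"1) Federated Deep Models","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Nonlinear activation functions cause serious problems for deep recommendation models in the federated setting. Complex functions such as tanh and ReLU are not well supported by homomorphic encryption (HE), a limitation that severely restricts the use of deep models in federated recommendation systems.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"2) Federated Graph Models","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The main difficulty in federating graph-based models is protecting the privacy of the structural information in the graph. Graph-based recommendation models use the relationships between users and items to enrich the representations of both. Relational information is more complex than feature information, so the privacy it carries is harder to protect.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"3) 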
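Federated Reinforcement Learning Models","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The challenge in federating reinforcement learning models is to design better states, actions, and rewards that capture a user's immediate interests, while also deciding what the participants should share with one another. Although reinforcement learning plays an important role in recommender systems, its application to federated recommendation has not yet been studied thoroughly.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"2. System-Level Challenges","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"1) Design of Recall and Ranking","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The main system-level challenge is designing recall and ranking stages that provide real-time feedback and preserve privacy. A traditional recommender system runs these two stages in sequence to produce its final recommendations. Because it collects users' private data centrally, both stages are designed to run on the system's central server. With user privacy in mind, a federated recommendation system has to revise that design. We discuss two extreme cases. The first is server-side recall with participant-side ranking: each participant sends encrypted, “noisy” model parameters to the server; the server runs the recall stage and returns the top N recalled items to each participant; each participant then runs the ranking stage locally. This scheme can leak privacy, because the server knows the exact recall results. The second is participant-side recall and ranking: the server sends the attributes and content of all items to every participant, which then runs the entire recall and ranking process locally. This design does not leak user privacy, but it incurs heavy communication overhead and requires substantial local compute and storage. The rapid development of 5G in recent years can ease the communication cost to some extent.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"2) Communication Cost","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Communication cost is one of the main factors limiting the performance of federated learning. Because recommender systems have high-dimensional features and real-time requirements, communication cost becomes an especially serious problem in federated recommendation systems.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"3) Flexibility and Scalability","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the number of participants grows, designing model parallelism and model update scheduling schemes that keep the federated recommendation model converging becomes a challenge. Many federated learning systems use a synchronous client-server architecture, which does not scale flexibly. Recommendation services have millions of users, and too many participants connecting at once congests the network at the central server, so it is hard to guarantee that every participant takes part in the whole federated training process. The performance of the federated model suffers badly as a result.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"4) 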
Non-IID Data","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The “long tail” phenomenon is pervasive in recommender systems, which makes non-IID (not independent and identically distributed) data unavoidable in federated recommendation. Highly skewed non-IID data severely degrades the federated recommendation model, and accuracy falls further as the data distributions of the participants drift apart.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"5) Cooperation with Malicious Participants","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In practice, participants in a recommender system may well be untrustworthy. Such participants do not follow the commonly used assumption that participants and the central server are both semi-honest. They may misbehave during gradient collection or parameter updates, and the server itself may be malicious. In these situations honest participants risk having their private data leaked.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Related articles:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://www.infoq.cn/article/VmS3QHkOlDNa3ks7yNqJ","title":"","type":null},"content":[{"type":"text","text":"Exploring GPUs in Federated Machine Learning","attrs":{}}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://www.infoq.cn/article/iHaL9CagG9fY7algmpuy","title":"","type":null},"content":[{"type":"text","text":"Homomorphic Encryption and Its Applications in Machine Learning","attrs":{}}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://www.infoq.cn/article/X3WSEhlXs1uL9ue1kkWO","title":"","type":null},"content":[{"type":"text","text":"Up to 25x Faster! A Technical Look at RAT, a New Acceleration Scheme for Distributed Machine Learning Training","attrs":{}}]}]}]}