Alibaba's Intelligent Delivery Technology System for Targeted Advertising

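{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With the rapid digitalization of the economy, online advertising plays an increasingly important role in empowering merchants' marketing, helping consumers learn about goods and services efficiently, and monetizing commercial platforms. In the online advertising ecosystem, advertisers pay media to deliver marketing messages to target users. Advertisers usually want to maximize marketing results with limited resources. However, the complexity of the traffic environment and of the competition created by other participating ads, together with the huge combinatorial space of bid, target audience, ad slot, delivery time and other variables in a delivery strategy, makes computing and executing the optimal strategy very challenging. In this article, from the perspective of helping advertisers market better, we give a fairly systematic introduction to the continuously evolving intelligent ad delivery system that the Alimama targeted advertising team has built through ongoing technical innovation driven by advertisers' needs. It covers the algorithms and practical experience behind core capabilities including bidding under a budget constraint, bidding under multiple constraints, guaranteed-delivery contract bidding, sequential delivery bidding based on long-term value, and cross-channel intelligent delivery."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Background"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Today's talk has three parts. The first introduces the background of the advertising business. The second follows the history of how we have helped advertisers market better and explains how the intelligent delivery system and its technology evolved. The last summarizes that evolution and looks ahead."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1. Overview of the advertising business"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/26\/26aa93b47e424076141170931fcaed16.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, what is advertising? I abstract it into three parties: the advertiser, the media (or ad platform), and the user. The logic of advertising is that advertisers pay the media, and the media delivers marketing messages to target users to influence them to buy the advertisers' products. This creates a flow of information and money, and also a flow of goods between advertisers and users."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"An ad platform acts as an important bridge in the middle. On one hand, it must offer advertisers enough features to express their demands, and advertisers must pay the media in a suitable way."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On the other hand, the media must decide whom to show each ad to, based on the strategy the advertiser has set. So the media, as the bridge, has to design the mechanism for connecting with advertisers, the mechanism for selling to advertisers, and the delivery strategy used to show advertisers' creatives to users."}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/11\/1196385f1fa5b2fbf42dc9e571f1cfdc.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Why is such a mechanism needed? Because traffic comes in many types. It can be sold by click or by impression. Ads can be sold to brand advertisers or to performance advertisers. And when traffic is sold, how should it be charged?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"All these questions need to be thought through, because only then can the business model be settled and fit the overall needs of the current market."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There are two angles to consider. One is the media's own business: the media has to think about its own development and design mechanisms that suit it, including the commercialization rate, how ads are inserted, the layout of ad slots, bids, premiums, and so on. The other is the mechanism on the advertiser side. How does the media price for advertisers? Do advertisers bid manually, or do they sign a contract and let the platform compute bids for them? Both angles involve many concepts that must be considered before the business model can be settled."}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/d4\/d45a83d0e46223e8160221f06a17a147.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Beyond the mechanism, we also have to think about how the platform helps advertisers choose a suitable delivery strategy."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Scenario 1: CPM campaign"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Some customers say: I want to bid a CPM of 50 yuan and run for 3 days in the first slot of the Guess You Like feed on the Mobile Taobao home page, with no budget cap. Here the ad is bought by bidding on page-view (PV) impressions, and billing is also per PV."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Scenario 2: contract campaign"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Some advertisers want to buy 1,000 impressions within one day for a total price of 60 yuan. They do not care about individual pieces of traffic, only about the overall spend and outcome. Under this setup the platform must guarantee both the advertiser's spend and the delivered result. This is a contract campaign."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Scenario 3: sequential delivery campaign"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This is one of our innovations this year. Some customers want to run a campaign that may last a long time and may expose a user repeatedly, with a sustained effect, and they hope this sustained influence will optimize the cumulative number of conversions. Such a strategy does not necessarily maximize short-term return; it pursues the maximum long-term cumulative return."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So to meet different advertisers' demands, the platform needs to be able to offer different delivery strategies and provide a systematic set of tools to serve advertisers well."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2. 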
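History of the advertising business"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/09\/09917aa0290ac97aa275081473025d1b.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Alimama's advertising system has gone through a long period of accumulation and iteration. Starting in 2016, the selling mechanism and delivery mode evolved from plain CPM\/CPC ads to OCPM\/OCPC, which estimates value at the granularity of individual traffic and adjusts the bid for each piece of traffic based on that estimate. To make advertisers' results more certain, we then added guaranteed-delivery contracts. CPC, CPM, OCPC and OCPM still rely on the advertiser bidding manually, which carries an operating cost. On top of this we upgraded again: the advertiser only states an overall spend and an expected result, and the platform bids automatically on every piece of traffic according to that expectation. This gave us BCB (Budget Constrained Bidding), automatic bidding under a budget constraint, and MCB (Multi-Constrained Bidding), bidding under multiple constraints."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This year we have a new piece of work called Multi-Channel Sequential Budget Constrained Bidding. It works across channels and influences users through sequences of ad exposures, maximizing results under a budget constraint. So the capabilities have been upgraded continuously. The optimization objectives have evolved too: from the earliest generic impressions\/clicks, to purchases\/ROI, to e-commerce actions such as add-to-cart\/favorite\/follow and arbitrary post-hoc objectives, and from short-term value to arbitrary long-term value. Our major work in this area has also been published at international conferences, including KDD, CIKM and ICML."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3. 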
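Summary of technical capabilities"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/59\/59d9da4ea89e0686cb9250fc0f682bc8.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here is a summary of the capabilities we have accumulated. In the earliest days, advertisers simply bid per PV or per click. Such a bid treats all traffic equally: for example, every PV is worth 1 yuan and every click is worth 0.5 yuan. After the OCPM\/OCPC upgrade, the value of each PV or click can be estimated in real time. For example, if the advertiser values a click at 0.5 yuan but we estimate this click to be worth more, we can raise 0.5 yuan to, say, 0.7 yuan; if the click is worth less, we can lower it to 0.3 yuan. This gives value estimation and optimization at traffic granularity. The next upgrade, to BCB, also gives certainty about the result in advance. Traffic fluctuates a lot, especially during big promotions such as Double 11 or 618, when advertisers have strong requirements on sales KPIs and guaranteed outcomes: on that day they must spend this money and obtain this much traffic. By understanding the patterns of past and future traffic, we can help advertisers secure their results, which meets their demand. Automatic bidding under a budget constraint upgrades advertisers from manual transmission to automatic. Automatic bidding under multiple constraints means that, besides the budget constraint, some advertisers do not want the cost of a single PV or click to be too high, so they require, for example, that a click cost less than 1 yuan or a PV less than 0.1 yuan. Such extra constraints can now be expressed on our platform. Once they are expressed, we satisfy all of them and deliver the campaign with a corresponding automatic bidding capability."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In 2019 we also launched joint optimization of contracts and the RTB (real time bidding) environment. A contract guarantees the certainty of the advertiser's result in advance. On many ad platforms, however, contracts are isolated from real-time bidding: contract traffic is sold as agreed, while RTB traffic is sold by the outcome of a real-time auction. This separation does not follow market-economy logic. It forces the two modes apart in a planned-economy manner and optimizes each separately. The better, market-based approach is to mix contracts and RTB directly and let the market adjust itself toward a global optimum."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This brings big challenges too. RTB fluctuations inevitably affect contracts in uncertain ways, so can a contract still guarantee the advertiser's result? And once contracts and RTB are mixed, can overall revenue be globally optimized? These are challenging questions."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This year we proposed sequential delivery to optimize long-term value. Many users see an ad more than once. Yet many ad platforms apply frequency capping: a user may see an ad only once, and any further exposure is bluntly blocked. Our statistics show, however, that in e-commerce many users do not buy the first time they see an ad but may buy after the second exposure, and possibly more than 50% of purchases happen after the second or third exposure. This raises a question. For some users a single exposure is reasonable, because if they are not interested now they will not be later. For others, a few more exposures strengthen their awareness and lead to the best cumulative purchase outcome. So this year we launched sequential delivery to mine users' long-term value. When deciding whether to show an ad, the system no longer looks only at maximizing the ECPM of this single exposure; it looks at whether the whole long-term sequence of interactions with the user is optimal, and optimizes delivery from the perspective of long-term value."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Finally, this year we also have an important product called AI智投 (AI smart delivery), which addresses a major pain point for advertisers. Advertisers now have many marketing platforms available: brand ads, search ads, and targeted ads. Traditionally, an advertiser could only split the ad budget across these three channels by past experience, then optimize audience and bid settings within each channel. But how should the channels be optimized as a whole? This is an important problem too, so we are moving from single-channel optimization to global optimization across multiple channels. In the section on the intelligent delivery system and its technology below, I will briefly introduce each of these upgrades in turn. Today we focus on the core ideas behind the algorithms. Most of this work has been published, so for the mathematical details please refer to the papers."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Intelligent delivery system and technology"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1. 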
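OCPC\/OCPM"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"① Advertiser demand"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"An advertiser says: “I am used to CPC. Can you do something more for me on top of that?”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On that basis we can offer Optimized Cost Per Click (OCPC)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"② Optimization approach"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/84\/84057157d66f5d4001c299cb9095bcdc.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, an advertiser's CPC bid treats all clicks equally, yet different clicks have different values, so a uniform bid is not reasonable."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Second, advertisers sometimes have multi-dimensional goals. They want not only more clicks but also value from post-click conversions, such as purchases or new followers, and they want those optimized too."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Based on these two points, OCPC uses the system's capabilities to further optimize customer value within the ad auction mechanism. How does it do that? In the GSP auction, the score traditionally used to rank ads is called ecpm. For CPC ads it is computed as ecpm = org_bid (the advertiser's original bid) * pctr (the predicted click-through rate of this piece of traffic), and ads are ranked by this score."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Under an incentive-compatible mechanism, the advertiser's original bid is based on the average value of traffic over a period of time. An advertiser may start a campaign at 10 a.m., judge that a click is worth 0.9 yuan on average over the 12 hours from 10 a.m. to 10 p.m., and therefore bid 0.9 yuan. This is an average assessment over a period of traffic, and it clearly has a flaw. On average the bid does match the value of the traffic, but the value of an individual piece of traffic to the advertiser may differ. For example, if the advertiser cares about post-click conversion, some clicks lead to high conversion and others do not, so different clicks have different values and should get different bids. The main cause of the flaw is that advertisers cannot assess individual traffic: they cannot get inside our system to observe the value of each piece of traffic, and this cannot be done by hand. The fix is to adjust the advertiser's original bid into what we call optimized_bid, which multiplies the original bid org_bid by a factor pValue\/baseValue. The numerator pValue is the value of a single piece of traffic, produced by machine learning models in our system that sense user and ad information in real time. Because it is the numerator, a high pValue means high-value traffic and the price can be raised a little; a low pValue means low-value traffic and the price can be lowered a little. pValue needs an anchor, which is the denominator baseValue. baseValue is estimated from historical data as the average value of the traffic the advertiser has specified, for example the average conversion rate or average follow rate over all its past traffic. Both pValue and baseValue are post-click values, and the customer has to state on the platform which one to optimize, for example conversions, favorites, or follows. With org_bid adjusted by pValue and baseValue in this way, our new ECPM ranking mechanism can rank by optimized_bid."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"③ Results"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this setup the advertiser and the platform both win. For the advertiser, the return rises under the same budget, so ROI improves: previously everything was bought indiscriminately, whereas now the best traffic can be picked. It also gives the advertiser multi-dimensional value optimization, such as optimizing conversions or follows."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For the platform, once advertisers' ROI improves, they see a higher return on investment here and are inclined to invest more budget to gain more return. This stronger willingness to spend brings revenue growth to the platform. OCPM follows a very similar idea to OCPC, and this is the core concept behind both."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2. 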
BCB (Budget Constrained Bidding)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"① Advertiser demand"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Once an advertiser has bid manually, how can we help refine the value of that manual bid down to the granularity of individual traffic?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Some advertisers say that assessing average value themselves carries a high operating cost. Advertisers often have many products and must open many campaigns, each with many targeted audiences or purchased keywords, and adjusting the price for every audience or keyword is very costly."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So can an advertiser simply commit to spending a given amount and have the platform reach the best outcome for it? For example: “I can spend 1,000 yuan a day. Can you get me as many followers as possible?” The advertiser does not want to care about how exactly to bid and lets the platform bid automatically."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"② Optimization approach"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To meet this need we launched a product called Budget Constrained Bidding, which bids automatically under a budget constraint. Here is a vivid example to show the idea."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/00\/00518427675aeb0510ea4f0c19c30b4c.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Suppose the goal is to obtain as many clicks as possible under the advertiser's budget of 6 US dollars. The horizontal axis is time: the campaign starts at 0:00 today and ends at 24:00, and during these 24 hours the blue dots are the traffic the advertiser can compete for. The market clearing prices of these blue dots differ: some are 1 dollar, some 2 dollars, some 3 dollars. To get more clicks, that is, more blue dots, how should the advertiser bid? A “little robot” inside the ad engine follows its algorithm along the time axis and decides, for each piece of traffic, whether to take the blue dot or not."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The robot sees that the first piece of traffic has a clearing price of 1 dollar and considers it very good value, because 1 dollar buys a click. So it takes it, and the budget drops from 6 dollars to 5."}]}]}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/c9\/c9f0db8268a038f790d40a3006e13889.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/43\/433509f3651d2fe5452a660b3e8921d2.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/02\/02c286bccc54c08f641f38b9443f3214
.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next it meets the second piece of traffic. This click costs 3 dollars, which is poor value, so it decides to skip it. Red means not won."}]}]}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/dc\/dc8569bb69387c05f4b83e01632ceef7.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/13\/13566bb7c917f9823bf09a4520e30a91.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Then it sees a click priced at 2 dollars, finds the value acceptable, and takes this one too."}]}]}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/e0\/e0275a4ddc5174846eefb2d32646741a.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/43\/4326b95e3cb3b413255b17a49c0cef9d.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/c6\/c604aaa5a50b9207e005c1167ad54e22.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The budget shrinks further and only 3 dollars remain. Another low-cost piece of traffic comes along, and it is taken."}]}]}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/c7\/c7f37765bc4a42c577a7ebb2182f3024.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/a4\/a42fb9a20c3e1cb74f893a550b0eb845.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/f0\/f052116a03d1cb14d65a8bc0a6b87b82.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The fifth blue dot is expensive, so it is skipped."}]}]}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/b7\/b7eb7ffb6e4a224e86ae4e3e29421f31.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass"
:true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/a6\/a614977b6a8f2e4f503774cf80d69cae.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Then the sixth blue dot is taken."}]}]}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/6f\/6f01d2a0a42eb464ab8766c5c2c9499b.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/2f\/2f393a458fe25dce4146ee4a36c5ac76.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/c7\/c72a86680a88f13c0bd1d1e9a3bbc662.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The last one is skipped."}]}]}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/50\/50a43b6ee16d6be352794540e4266f10.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"paste
Pass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/87\/8720b37de574655573f10562a246d604.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At this point all of today's traffic has passed and the budget is used up."}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/d5\/d5930f016f83b0df64aced05c6f928cf.jpeg","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Looking at the result, the algorithm took every 1-dollar and 2-dollar click and dropped every 3-dollar click. In terms of cost per click, it obtained all the traffic with the best value for money. That is the concept of budget-constrained bidding. So how is the algorithm actually designed?"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/f9\/f98299677d0e8ebf87108c880c2b08ca.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First we formalize the problem: given a budget b, obtain as much value as possible. Here, for example, we want the most clicks, so in auction-based advertising the system has to place a bid on every piece of traffic such that the final objective is maximized. The b"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":" under maximize is the bid on each piece of traffic (i ranges over all traffic from hour 0 to hour 24). The “1” here is an indicator function, and c"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":" is the market cost of this piece of traffic (that is, its market clearing price 
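). If the bid b"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":" is higher than the market clearing price, the traffic is won and the indicator takes the value 1. If the bid is lower than the clearing price, the traffic is not won and the indicator is 0. The v"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":" next to it is the value of this piece of traffic. If i is defined at PV granularity, this is a model of CPM advertising: each indicator equal to 1 means the PV impression was won, and v"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":" can be understood as the value of the impression. For example, if v"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":" measures click value, then v"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":" is 1 if the user clicks and v"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":" is 0 if not. The constraint below works the same way. For every piece of traffic that is won, c"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":" is its market clearing price, and the sum of the clearing prices of all won traffic must not exceed the total budget. So this is the model of value maximization under a budget constraint. If we treat the indicator as a variable, say x (x can be 0 or 1), it is in fact a linear optimization problem."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On the optimization side, the reference by Weinan Zhang (張偉楠) analyzes the problem in continuous space and arrives at the optimal bidding formula b"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":" = v"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":"\/λ"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":". The analysis treats this as the primal problem, converts it into a dual problem with the Lagrangian method, and analyzes the dual to obtain the optimal bid, which equals the value of the individual piece of traffic divided by a fixed parameter λ"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":". Note that over the 24 hours from 0:00 to 24:00, λ"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":" is one fixed value for all traffic, while v"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":" is the value of each individual piece of traffic. The intuition is: for any incoming traffic, if its value is high I bid high, and if its value is low I bid low, so the bid is strictly proportional to the value. The coefficient of this linear relationship is 1\/λ"},{"type":"sup","content":[{"type":"text"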
,"text":"*"}]},{"type":"text","text":". λ"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":" can be solved for directly from historical data as a linear problem. Solving it is easy and directly yields an optimal λ"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":", but reality is more complicated. The figure below shows that, when ads are actually delivered, traffic fluctuates strongly within the 24 hours of a day, and bids also fluctuate strongly from hour to hour. Comparing the red and blue lines, the differences from day to day are large as well. So although planning software can solve for λ"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":" from historical data, that value may not still hold tomorrow. Solving for λ"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":" therefore has to combine past data with real-time data from the current day, typically by feedback control or by reinforcement learning."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"What does λ"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":" mean? It is in fact a threshold on value for money. b"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":" is the bid, and under second-price charging every piece of traffic whose market clearing price is below b"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":" is won. At the margin, λ"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":" = v"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":"\/c"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":", where v"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":" is the value of the traffic and c"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":" is its clearing cost, so v"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":"\/c"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":" is simply value for money. Traffic whose value for money exceeds the threshold λ"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":" (each blue dot is one piece of traffic) is won, because its c"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":" is necessarily lower than b"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":". Traffic whose value for money falls below λ"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":" is dropped, because its c"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":" is higher than b"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":". So bidding by the formula b"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":" = v"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":"\/λ"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":" amounts to winning the part of the day's traffic with the best value for money and dropping all the traffic with poor value for money. Whether you use feedback control, such as the familiar PID algorithm, or reinforcement learning, the core is finding λ"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":": the best λ"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":" has to be found reliably in a constantly fluctuating environment."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/e5\/e5bc752b6d8ad3a36f85935ebaa0c953.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Di Wu (吳迪) on our team published a paper in 2018 that uses reinforcement learning to handle the traffic fluctuation described above. In the reinforcement learning model, consumer traffic keeps sending requests to the ad engine, which forms an uncertain interactive environment. The model includes the agent, which is our ad algorithm. The state describes the ad campaign (for example the current budget, the historical average CTR\/CVR, the remaining unspent budget, and so on 
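). The agent's action is to adjust λ. The DQN can be designed with two output heads: one raises λ by α"},{"type":"sub","content":[{"type":"text","text":"t"}]},{"type":"text","text":" and the other lowers λ by α"},{"type":"sub","content":[{"type":"text","text":"t"}]},{"type":"text","text":". You can of course also use A2C, A3C and so on, with the action output designed as a continuous function."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In short, the action adjusts λ, and the whole model is model-free. A time step can be set to 15 minutes, and the total value accumulated within a time step, such as all clicks or all purchases, is the reward."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Under this reinforcement learning setup we built an offline training platform. On it we can replay past logs and train models such as DQN or A3C within this formulation. Online, we deploy the trained model."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In practice a problem appears. Under a budget constraint, when the agent explores, bidding a little higher can yield a very large value in some time step. This is obvious: a higher bid wins more traffic and therefore more value. But because of the budget constraint, the algorithm can be short-sighted. While it has not yet hit the budget cap, it grabs traffic as hard as it can because the immediate reward looks high. Partway through the day it finds the budget is gone and it can win no more traffic, so it has explored its way to a suboptimal path. It is very hard for reinforcement learning, exploring the whole sequence on its own, to discover a policy or path that is frugal early on and ends up spending the budget exactly by the end. The reward therefore has to be designed cleverly."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Colleagues in our group proposed a method that works well in practice and is also sound in theory. It ties the reward obtained at each step to the overall reward of the whole sequence. Here RT is no longer the outcome of a single 15-minute time step; it reflects the reward of the entire sequence. The method, called reward shaping, also comes with a theoretical proof: the optimal policy obtained under the shaped reward is strictly identical to the optimal policy of the original, strictly defined reinforcement learning formulation. This reward therefore greatly speeds up convergence."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"③ 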
效果評估"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/cb\/cb704609de109fbaf4f12d4f5f977c06.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們可以看到這張曲線圖,底下紅線就是按照最原始的reward去訓練,訓練了很多time step,它的reward一直處於很低的水平,收斂得非常慢。但是如果加了我們的reward shaping之後,它的reward就上升得非常快,收斂得非常好。然後左邊這張圖,我們用基於強化學習捕捉流量變動,出價變動,天級別之間變動的方法,叫做DRLB(Deep RL Bidding),加粗的線也比其他傳統的方法要好。這些傳統的方法包括FLB(fixed linear bidding),即用過去的歷史日誌得出當時認爲最優的λ"},{"type":"sub","content":[{"type":"text","text":"0"}]},{"type":"text","text":"並固定使用,還有一些基於啓發式規則的方法,比如BSLB等等。用強化學習去做能得到一個非常好的效果。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3. MCB"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"① 廣告主需求"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"剛纔提到的是預算約束下的情況。廣告主除了希望預算約束,還希望點擊成本也不要太高。比如說:“我每天可以花2000元,能幫我買到儘量多的成交嗎? 
P.S.根據經驗我希望點擊成本不超過1.5元”。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"② 優化方案"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/3e\/3e73d3e08ed4b60c84c7d1fbf10790f7.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在這個問題裏面,我們的建模和剛纔類似,目標函數還是要最大化整個流量的value,約束裏面第一項還是預算的約束。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"第二項就是某種成本下的約束。這裏的分子是所有流量的最後成交價格累積起來,相當於是總的投入(即總的消耗),分母的w"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":"就是某種單位。比如說這個裏面是PCTR,分子是總消耗,分母是總點擊量,得到的就是一個平均的點擊成本。廣告主希望平均一個點擊,成本不能超過1元錢,這個約束就會讓平均點擊成本不會超過c"},{"type":"sub","content":[{"type":"text","text":"w"}]},{"type":"text","text":"。該問題也用對偶的辦法,可以分析出來它的最優的出價形式如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/a1\/a164c518c091e075cab00a8111c5726e.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們團隊的楊迅同學,在KDD發表了一篇論文給出了這個對偶公式的推導,大家可以去參考。公式的第一項乘以v"},{"type":"sub","content":[{"type":"text","text":
"i"}]},{"type":"text","text":"的,關注的是這條流量的價值。第二項是w"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":"*C"},{"type":"sub","content":[{"type":"text","text":"w"}]},{"type":"text","text":",關注的是這條流量出價對於平均成本的影響。所以這裏面的拉格朗日的對偶的變量p"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":"、q"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":",它就起到了去權衡這兩個項目的作用。p"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":"、q"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":"一旦求得以後,那麼就在一天裏面都是固定不變的,它的解法也可以包括兩種。一種是基於強化學習的解法,跟剛纔我們提到的很類似,可以把一些reward shaping的思路借鑑進來,只不過在這個裏面要把第二個約束也考慮進來。可以在reward裏面把底下平均成本的約束PCW也加進來,如果最後違反了平均點擊成本,就給一個懲罰,如果沒有違反,就不給懲罰。通過這種對reward的干預,實現多約束的情況下的強化學習求解。還有一種方式就是基於反饋控制的辦法。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"經過我們的分析,我們發現p"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":"和q"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":",對於預算約束和平均點擊成本的約束的影響,是可以解耦控制的,即可以分別控制p"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":"和q"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":"從而分別去影響預算和平均成本,在理論上證明是可行的,然後就可以獨立地去搭兩個PID做控制。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"前面講的都是幫助廣告主在多約束下優化某個目標(優化,就意味着不能保證結果)。除此之外還有一類投放策略,既能滿足廣告主的預算約束,還能保證最後給多少曝光或者多少點擊,就是合約保量。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"比如廣告主說:"},{"type":"text","marks":[{"type":"strong"}],"text":"“我要以60元總價格在1天內購買手機淘寶首猜1坑1000個
展現。我們可以事先簽訂合同,但完不成要給我賠償!我很好奇你們是怎麼分配流量的?”"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/bd\/bd172abe7d5d8f1747dd981e6f8619c5.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"對於整個平臺來說,我們希望把合約保量的計劃跟RTB的計劃能夠混合起來,讓他們統一在市場經濟下,去競爭做優化。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"那麼這個問題是怎麼定義的?我們這裏優化的目標包含三項,第一項R"},{"type":"sub","content":[{"type":"text","text":"GC"}]},{"type":"text","text":"就是合約保量的目標,第二項Q"},{"type":"sub","content":[{"type":"text","text":"GC"}]},{"type":"text","text":"就是合約保量拿到流量的平均質量,第三項R"},{"type":"sub","content":[{"type":"text","text":"RTB"}]},{"type":"text","text":"是市場實時競價下的收益,我們希望把這三項加起來聯合去做優化。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這裏R"},{"type":"sub","content":[{"type":"text","text":"GC"}]},{"type":"text","text":"是合約保量的收益是怎麼定義的?下面我們給出式子:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/56\/568ace4f4f1bd82908a8b8275f1144d3.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"c是保量的單位花費,d是要保多少量的需求。所以c"},{"type":"sub","content":[{"type":"
text","text":"j"}]},{"type":"text","text":"*d"},{"type":"sub","content":[{"type":"text","text":"j"}]},{"type":"text","text":"按道理說就是你保量能夠達到的總收入的上限。但是你如果保不住量怎麼辦?這裏p"},{"type":"sub","content":[{"type":"text","text":"j"}]},{"type":"text","text":"就是缺量的單位懲罰,即每缺一個量要賠廣告主多少錢,y"},{"type":"sub","content":[{"type":"text","text":"j"}]},{"type":"text","text":"就是你最後的缺量,所以p"},{"type":"sub","content":[{"type":"text","text":"j"}]},{"type":"text","text":"*y"},{"type":"sub","content":[{"type":"text","text":"j"}]},{"type":"text","text":"是最後如果沒有保住量,缺了這麼多量,要賠給廣告主的錢,所以R"},{"type":"sub","content":[{"type":"text","text":"GC"}]},{"type":"text","text":"就是合約這一塊的總收入。Q"},{"type":"sub","content":[{"type":"text","text":"GC"}]},{"type":"text","text":"這裏的λ"},{"type":"sub","content":[{"type":"text","text":"j"}]},{"type":"text","text":"是一個權衡因子,是固定常數,用來權衡合約保量的流量質量的重要程度。這裏的x"},{"type":"sub","content":[{"type":"text","text":"ij"}]},{"type":"text","text":"就是第i條流量對於第j個保量計劃,要不要去競得這條流量的示性函數。Q"},{"type":"sub","content":[{"type":"text","text":"ij"}]},{"type":"text","text":"這裏說的就是這條流量價值,比如這條流量的PCVR ( 轉化率 
),那麼Q"},{"type":"sub","content":[{"type":"text","text":"GC"}]},{"type":"text","text":"在這裏是爲了優化平臺收入時不能讓合約保量的這部分流量質量太差,不然廣告主它每次一買合約,發現最後合約的效果跟RTB效果差異非常大,合約的質量永遠很差,未來他就不會願意去買合約產品了。雖然你能保證確定性,但你買的都是很垃圾的流量,這是非常不合理的。所以Q"},{"type":"sub","content":[{"type":"text","text":"GC"}]},{"type":"text","text":"的功能就是要保障合約這部分的流量質量。R"},{"type":"sub","content":[{"type":"text","text":"RTB"}]},{"type":"text","text":"的b"},{"type":"sub","content":[{"type":"text","text":"i2"}]},{"type":"text","text":"就是最後RTB的二價,這個二價就是我們通常所理解的RTB最後競得帶來的平臺收入。"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ba\/ba47453194925dbed967ce9435a7396e.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這個問題如果展開,其實也是一個線性的優化問題,用對偶分析,會發現對第i條流量,第j個合約計劃的最優出價就等於λ"},{"type":"sub","content":[{"type":"text","text":"j"}]},{"type":"text","text":"*q"},{"type":"sub","content":[{"type":"text","text":"ij"}]},{"type":"text","text":" +α"},{"type":"sub","content":[{"type":"text","text":"j"}]},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":",α"},{"type":"sub","content":[{"type":"text","text":"j"}]},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":"就是要求的最優對偶變量。根據前面的思想,我們也可以去給出一套強化學習的解決方案,或者大家也可以去嘗試用一些解耦的PID的控制方案。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"③ 
效果評估"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/15\/15af98843734166807362f6de5f1d7c7.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這張圖顯示,我們用的一套強化學習方法,比傳統的一些固定α"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":"的出價有更好的效果。這裏如果用PID,要注意必須站在一個全局的視角去做反饋控制。因爲這個時候涉及到的是很多個智能體在一起配合,所以這裏到底應該犧牲誰,應該保障誰?其實這個並不是一個嚴格的單增單減的關係,在這種情況下,你一定要有一個全局的調整,才能保證你最後得到的α"},{"type":"sup","content":[{"type":"text","text":"*"}]},{"type":"text","text":"真的是面向全局最優的。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"4. MSBCB"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"① 廣告主需求"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如果廣告主問:“有的用戶需要多次曝光纔會購買,預算有限下我應該如何投放?”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"② 
優化方案"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/f3\/f3c1eb82bfeaa2e74c93088039ea64c5.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如圖所示,有的用戶可能第一次訪問平臺它被曝光了,那麼他這個時候才稍微對這個商品有一些心智,那麼再去看廣告的時候,他就感興趣了,可能就加到購物車裏面了,那麼第三次看這廣告的時候,他最後就決定購買了。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"對於這種情況下,我們應該怎麼去優化廣告投放呢?這裏面也給出了一個建模的方案。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/d2\/d2b039257195c2dd919587bf8a25fcb1.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這裏定義了V"},{"type":"sub","content":[{"type":"text","text":"C"}]},{"type":"text","text":"和V"},{"type":"sub","content":[{"type":"text","text":"G"}]},{"type":"text","text":"兩個概念。某一個固定的廣告,對於某一個用戶i,有一套運營用戶的策略π"},{"type":"sub","content":[{"type":"text","text":"i "}]},{"type":"text","text":"( π"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":"是指如果廣告跟這個用戶在這一週內比如能接觸5次,這5次裏應該選哪幾次去投放,哪幾次不要去投放 
),投放以後最後整個序列的整體預期回報就是V"},{"type":"sub","content":[{"type":"text","text":"G"}]},{"type":"text","text":"。那麼整個序列最後期望要投入的廣告的消耗是V"},{"type":"sub","content":[{"type":"text","text":"C"}]},{"type":"text","text":"。所以V"},{"type":"sub","content":[{"type":"text","text":"G"}]},{"type":"text","text":"是期望的長期的價值Long Term value,V"},{"type":"sub","content":[{"type":"text","text":"C"}]},{"type":"text","text":"是長期的成本Long Term cost,都依賴於這個廣告對這個用戶運營的策略π"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":"。優化目標也是顯而易見,類似於前面的budget constraint bidding,他的變量有兩方面。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"第一,要去選哪些用戶 ( 即 x"},{"type":"sub","content":[{"type":"text","text":"i "}]},{"type":"text","text":") ?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"第二,對於選定的這個人,我要用怎樣的策略π"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":"去運營用戶的序列?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"它其實整個把頻控這件事情包括進來了,那麼約束就是我要選哪些用戶,以及我對用戶的運營策略下,整個序列的所有的成本要小於等於預算。那麼最後既要優化選哪些用戶,又要優化對不同的用戶之間用怎樣的策略去優化它,這肯定是一個約束下的優化,又是一個序列決策的問題,它的解的組合空間非常大,求解起來也非常難。我們實際是怎麼求解呢?我們把它分成兩層求解。"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/6b\/6bd24eae493e63f8c64e928ad0057020.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"
第一,選用戶"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"假設已經找到了對用戶最好的運營策略π"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":",我們應該怎麼挑用戶?跟前面方式也是很類似的,依託性價比最優去選就好了。所謂性價比就是用戶的長期的價值\/用戶的長期的成本,然後對用戶從高到低去選擇就可以。在實際情況中,因爲流量粒度很細,所以這種揹包的貪心算法基本是可以取得99%比率接近最優解的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"第二,運營用戶"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當你知道怎麼去選用戶之後,對一個用戶你應該用什麼樣的策略π"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":"去運營它呢?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們剛纔看到選用戶的時候,選性價比最優的用戶。直觀的一個想法是把每一個用戶的性價比運營到最大化是不是最優解呢?這裏我們給出了一個結論,不是最優解!"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/27\/27e99e9184933a7cc1daa3a6ad21f66f.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"爲什麼呢?我們給了一個示意圖,這個示意圖你可以看作是個大揹包,揹包裏每一個長方形代表一個用戶。每一個長方形的高度,就是整個序列下來的總消耗,它的面積就是整個序列的總回報。那麼底下橫軸就是面積\/長方形的高度,得到的性價比。揹包已經按性價比從高到低,把用戶一個個疊羅漢的方式摞起來了,整個高度累加起來就是它的預算。整個按性價比優選的話,虛線之上的就相當於被擠到揹包之外的,外面的性價比較低的用戶就不要了,而虛線以下的性價比高,都已經裝進來了。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這個時候我們來看最下面的藍色長方形,已經是把用戶運營到了性價比最高的狀態。如果對它的運營策略π"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":"做一些調整,讓性價比下降一些,底部藍色
長方形的高度(即期望的成本)上漲一些,頭部低性價比的用戶就會被擠出去一些。二者一加一減的高度雖然相等,但是被擠出去的是性價比較低的那部分消耗,換進來的是性價比更高的增量。通過犧牲最下面長方形用戶的性價比,從最極致的狀態稍微下調一點,整個揹包裝下的總價值就增加了。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們也有一系列理論推導,最後能給出一個嚴格的理論證明。對用戶運營的時候,用戶運營的長期的return目標函數,要按照:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/a5\/a53b7e449ea26dfc803dacd433506587.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"去運營,那麼最後就能讓整個揹包的總價值最大。而且我們兩階段的優化既要去挑用戶,又要去對單個用戶去做運營,這個來回迭代的算法,我們也嚴格證明了它能夠收斂到全局唯一的最優解。"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ac\/aca6d635de6d76adf8953483a2bf897d.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們的這個工作被ICML2020會議接收,感興趣的朋友可以去參考我們的論文。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"③ 
效果評估"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/06\/061e4c0776413e48e69b07300cf2d1d2.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"左圖中,當考慮到用戶跟一個商品反覆交互有這種序列效應的時候,表示長期價值效果的紅色的線是顯著高於表示短期價值效果的藍色的線。從右圖可以看出,紅色的線(代表目標函數策略)的確是比藍色的線(按照\"性價比最大化\"的思路運營用戶)也要高出一截。我們也跟傳統的Constrained 的一些方法去做了對比,在線上我們也取得了10%的ROI和GMV的增長。"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/23\/236f7219b3fcb0dca3d47ee3582de7ba.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"5. 
CCSA"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"① 廣告主需求"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"上面介紹的很多都是在局部的一個渠道去投放。廣告主預算如何在搜索、推薦和品牌廣告渠道最優分配呢?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"② 優化方案"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當廣告主面對不同的渠道的時候,怎麼能夠在整體上去做最優化?這個問題的建模可以看作i從1到N,廣告平臺有N種投放渠道,每個渠道都可以投入預算x"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":",你可以獲得渠道最後給你的回報f"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":"。那麼目標就是在所有渠道的預算x"},{"type":"sub","content":[{"type":"text","text":"i"}]},{"type":"text","text":"加起來小於等於總預算的情況下去最大化整體的渠道回報之和。廣告主面對各種複雜的廣告平臺操作,還要再去聯合優化,這種操作是很難實現、很難評估的。"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/d8\/d8ed553cf58e9d56814fe4b47bfb90ce.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"所以我們可以把它集中到一個非常簡化的、操作便利的單一的操作接口上,廣告主只要去通過minimal的操作接口,就可以去設定投入,然後minimal的操作接口就可以控制所有的渠道去實現整個渠道的最優化。從整體視角來看,最終廣告主投入比如1000元的一點點邊際的預算,在各個渠道里帶來的增量的邊際回報是相等的,這個狀態就是最優的了。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/55\/92\/55e93bfc1663fdyy230dcd8acdebb592.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":
false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"但實際去做的時候,因爲各個渠道是很粗的粒度,你是很難去控制到各個渠道里面的一些具體的操作的,至少在系統的複雜性上會帶來很大的難度。所以我們目前階段性的方案是中心的minimal的平臺對各個渠道通過下發預算分配或者利潤要求等粗粒度的控制策略,然後去影響各個子渠道的投放策略,各個子渠道就可以去用BCB\/MCB或者序列投放等等去優化它們的策略,然後通過這種方式實現了一個大一統的優化,各個渠道之間的這種系統的耦合上也是比較低的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"③ 效果評估"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這裏給大家做一個宣傳,阿里媽媽今年推出了AI智投的投放產品,這比客戶自己去各個渠道去分預算投廣告,投放的效率能夠提升50%,這是一個很大的提升!"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"總結與展望"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"最後,總結下阿里媽媽廣告系統的迭代歷史,我們經歷了從粗放式價值評估,到精細的單條流量價值預估和優化升級,以及從單一到多維的目標的優化,從不確定到對廣告主有合約保量確定的結果保障,以及從廣告主自己手動調出價到廣告主只要設定預算,我們就自動幫它運維的這種升級,以及從單一的預算的約束,到包括各種成本約束(比如曝光成本,點擊成本約束)的控制能力,從短期的價值優化到長期的價值優化,以及從單個渠道的優化到全局所有渠道的整體優化。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"面向未來,我們會繼續深挖客戶的痛點,從數據智能,機制設計,算法升級和產品迭代方面進行全方位的體系和技術升級。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"分享嘉賓:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"靳駿奇 博士"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"阿里巴巴 | 
算法專家"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"靳駿奇博士來自阿里巴巴集團精準定向廣告團隊,主要研究機器學習、機制設計在互聯網廣告與推薦系統中的應用。靳駿奇2007-2016在清華大學學習,獲得控制科學與工程學士、博士學位,以及清華經管學院經濟學第二學士學位。他在IEEE TPAMI,ICML,KDD,IJCAI,AAMAS上發表多篇學術論文。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本文轉載自:DataFunTalk(ID:dataFunTalk)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"原文鏈接:"},{"type":"link","attrs":{"href":"https:\/\/mp.weixin.qq.com\/s\/FWn9OTEzuy9T6BD1sk4Dog","title":"xxx","type":null},"content":[{"type":"text","text":"阿里定向廣告智能投放技術體系"}]}]}]}