How does Facebook encode videos?

{"type":"doc","content":[{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"According to Facebook's Q4 2020 earnings report, as of December 2020 Facebook averaged 1.84 billion daily active users, its monthly active users for the year reached 2.8 billion, and the Facebook family of apps (Facebook, Instagram, and others) averaged 3.3 billion monthly active users. With a user base that large, it is easy to imagine how much video the platform has to process. So how does Facebook handle that much video?"}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"People upload hundreds of millions of videos to Facebook every day. Delivering each of them at the best possible quality (the highest resolution with as little buffering as possible) means optimizing not only how our video codecs compress and decompress videos for viewing, but also which codecs are used for which videos. The sheer volume of video on Facebook also means we have to find a way to do this efficiently, without consuming enormous amounts of computing power and resources."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To solve this, we use a variety of codecs together with "},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Adaptive_bitrate_streaming","title":null,"type":null},"content":[{"type":"text","text":"adaptive bitrate streaming"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" (ABR), which picks the best viewing quality for the viewer's network bandwidth, improving the viewing experience and reducing buffering. However, while more advanced codecs such as VP9 compress better than older codecs such as H264, they also consume more computing power. From a pure compute perspective, applying the most advanced codecs to every video uploaded to Facebook would be very inefficient. That means we need a way to decide which videos should be encoded with the more advanced codecs."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Facebook now meets the heavy demand for high-quality video encoding by combining a benefit-cost model with a machine learning (ML) model, which lets us prioritize advanced encoding for videos that are watched a lot. By predicting which videos will be highly watched and encoding them first, we can reduce buffering, improve overall visual quality, and let people on Facebook who may be limited by their data plans watch more video."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"But this task is not as simple as letting content from the most popular uploaders, or from those with the most friends or followers, jump to the front of the line. Several other factors have to be taken into account so that we can give people on Facebook the best video experience while making sure that content creators' videos are still encoded fairly on the platform."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"How did Facebook encode video in the past?"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Traditionally, once a video is uploaded to Facebook, the ABR process kicks in and the original video is quickly re-encoded into multiple resolutions (for example 360p, 480p, 720p, and 1080p). Once those encodings are done, Facebook's video encoding system tries to improve the viewing experience further by compressing the file as much as possible, using more advanced codecs (such as VP9) or more expensive “recipes” (a video-industry term for fine-tuned transcoding parameters), such as the H264 veryslow profile. Different transcoding technologies (different codec types or codec parameters) make different trade-offs among compression efficiency, visual quality, and the computing power required."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"How to order these jobs so that everyone's overall experience is maximized has become the central question. Facebook has a dedicated encoding compute pool and a dispatcher. The dispatcher accepts encoding job requests that carry a priority value, puts them into a priority queue, and processes higher-priority encoding tasks first. So "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"the video encoding system's job is to assign the right priority to each task"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":". This was done through a set of simple, hard-coded rules. Encoding tasks could be prioritized based on a few factors, including whether the video was a licensed music video, whether it was a product video, and how many friends or followers the video's owner had."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"This approach has drawbacks, though. As new video codecs appear, there are more and more rules to maintain and tune. And because different codecs and recipes trade off compute requirements, visual quality, and compression performance differently, a set of coarse-grained rules cannot fully optimize the end-user experience."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Perhaps most importantly, video consumption on Facebook is extremely skewed: uploaders and Pages on Facebook vary enormously in how many friends or followers they have. Compare the Facebook Page of a large company such as Disney with that of a vlogger who may have only 200 followers. The vlogger and Disney can upload videos at the same time, but Disney's video is likely to get far more watch time. Still, any video can go viral, even when the uploader has only a few followers. The challenge is to support content creators of every size, not just those with the largest audiences, while acknowledging the reality that a large audience probably also means more views and longer watch time."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Enter the benefit-cost model"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The new model still starts with a quick initial set of H264 ABR encodings, so that every uploaded video gets a good-quality encoding as soon as possible. What has changed is "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"how we compute the priority of encoding jobs after a video is published"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The benefit-cost model grew out of a few basic observations:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"A video consumes compute resources only the first time it is encoded. Once encoded, the stored encoding can be delivered as many times as needed without additional compute."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"A relatively small share (roughly one-third) of all videos on Facebook generates the majority of overall watch time."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Facebook's data centers have only a limited amount of energy to power compute resources."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Given that limited energy, we maximize everyone's video experience by applying the compute-intensive “recipes” and advanced codecs to the videos that are watched the most."}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/6d\/6d8cc5f9811161981335743a2f55f46e.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Based on these observations, we define benefit, cost, and priority as follows:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Benefit"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
= (relative compression efficiency of the encoding family at fixed quality) * (effective predicted watch time)"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Cost"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" = normalized compute cost of the missing encodings in the family"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Priority"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" = benefit \/ cost"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Relative compression efficiency of the encoding family at fixed quality"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":": We measure benefit by the compression efficiency of the encoding family. An “encoding family” is a set of encoded files that can be delivered together. For example, the H264 360p, 480p, 720p, and 1080p encoding lanes make up one family, and the VP9 360p, 480p, 720p, and 1080p lanes make up another. Comparing compression efficiency across families at the same visual quality is a challenge."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To understand this, let's first look at a metric we developed: minutes of video at high quality per GB of data (MVHQ). MVHQ ties compression efficiency directly to a question people ask about their data allowance: for 1 GB of data, how many minutes of high-quality video can we stream?"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Mathematically, MVHQ can be understood as:"}]},
{"type":"katexblock","attrs":{"mathString":"\\mathrm{MVHQ}=\\frac{1\\,\\text{GB}}{\\text{Average}(\\text{MvhqBitrate}_{\\text{vid1}}, \\text{MvhqBitrate}_{\\text{vid2}}, \\text{MvhqBitrate}_{\\text{vid3}}, \\ldots)}"}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"For example, suppose a video has an MVHQ of 153 minutes when encoded with the H264 fast preset, 170 minutes with the H264 slow preset, and 200 minutes with VP9. That means that, at a high visual-quality threshold, delivering the VP9 encoding instead of the H264 fast preset extends watch time on 1 GB of data by 47 minutes (200-153). We use H264 fast as the baseline when computing this video's benefit values: we assign 1.0 to H264 fast, 1.1 (170\/153) to H264 slow, and 1.3 (200\/153) to VP9."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The actual MVHQ can be computed only after an encoding has been produced, but we need it before that, so we use historical data to estimate the MVHQ of each encoding family for a given video."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Effective predicted watch time"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":": As described below, we have a sophisticated ML model that predicts how long a video will be watched in the near future. Once we have the predicted watch time at the video level, we estimate how effectively an encoding family can be applied to that video. This accounts for the fact that not every Facebook user has a recent device that can play the newer codecs."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"For example, about 20 percent of video consumption happens on devices that cannot play VP9-encoded video. So if a video's predicted watch time is 100 hours, the effective predicted watch time is 100 hours for the widely supported H264 codec and 80 hours for the VP9 encodings."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Normalized compute cost of the missing encodings in the family"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":": This is the amount of logical compute cycles we need in order to make the encoding family deliverable. An encoding family needs a minimum set of resolutions to be available before the video can be delivered with it. For example, for a particular video the VP9 family may require at least four resolutions. But some encodings take longer than others, which means not all of a video's resolutions become available at the same time."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"As an example, suppose video A is missing all four lanes of the VP9 family. We sum the estimated CPU usage of all four lanes and assign that same normalized cost to all four jobs."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"If only two of the four lanes are missing, as with video B, the compute cost is the sum of producing the remaining two encodings, and that same cost applies to both jobs. Because priority is benefit divided by cost, a task's priority becomes more urgent as more lanes become available. Encoding lanes provide no value until the family is deliverable, so it is important to get to a complete set of lanes as quickly as possible. For example, having one video with all of its VP9 lanes is more valuable than having 10 videos with incomplete (and therefore undeliverable) VP9 lanes."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/e2\/e2a10e13bc931073ee0fcc0bbffd41d4.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/b3\/b3bb4592b3074c813353f9050b48d20d.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Predicting watch time with ML"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"With a new benefit-cost model telling us how certain videos should be encoded, the next piece of the puzzle is "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"determining which videos should be prioritized for encoding"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":". For this, we now use ML to predict which videos will be watched the most and should therefore be prioritized for advanced encodings."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The model considers a number of factors to predict how much watch time a video will get in the next hour. It looks at the number of friends or followers the uploader has and the average watch time of the videos they uploaded previously, as well as metadata from the video itself, including its duration, width, height, privacy status, post type (Live, Stories, Watch, and so on), when it was published, and how popular it has been on the platform so far."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"But using all of this data to make decisions comes with some built-in challenges:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Watch time has high variance and a very long tail"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":". Even when we focus on predicting only the next hour of watch time, a video's watch time can be anywhere from zero to more than 50,000 hours, depending on its content, who uploaded it, and its privacy settings. The model has to tell not only whether a video will be popular, but also how popular."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"The best indicator of next-hour watch time is the video's previous watch time trajectory"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":". Video popularity is generally very volatile. Different videos uploaded by the same content creator can sometimes get very different watch times depending on how the community reacts to them. After experimenting with different features, we found that the past watch time trajectory is the best predictor of future watch time. This raises two technical challenges in designing the model architecture and balancing the training data:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Newly uploaded videos have no watch time trajectory. The longer a video stays on Facebook, the more we can learn from its past watch time. That means the most predictive features are not available for new videos. We want our model to perform well even with this data missing, because the earlier the system can identify videos that will become popular on the platform, the more opportunity there is to deliver higher-quality content."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Popular videos tend to dominate the training data. The patterns of the most popular videos do not necessarily apply to all videos."}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"The nature of watch time varies by video type"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":". Stories videos are shorter and get less watch time on average than other videos. "},{"type":"link","attrs":{"href":"https:\/\/engineering.fb.com\/2020\/10\/22\/video-engineering\/live-streaming\/","title":null,"type":null},"content":[{"type":"text","text":"Live streams"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" get most of their watch time during the stream or in the few hours after it. Meanwhile, videos on demand (VOD) have varied lifespans and can keep accumulating watch time long after the initial upload if people start sharing them later."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Improvements in ML metrics do not necessarily translate directly into product improvements"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":". Traditional regression loss functions such as RMSE, MAPE, and Huber loss work well for optimizing offline models. But a reduction in modeling error does not always lead directly to product improvements such as a better user experience, higher watch time coverage, or better compute efficiency."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Building the ML model for video encoding"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/a5\/a54673ed94102f619b1fb47f77684349.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To address these challenges, we decided to train our model on watch time event data. Each row in our training\/evaluation data represents a decision point at which the system has to make a prediction."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Because our watch time event data is skewed or imbalanced along many dimensions, we applied data cleaning, transformation, bucketing, and weighted sampling along the dimensions we care about."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In addition, because newly uploaded videos have no watch time trajectory to draw on, we decided to build two models, "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"one for handling upload-time requests and the other for view-time requests"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":". The view-time model uses the three sets of features mentioned above. The upload-time model looks at how other videos uploaded by the same content creator have performed and uses that in place of the past watch time trajectory. Once a video has been on Facebook long enough to have some trajectory of its own, we switch it over to the view-time model."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"During model development, we selected the best launch candidates by looking at both "},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Root-mean-square_deviation","title":null,"type":null},"content":[{"type":"text","text":"root mean square error"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" (RMSE) and "},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Mean_absolute_percentage_error","title":null,"type":null},"content":[{"type":"text","text":"mean absolute percentage error"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" (MAPE). We use both metrics because RMSE is sensitive to outliers while MAPE is sensitive to small values. Our watch time label has high variance, so we use MAPE to evaluate performance on popular and moderately popular videos and RMSE on less-watched videos. We also care about how well the model generalizes across video types, ages, and popularity levels, so our evaluation always includes per-category metrics as well."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"MAPE and RMSE are good summary metrics for model selection, but they do not necessarily reflect product improvements directly. Sometimes, when two models have similar RMSE and MAPE, we also recast the evaluation as a classification problem to understand the trade-off. For example, if a video gets 1,000 minutes of watch time but model A predicts 10 minutes, model A's MAPE is 99 percent. If model B predicts 1,990 minutes of watch time, its MAPE is the same as model A's (99 percent), but model B's prediction makes the video more likely to get a high-quality encoding."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We also evaluate how the model classifies videos, because we want to strike a balance between applying advanced encodings too often and missing out on their benefits. For example, at a threshold of 10 seconds, we count the videos whose actual watch time was under 10 seconds and whose predicted watch time was also under 10 seconds, and vice versa, in order to compute the model's false positive and false negative rates. We repeat the same calculation for multiple thresholds. This method of evaluation lets us dig into how the model performs on videos at different popularity levels and whether it tends to recommend too many encoding jobs or to miss opportunities."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"The impact of the new video encoding model"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Besides improving the viewing experience for newly uploaded videos, the new model can identify older videos on Facebook that should be using more advanced encodings and route more compute resources to them. This shifts a large portion of watch time to advanced encodings, which reduces buffering without requiring additional compute resources. The improved compression also lets people on Facebook with "},{"type":"link","attrs":{"href":"https:\/\/engineering.fb.com\/2020\/12\/21\/video-engineering\/rsys\/","title":null,"type":null},"content":[{"type":"text","text":"limited data plans"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", such as those in "},{"type":"link","attrs":{"href":"https:\/\/engineering.fb.com\/2020\/12\/03\/production-engineering\/supercell-reaching-new-heights-for-wider-connectivity\/","title":null,"type":null},"content":[{"type":"text","text":"emerging markets"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", watch more video at better quality."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}
],"text":"What's more, when we introduce new encoding recipes, we no longer have to spend much time working out where in the priority range to place them. Instead, the model assigns priority automatically from each recipe's benefit and cost values so as to maximize overall benefit throughput. For example, we could introduce a very compute-intensive recipe that only makes sense for a handful of extremely popular videos, and the model can identify those videos. Overall, this lets us keep investing in newer, more advanced codecs to give people on Facebook the best video experience."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"About the authors:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Taein Kim, software engineer at Facebook; Ploy Temiyasathit, data scientist at Facebook; Haixiong Wang, software engineer at Facebook."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Original article:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"https:\/\/engineering.fb.com\/2021\/04\/05\/video-engineering\/how-facebook-encodes-your-videos\/"}]}]}
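
The arithmetic behind the benefit-cost model and the MAPE comparison in the article can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `EncodingFamily` class, the function names, the per-lane cost numbers, and the choice to return a priority of 0 when no lanes are missing are all hypothetical and are not taken from Facebook's system. Only the formulas (priority = benefit / cost, benefit = relative efficiency * effective predicted watch time) and the example figures (153, 170, 200 minutes; 20 percent; 1,000 vs. 10 and 1,990 minutes) come from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EncodingFamily:
    """A deliverable set of encodings (hypothetical structure for illustration)."""
    name: str
    mvhq_minutes: float       # estimated minutes of high-quality video per GB
    playable_share: float     # fraction of watch time on devices that can play it
    # Assumed normalized compute cost of each lane that is still missing.
    missing_lane_costs: List[float] = field(default_factory=list)


def relative_efficiency(family: EncodingFamily, baseline_mvhq: float) -> float:
    """Compression efficiency relative to the baseline family (H264 fast in the article)."""
    return family.mvhq_minutes / baseline_mvhq


def effective_watch_time(predicted_hours: float, family: EncodingFamily) -> float:
    """Predicted watch time scaled by the share of devices that can play the family."""
    return predicted_hours * family.playable_share


def priority(family: EncodingFamily, predicted_hours: float, baseline_mvhq: float) -> float:
    """priority = benefit / cost, with cost summed over the missing lanes."""
    cost = sum(family.missing_lane_costs)
    if cost == 0:
        return 0.0  # assumption: a complete family has nothing left to schedule
    benefit = relative_efficiency(family, baseline_mvhq) * effective_watch_time(
        predicted_hours, family
    )
    return benefit / cost


def absolute_percentage_error(actual: float, predicted: float) -> float:
    """Per-video term of MAPE, in percent."""
    return abs(actual - predicted) * 100 / actual


# Video A is missing all four VP9 lanes; video B is missing only two.
# The lane costs below are made-up numbers.
video_a = EncodingFamily("VP9", 200, 0.8, [1.0, 1.0, 2.0, 4.0])
video_b = EncodingFamily("VP9", 200, 0.8, [2.0, 4.0])

print(round(relative_efficiency(video_a, 153), 1))   # 1.3, as in the article
print(priority(video_a, 100, 153) < priority(video_b, 100, 153))  # True
print(absolute_percentage_error(1000, 10))            # 99.0
print(absolute_percentage_error(1000, 1990))          # 99.0
```

With identical predicted watch time, video B ranks above video A because less compute is needed to make its VP9 family deliverable, which matches the article's point that priority rises as more lanes become available. The last two lines reproduce the article's observation that an under-prediction of 10 minutes and an over-prediction of 1,990 minutes yield the same 99 percent error even though they lead to very different encoding decisions.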