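A Comprehensive Look at the Video Pre-processing Algorithms in Tencent Meeting

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"I. Exploring Video Pre-processing Scenarios"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Video is a continuous stream that has to be encoded before transmission and decoded on receipt, so video processing divides into pre-processing and post-processing. Pre-processing is anything applied before encoding, such as background blur. Post-processing is anything applied after decoding, such as video super-resolution."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Which pre-processing algorithms suit a video-conferencing scenario? Ideally, the more the better: anything we can think of would ship. In practice, compute resources in a meeting are very limited and must not interfere with higher-priority services, so we have to identify the needs users feel most urgently and spend the limited compute on giving them a better video experience."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Data analysis shows that relatively few people turn their cameras on in meetings. We attribute this to three main causes: worry about leaking private information, lack of confidence in one's appearance, and poor image quality. In response, Tencent Meeting has successively launched a series of processing algorithms, including virtual background, beautification, video denoising, and low-light enhancement."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Virtual background protects user privacy and creates a level playing field. The user feedback shown here comes from an online-classroom teacher, who reported that virtual backgrounds spare many children from discrimination, so that family background and household circumstances are no longer a burden on the child. Beautification is familiar to most people and widely used; it encourages people to join video calls. Video denoising reduces camera noise and removes flicker caused by lighting, which improves picture quality. Low-light enhancement improves the video experience in dim scenes."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ed\/eddc7bd2831788d94e0f276a3623d7bd.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"tex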
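t","text":"II. Virtual Background: Algorithm Exploration and Practice"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Virtual background lets users upload a custom image or video during a Tencent Meeting session to serve as the backdrop of their video, or simply blur the real background, meeting the need for both privacy protection and personalized video."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The virtual background framework consists of five main modules: data, model, loss, training, and the forward inference engine."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/67\/67a91f7a7dace762ba9d0b30ca1652b4.webp","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For deep learning tasks, the quantity and quality of data are key to the results. Data in Tencent Meeting is highly sensitive and privacy-laden, and we have no access to real user data, so Tencent Meeting relies on self-collected data combined with fine-grained annotation, and the data pipeline is currently run as a closed loop of iterative optimization. During training the data also goes through a series of augmentations, such as color transforms, random noise, and random blur, to increase its diversity."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On the model side, an encoder produces a multi-level feature representation of the input image. The first level has relatively high resolution and encodes the edge details of the portrait, while the high-level features have low spatial resolution and can only encode its abstract semantics. When the high-level features are passed to the decoder for fusion learning, the resolution rises again stage by stage during decoding. Tencent Meeting also attaches a lightweight refinement module after the decoder, so that more detail can be recovered at high resolution."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The network output is then refined by a series of post-processing algorithms, including dilation, erosion, edge processing, and feathering. The loss function is also central to the training pipeline, because it determines which objectives the network focuses on. Tencent Meeting currently guides learning with multiple loss constraints."}]},{"type":"paragraph","attrs":{"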
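indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Cross-entropy loss constrains every pixel individually and is a basic loss for segmentation tasks."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To improve temporal smoothness, we constrain the distance between the network's output for a slightly perturbed input and its output for the original input. To some extent this simulates the stability and continuity expected between consecutive frames at run time."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Edge accuracy has a large effect on how portrait segmentation is perceived, so Tencent Meeting also uses a separate branch with an edge loss to improve the network's accuracy along edges."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Virtual background runs on client devices rather than on servers, which places very high demands on the efficiency and compactness of the neural network. The depth and width of the network are therefore tightly constrained, and its learning capacity is considerably lower than that of a server-side model. To make up for the resulting loss of accuracy, Tencent Meeting applies several forms of distillation that constrain the distance between the small production model and the large server-side model, pulling the output distribution of the small model as close as possible to that of the large one. Combining multiple distillation losses makes distillation more effective and so improves the accuracy and consistency of the small production model. In actual training, Tencent Meeting also uses distributed training and other optimizations that speed training up, and supports both online and offline distillation."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Last is the forward inference engine, which applies targeted heterogeneous-computing and parallel-computing optimizations to the model to strike a good balance between performance and quality. It is a key technology for getting the algorithm into production."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1. 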
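Data"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How is the closed loop in the data pipeline formed? First, a self-collection pool is set up to hold the self-collected data. Note that because user data is sensitive, Tencent Meeting has no access to any real production data and can only collect by simulating online-meeting scenarios, for example by having team members record themselves. When user feedback arrives, similar scenes are staged from the text, image, or video description in the feedback, and supplementary data is collected."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Data in the pool is annotated continuously and submitted to an annotation pool, and the annotated data is then used to iteratively train the production model. The production model is in turn self-tested and stress-tested on samples of the self-collected data, and images whose segmentation falls below standard are added to the self-collection pool and to a badcase pool. We also analyze the characteristics of every badcase in the badcase pool, group them by category across the existing stock, and run targeted supplementary collection for each category."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This is the closed-loop iterative optimization of the whole data pipeline, and it has effectively driven continuous improvement in the quality of the production model."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/de\/dedb0d41d75fd04b0766a130882c9402.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2. 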
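Model"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"    "},{"type":"text","marks":[{"type":"strong"}],"text":"Tencent Meeting currently uses the classic encoder-decoder-refine structure"},{"type":"text","text":". The encoder progressively lowers the resolution and abstracts the features level by level, while the decoder fuses the multi-level features to bring the resolution back up. Tencent Meeting also applies multi-task constraints. The experimental results in the figure below show that, with multiple loss constraints, the output mask is more accurate and edge consistency improves."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/d7\/d73b69298c0f1e143d72f256b7c5adcd.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3. Loss"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As noted earlier, to make up for the accuracy lost to the small network, Tencent Meeting uses several forms of distillation to constrain the distance between the small production model and the large server-side model."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Why distill? Distillation adds training time but no inference time in production, so it is widely used as a loss constraint. For an ultra-lightweight production model, distillation does a good job of narrowing the gap between the server-side teacher network and the student network. In practice we also found that a distilled student network is more stable than the same network before distillation."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Tencent Meeting currently supports both offline and online distillation. In offline distillation, the teacher network is trained separately, its parameters are then frozen, and it guides the student network's learning. In online distillation, the teacher network learns together with the student network, so training comprises the teacher's own learning, the distillation that guides the student in step with it, and the student's own learning."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align"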
:null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ca\/cad7dd7218a9b27607da33b3837a5a07.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Commonly used distillation losses include the three listed in the figure below. Pixelwise distillation works at the pixel level, Affinity distillation works on relations between neighboring pixels, and Attention distillation works on attention."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pixel-level distillation acts mainly on the output layer: the teacher network's output is used as a soft label to constrain the student network's output."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In Affinity-matrix distillation, the teacher and student networks each compute their own Affinity matrix, which models the relations between pixels, and this neighborhood knowledge is transferred from the teacher to the student. The transfer effectively compensates for the limited information available when each pixel is modeled independently, but it is relatively expensive to compute."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Tencent Meeting currently uses the third option, attention distillation, an efficient method that constrains intermediate feature layers. As the figure shows, modeling spatial attention reduces the feature map from three dimensions directly to two before the constraint is applied, which greatly reduces computation and makes distillation more efficient."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/27\/27ed91d1ccebf187ff5cefb9231d8b4b.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"4. 
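Forward Inference Engine"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Accelerating forward inference is a crucial step in bringing a deep learning algorithm into production. Put simply, as the figure shows, forward inference starts from the input data and passes it through the neural network layer by layer to produce the final output. Data passed between layers is usually represented as tensors, each of which is a multi-dimensional array."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Tencent Meeting's in-house inference engine has three main advantages:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, audio is one of Tencent Meeting's core capabilities. The in-house inference engine has accumulated extensive experience in optimizing the performance of audio algorithms and already supports quantization and quantization-aware training in production."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Second, desktop is one of Tencent Meeting's core scenarios. The in-house inference engine has accumulated extensive experience in heterogeneous-computing and parallel-computing optimization on desktop, which gives strong support to the desktop audio and video algorithms."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Third, an in-house inference engine can be customized quickly in response to business needs. For a fast-iterating product like Tencent Meeting, the ability to customize and iterate quickly is very important."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Tencent Meeting forward inference framework currently has three layers, from bottom to top: the engine layer, the logic layer, and the interface layer. The engine layer is mainly responsible for optimizing the computation of the various neural network operators, the logic layer encapsulates the logic specific to each business scenario, and the interface layer provides cross-platform business interfaces."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/c8\/c8b99c46638424623de6dcdc4caf9432.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"numbe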
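r":0,"align":null,"origin":null},"content":[{"type":"text","text":"Listed below are some of the acceleration schemes the Tencent Meeting forward inference engine currently supports. Different hardware and processors follow different design philosophies and target different use cases, so a given method hits different performance bottlenecks on different processors, and each case has to be analyzed on its own terms. As a rule, for example, CPUs suit computation with complex logic, while GPUs suit data-parallel, compute-intensive workloads. The cost of moving data between processors also has to be considered, because it sometimes becomes the bottleneck for the algorithm as a whole. Taking virtual background as an example, the deep learning model for video portrait segmentation is well suited to the GPU, while some of the post-processing algorithms are better suited to the CPU."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/d0\/d036d2bd09841ca48b1c097b0c6d6f58.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"III. Beautification: Algorithm Exploration and Practice"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Beautification is something people use all the time. It improves the user's video experience and encourages users to join video calls. Technically, beautification usually comprises three parts: skin smoothing, face reshaping, and virtual makeup."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The overall beautification solution has three parts: a pre-processing module, a global beautification module, and a local adjustment module. The input image first enters the pre-processing module, where global denoising improves image quality. In practice, if a face is detected, global or local beautification can follow; otherwise the denoised image is output directly. The global beautification module beautifies the picture as a whole, for example through global skin smoothing and through softening and tone adjustment of non-skin regions. An image that has been through global beautification already looks good overall and can be output directly. It can also go on to local adjustment, a set of algorithms built on facial landmark alignment that mainly handles local changes such as adding depth to facial features and applying makeup, after which the adjusted image is output."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/83\/8321ca2141459c57cc1b960b6bcab5cc.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"border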
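type","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"IV. Q&A"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the course of continually optimizing the beautification solution, we ran into a number of difficulties, for example:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: How do you balance performance against quality?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A: What Tencent Meeting has shipped so far is global beautification, a trade-off forced by performance. Going forward we will apply different beautification strategies to different device tiers: for example, skin smoothing and whitening on low-end and mid-range devices, with face reshaping and makeup opened up on high-end devices."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: How do you maintain the beautification experience when the camera data is very noisy?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A: Skin smoothing has to account for three things: smoothing the skin regions, preserving the detail of the facial features, and suppressing noise. Tencent Meeting currently denoises the whole image during pre-processing. Note that sharpening and other detail-enhancing operations should be avoided, because they can make the noise more prominent."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: Why are face detection and face alignment designed as separate stages?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A: One reason is performance: when no face is present only the basic denoising and enhancement operations are needed, and beautification runs only when a face is present. The other is model stability: running face alignment directly on the full image brings in a lot of redundant background information, which makes landmark alignment harder and less stable. If alignment runs only on the face region after face detection, the results are usually quite stable."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},
"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: How does the beautification strategy adapt to a very wide range of resolutions?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A: Resolution varies widely in video conferencing; network conditions alone make the image resolution fluctuate within a certain range. Tencent Meeting distinguishes three cases. First, at very low resolution, skin smoothing is dropped altogether, because the picture is already blurry and smoothing would only blur it further. Second, at mid-range resolution, the image is upsampled, beautified, and then downsampled again, which improves the result at the cost of some extra computation. Third, at high resolution, the beautification algorithms described above are applied directly."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: How can the complexity of beautification be reduced in real application scenarios?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A: If background blur has already segmented the portrait, there is no need to spend compute on smoothing the background. Approaches of this kind all reduce the complexity of beautification."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"horizontalrule"},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Header image: Unsplash"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Author: 李峯"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Original article: https:\/\/mp.weixin.qq.com\/s\/QG_qFpWwHqlXVqoaHR7efQ"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Original title: Tencent Technology Open Day | A Comprehensive Look at the Video Pre-processing Algorithms in Tencent Meeting"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Source: Tencent Multimedia Lab, WeChat official account [ID: TencentAVLab]"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Reprint notice: Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, credit the source."}]}]}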