Improving the RTC Audio Experience - Start by Understanding the Hardware

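{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Preface","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The rapid development of RTC (real-time audio and video communication) technology has helped popularize interactive entertainment such as live streaming and short video. With the global pandemic continuing to spread, demand for cloud conferencing has exploded, further accelerating the growth of the RTC industry. To provide customers with stable and reliable service, the network side needs to keep improving the channel connection rate, reducing the stream interruption rate during meetings, and strengthening resilience to weak networks. The video side needs to improve clarity and reduce the stall rate. The audio side, while pursuing end-to-end MOS, must also pay close attention to the quality of the audio 3A algorithms. These are the fundamentals every vendor has to master, and they are what ultimately becomes core competitiveness. This article focuses on why the quality of the audio captured by hardware devices matters to the end-to-end RTC audio experience.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/4b/4b64020cf464565a5b22fee2fc0bf58c.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}}],"text":"Image source: the Internet","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"What happens when capture quality is poor?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In an RTC architecture, the end-to-end audio signal processing flow is roughly as shown in the figure below. The uplink goes through audio capture, audio 3A (AEC: acoustic echo cancellation, ANS: adaptive noise suppression, and AGC: automatic gain control), and encoding; the downlink goes through packet loss recovery, decoding, mixing, and playback.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/ef/ef3f189e4bfe72d79099aad4294cf12a.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":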
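0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}}],"text":"End-to-end audio signal processing flow","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the figure shows, the audio signal goes through analog-to-digital conversion and then through the audio signal processing chip integrated in the device before it is finally handed to the RTC SDK. Because hardware vendors differ, audio capture solutions vary widely in quality. The quality of the captured audio therefore directly affects how usable the input to the 3A algorithms is, and it also sets the upper bound on the quality of the audio signal the end user receives. Based on audio issues encountered in practice, problems caused by device capture fall roughly into the following categories:","attrs":{}}]},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"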

Capture problem

Symptom and impact

No callback, abnormal audio

No sound; directly affects usability

Abnormal audio

Unintelligible; directly affects usability

Jitter

The capture-to-playback delay jitters, which causes echo; in severe cases speech content is lost and communication suffers

Volume too low

The sound is too quiet; if the SDK's digital gain is insufficient, the far end has to strain to hear

Volume too high, clipping

Extremely loud and harsh, hurting the listening experience; the echo picked up by the microphone carries heavy nonlinear distortion, which degrades echo cancellation

Missing spectrum

Cannot meet high-fidelity audio requirements

"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/25/2567044f44396076ce37d52800d0fa6a.png","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A few examples:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"(1) Abnormal capture","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Abnormal capture mainly shows up as a “blurred” spectrum. In severe cases speech becomes unintelligible and normal communication suffers, as in the spectrogram below.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/6a/6a368b967727f97d682a55f722e44847.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In addition, when capture is abnormal, the playback signal picked up by the microphone is also abnormal, which introduces severe nonlinear distortion and degrades echo cancellation, as shown below.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e3/e3af83425334b8bb9daaf91c7fc949d8.png","alt":"image.png","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origi
n":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"(2) Capture jitter","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The most common case is dropped capture data. Audibly this shows up as many high-frequency clicks (of the two figures below, the second is a zoomed-in view of the clicks in the first). It can degrade the accuracy of delay estimation in the AEC algorithm and make the far-end and near-end signals non-causal; in severe cases echo leaks through.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/78/78cec19e53246f405bc526160f9c191b.png","alt":"image.png","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/9c/9c78ec0f5da24f9e01a9d8f2c5635260.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"(3) Clipping and low volume","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Capture clipping mainly occurs on PCs, and it is the problem PC devices most need to avoid because its impact is large: besides the spectral distortion caused by clipped peaks, the severe nonlinear distortion degrades echo cancellation. Solving it requires the AGC algorithm to adaptively adjust the PC's analog gain and microphone boost.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/0a/0a0d1c3410e7c30204eb73c40492cb15.png",
"alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"(4) Missing spectrum","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Missing spectrum mainly means that the audio sample rate reported by the hardware callback does not match the actual spectral content. Even if the encoder is given a very high bitrate, the audio does not sound high-fidelity. In the figure below, the captured signal has a 48 kHz sample rate, yet its spectrum only extends up to 8 kHz.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/94/94a2bb9fdf6d7e53c2f5b5d78e5f25eb.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"What can we do at the hardware level to improve capture quality?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Hardware with RTC capability has long since reached every corner of daily life. Mobile phones and PCs are the common cases, and today even children's phone watches, Tmall Genie smart speakers, and various high-end fingerprint and password locks support RTC. This diversity of devices, however, directly leads to differences in capture capability. Leaving aside differences in acoustic component design, on Android alone the differences in chips and system software mean that even within a single phone brand no one configuration fits every model.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In addition, the vast majority of mobile devices now come with built-in hardware audio signal processing (hereafter hardware 3A). Its effectiveness varies enormously from chip to chip and, more seriously, audio that has passed through hardware processing often has missing spectrum. For example, with hardware 3A enabled, the audio delivered to the RTC SDK extends only up to 8 kHz, equivalent to a signal sampled at 16 kHz, which falls far short of what high-fidelity audio demands, especially in entertainment scenarios. Getting hardware-layer adaptation right is therefore the foundation of a high-quality RTC audio experience.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Android","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(1) Understand the differences between the two modes, javaaudioclass and opensles, and the parameters each one needs tuned, and know the configuration for turning off hardware 3A.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(2) If capture jitters or the volume is abnormal, try changing the requested sample rate; the commonly used 48 kHz setting does not suit every Android device.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Windows","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(1) Many current Windows devices have a microphone array built into the top of the screen and offer an audio enhancement feature, enabled as shown in the figure below. By default the feature treats an angular region directly in front of the screen as the pickup zone. Microphone array techniques effectively enhance the voice of a speaker inside the pickup zone and “isolate” the “noise” from outside it. The main drawback is that with the feature on, only spectrum up to 8 kHz is supported; enhancement algorithms also differ between vendors, so results are uneven. The software therefore needs to be able to bypass the hardware's built-in audio enhancement to safeguard high-fidelity audio.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/de/de7f9eaf163f290e5175cd96dfedbcb1.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}}],"text":"Dual-microphone array built into a Windows device (image source: the Internet)","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin
":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/ae/ae4b064dbe0379d4cd6d4b6b16cc2041.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}}],"text":"The enhancement toggle in the audio settings","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/d9/d9b45bfa78d4fb961eb66cdc3f381980.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}}],"text":"Missing spectrum after audio enhancement is enabled","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(2) As for volume, PC devices all support analog gain adjustment, and most Windows devices with a microphone array also have an additional microphone boost (see the figure below). The software algorithm layer (the AGC in 3A) needs to adjust both adaptively, keeping the captured volume steady so that the capture noise floor stays under control. A poor initial value or poor adaptive adjustment leads to low volume, clipping, and similar problems; in severe cases it degrades echo cancellation and noise suppression and puts usability at risk.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/55/55d527edf97309f27ed61270e1c18c1e.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"b
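ordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}}],"text":"Analog gain and microphone boost","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Apple devices","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(1) iOS needs relatively little adaptation work, but you do need to know the configuration for turning off hardware 3A, because the hardware 3A built into iOS devices also only supports spectrum up to 10 kHz to 12 kHz.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(2) Mac laptops are simpler and only provide analog gain adjustment. One thing to watch out for: when RTC supports stereo playback, the microphone sits on the same side as one of the speakers, so the nearby microphone clips during playback. This can generally only be addressed by optimizing the software AEC algorithm.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Summary","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With 48 kHz high-fidelity audio now a hard requirement, ensuring high quality at the capture stage takes two things. On one hand, time must be invested in learning the patterns of Android parameter adaptation, and for the growing number of customized Android devices on the market (watches, smart speakers, and so on) the configuration parameters must be settled up front. On the other hand, turning off the device's built-in audio processing and enabling the RTC SDK's own pure-software 3A algorithms is also a trend, provided that the overall quality of the software 3A is well optimized and power consumption is kept under control. This is a must-test item when customers compare audio experience across vendors, and it is one of each vendor's core competitive strengths.","attrs":{}}]}]}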
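Two of the capture problems discussed in this article, clipping and a spectrum that stops far below what the sample rate implies, can be checked offline on a short block of recorded PCM. The following is a minimal Python sketch under stated assumptions, not the method used by any particular RTC SDK: the function names `clipping_ratio` and `effective_bandwidth_hz` are hypothetical, the input is assumed to be 16-bit signed mono PCM, and the -60 dB floor and 0.999 full-scale margin are illustrative thresholds that would need tuning on real recordings.

```python
import cmath
import math


def clipping_ratio(samples, full_scale=32767, margin=0.999):
    """Return the fraction of samples at or near full scale (rough clipping indicator).

    Assumes 16-bit signed PCM; full_scale and margin are illustrative defaults.
    """
    if not samples:
        return 0.0
    limit = full_scale * margin
    return sum(1 for s in samples if abs(s) >= limit) / len(samples)


def effective_bandwidth_hz(samples, sample_rate, floor_db=-60.0):
    """Return the highest frequency whose magnitude is within floor_db of the peak.

    Uses a Hann window and a naive O(n^2) DFT, so keep the block short
    (for example 10 ms). floor_db is an assumed threshold, not a standard.
    """
    n = len(samples)
    if n < 2:
        return 0.0
    # Periodic Hann window to limit spectral leakage.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
        for i, s in enumerate(samples)
    ]
    # Magnitude of each DFT bin from 0 Hz up to the Nyquist frequency.
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(windowed)
        )
        mags.append(abs(acc))
    peak = max(mags)
    if peak == 0:
        return 0.0
    threshold = peak * 10 ** (floor_db / 20)
    highest_bin = max(k for k, m in enumerate(mags) if m >= threshold)
    return highest_bin * sample_rate / n


if __name__ == "__main__":
    rate = 48000
    # Synthetic 10 ms block: 1 kHz and 6 kHz tones, nothing above 8 kHz.
    block = [
        8000 * math.sin(2 * math.pi * 1000 * i / rate)
        + 4000 * math.sin(2 * math.pi * 6000 * i / rate)
        for i in range(480)
    ]
    bandwidth = effective_bandwidth_hz(block, rate)
    print("clipping ratio:", clipping_ratio(block))
    print("effective bandwidth (Hz):", bandwidth)
    print("band-limited despite 48 kHz rate:", bandwidth < 0.5 * (rate / 2))
```

A block reported at 48 kHz whose effective bandwidth sits near 8 kHz is a hint that hardware processing is still in the capture path. A clipping ratio well above zero suggests the analog gain or microphone boost is set too high. Real recordings contain background noise, so the floor threshold has to be chosen relative to the device's noise floor.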