[RongCloud Perspective] Future Trends in Immersive Audio and Communication Technology

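{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Looking back at the history of the internet, from PC local area networks to the mobile internet, the sense of immersion online has steadily increased and the distance between the virtual and the real has narrowed. Immersive audio and communication technology is expected to improve the user experience considerably. The metaverse, which blends the virtual and the real, places high demands on immersion, participation, and persistence, so it will be supported by many independent tools, platforms, infrastructure components, and protocols. As AR, VR, 5G, cloud computing, and related technologies mature, communication technology based on immersive audio is expected to move gradually from concept to reality in the metaverse.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this article we explore, together with our industry partners, how metaverse technology affects the communications industry, where immersive audio is heading, and how communication technology applies to the VR, AR, and AI industries.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Overview of the metaverse concept","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The metaverse refers to a virtual world that runs parallel to real life and offers an experience almost indistinguishable from it. People can use virtual identities to work, socialize, play, and even trade in this virtual world. In short, in the metaverse you can have whatever you can imagine: boundless imagination gives you unlimited freedom.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The metaverse creates a second, digital world independent of the real one, in which users can live freely under digital identities. VR, AR, and AI, as the technical foundation of the metaverse, are expected to enter a period of rapid growth. The global virtual reality market was worth roughly RMB 90 billion in 2020, with a projected average annual growth rate of about 54% from 2020 to 2024. The China Academy of Information and Communications Technology forecasts that global shipments of virtual reality devices will accelerate from 2021 and could reach 75 million units in 2024. (Source: Tianfeng Securities, “Metaverse Research Report”.) As the VR industry chain matures, VR's empowerment of other industries is expected to show a strong flywheel effect.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So how do we move step by step from the real world into the metaverse?","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Dimensions of realism","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The realism a user experiences in a metaverse scenario can be divided into two dimensions: “immersion” and “freedom”. The origin of both axes is natively perceived reality, for example you reading this article right now. Together, the depth of immersion and the depth of freedom determine whether the user experience in the metaverse feels real enough.","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/8f/8fe90aa4f89d065c653ab542dd10c70e.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Levels of realism","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/65/653fd9b06414a4d35be65de1e61c644e.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Lv1: the stage of taking first steps from native perception toward the virtual world","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Lv2: a virtual world that feels partly real to the brain","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Lv3: a fully realistic virtual world that completely fools the brain","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Max: a virtual world with the same depth as the native world","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Current development trends in the metaverse","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At present, most of the capabilities in the metaverse industry chain, such as interactive experiences and human-computer interaction, fall between Lv1 and Lv2, and only a few leading companies are moving toward Lv3. How the Max goal might be reached, and whether it can be reached at all, is still unknown.","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/b8/b84670a99333f448761f5ab56d6913b0.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The industry chain in the Lv1-Lv2 range is maturing, and applications such as 3D motion-sensing films, open sandbox games, and VR, AR, and MR games already exist.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If the Lv2 user experience is a “semi-real” one assembled from a handful of immersion or freedom factors, then the “fully real” experience of Lv3 is a qualitative leap. Immersion and freedom must both reach sufficient depth and reinforce each other. Can digitized visual and auditory perception completely fool our brains? Can 3D engines offer enough freedom? Can AI be persistent and self-growing? Can network transmission achieve effectively zero latency? If any one of these factors falls short, a truly “fully real” user experience is impossible. The difficulty therefore rises steeply from “semi-real” to “fully real”.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Beyond Lv3, the next stage of the metaverse is the ultimate goal: letting human consciousness live on forever in the virtual world. Achieving this depends not only on hardware, software, communications, and other technology, but also on biology and medicine. Whether it can actually be achieved remains unknown.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Progress at leading vendors","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/fe/fec20c33cb10cf3a3247602de82bff59.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"1. Facebook","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At Facebook Connect 2020 in September 2020, Facebook announced fifteen major AR/VR strategic initiatives. The AR/VR announcements covered its latest hardware and software products, solutions, developer services, and frontier research.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Among them, the Oculus Quest 2 VR headset, backed by the games and software available on its platform, has become a mainstream VR head-mounted device on the market today.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Also notable is Project Aria, announced at the event: a research device Facebook built to help researchers understand the software and hardware that AR glasses require. Its sensors capture video and multichannel audio from the wearer's point of view, and it determines location via GPS.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2. Apple","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The well-known US tech blog Scobleizer predicts that the product plans Apple announces within the coming year will include a brand-new AR/VR headset. More specifically, it says Apple plans to launch several products over the next decade, including AR/VR glasses and AR/VR contact lenses (each slated for release between 2022 and 2025). This would mean Apple moving from 2D screens, interfaces, and experiences to 3D ones.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"According to Scobleizer, the Apple AR/VR headset will cover both the eyes and the ears of the wearer, so that once it is on you can neither see nor hear your surroundings directly. In other words, a defining feature of the headset is visual and auditory immersion. Interestingly, it will not cut the wearer off from the outside world entirely: an AR passthrough function may let you see and hear what is around you. Only once the headset is powered on do you see a virtual image of your surroundings and hear ambient sound.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Also worth noting is Apple's in-car surround audio technology. Scobleizer says the technology can create surround sound in all kinds of places, such as inside a car or at home. Using the LiDAR module of the Apple AR/VR headset, 3D audio can be positioned in space. Having tried it himself, he says the technology can simulate the sound of being there in person.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The current state of RTC communication technology","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"RTC audio transmission turns an analog signal into a digital one through sampling, quantization, encoding, and compression, and then transmits it. Two-channel capture is the norm today, that is, stereo with a left and a right channel. After compression it uses little bandwidth, which meets the transmission-efficiency needs of most current business scenarios. With the arrival of 5G, network bandwidth becomes much less of a constraint, and once transmission efficiency is assured, people will go on to seek a 3D immersive audio experience. Two-channel capture will no longer meet that need. Multichannel capture and transmission (for example, an Ambisonics microphone that captures 4 channels with a tetrahedral array) may become the mainstream of future communication technology.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Besides the approach above, are there other ways to give users an immersive audio experience? Let us first look at which immersive audio technologies are mature today.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Immersive audio technologies","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Immersive audio currently falls into three main categories: channel-based audio (CBA), object-based audio (OBA), and scene-based audio (SBA). Scene-based audio describes the sound field of a scene, and its core underlying algorithm is Higher-Order Ambisonics (HOA).","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"According to analysis by industry experts, the two main trends in professional VR audio will be object-based audio and Ambisonics (HOA).","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So which VR social scenarios can VR audio technology be applied to?","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Mapping to social scenarios","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At the current stage of metaverse development, social scenarios are found mainly in VR games, VR live streaming, and VR social apps.","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/6b/6b35e2ead20029ede2953e232639011c.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Object-based audio involves a large amount of data and computation. In addition to the audio channels, it carries metadata about each sound source (attributes such as position, size, velocity, and shape) and about the environment the source is in (reverb, reflection, attenuation, and geometry). It is therefore better suited to games on VR consoles.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With Ambisonics, sound sources are attached to a pre-rendered panoramic sphere, so a player cannot necessarily place a source wherever they want in the scene, and any source that is present is compressed onto that sphere. It is well suited to mobile devices and streaming video.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How to use immersive audio and communication technology to improve future experiences","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Given the analysis above, how can RTC audio transmission be used to give users an immersive audio experience?","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"1. Transmit audio in an immersive format directly","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Using Ambisonics, sound capture and processing are left to the app or the VR audio engine, and the RTC channel is responsible only for transport.","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/8f/8f23e14f60df4238ed270bdb826d5637.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2. Preprocess, then let the receiver restore","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This corresponds to object-based audio. Sound is captured with Ambisonics but downmixed to two channels before encoding and transmission, so that web clients and mobile devices remain compatible. The receiver then restores Ambisonics from the two-channel data, renders it in real time as the virtual scene changes, and finally plays it back on the user's device.","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/84/8493d8bb70782db1e9d7247533474e39.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"3. Use speech-to-text and text-to-speech conversion","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If the virtual scene is an anime-style (二次元) world, we need not only to avoid reproducing the real human voice directly but also to make each character's voice fit the setting of that world. In this case we can use RongCloud IM together with speech-to-text and text-to-speech conversion (ASR and TTS). The captured voice is first converted to text, the text is fed into a voice model, and the output is synthesized as the voice of the anime character. This lets every player's voice match the setting of the game world, which strengthens immersion.","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/97/979bb28e2f7d7967aa543dcbacafef61.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Closing remarks","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Continued advances in the underlying technologies will keep moving the metaverse concept forward. The growth of the VR, AR, 5G, AI, and professional engine and platform industry chains will also continue to drive users' pursuit of immersive experiences. Immersive audio communication may become the mainstream of future communications. We are watching the market closely and hope to explore this area in depth with our industry partners; immersive audio and communication technology may prove to be the next breakthrough for the communications business.","attrs":{}}]}]}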
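The Ambisonics capture and two-channel downmix mentioned in approach 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not RongCloud's implementation: it assumes first-order Ambisonics in the traditional B-format (W, X, Y, Z), azimuth measured counterclockwise from straight ahead, and a stereo decode built from two virtual cardioid microphones facing left and right. The function names `encode_foa` and `downmix_to_stereo` are hypothetical.

```python
import math


def encode_foa(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Assumes the traditional weighting in which W is scaled by 1/sqrt(2).
    Azimuth is counterclockwise from the front, so +90 degrees is hard left.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)  # front-back axis
    y = sample * math.sin(az) * math.cos(el)  # left-right axis (positive = left)
    z = sample * math.sin(el)                 # up-down axis
    return (w, x, y, z)


def downmix_to_stereo(w, x, y, z):
    """Decode B-format to (left, right) with two virtual cardioid microphones.

    A virtual cardioid aimed at azimuth t picks up
    0.5 * (sqrt(2) * W + X * cos(t) + Y * sin(t)).
    With t = +90 and t = -90 degrees the X term vanishes, and Z is unused
    because both virtual microphones lie in the horizontal plane.
    """
    omni = math.sqrt(2.0) * w
    left = 0.5 * (omni + y)
    right = 0.5 * (omni - y)
    return (left, right)
```

In this sketch a source at 90° azimuth lands entirely in the left channel, and a source straight ahead is split evenly between both. The downmix drops the X and Z components, so how a receiver would rebuild a full sound field from two channels remains open here; the article does not specify that step.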