Ten Years of Open Source: The Present and Future of WebRTC

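{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"This article was first published on InfoQ. It was jointly planned by the 聲網 Agora developer community and InfoQ, and reviewed by InfoQ."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In January of this year, W3C and IETF published WebRTC as an official standard. In the ten years since it was open-sourced, it has drawn contributions from a great many developers. This article is compiled from a talk that Huib Kleinhout, Google's WebRTC product manager, gave at the RTE conference hosted by 聲網 Agora, with his more recent views on the outlook for WebRTC added."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A lot changed for WebRTC in 2020. WebRTC is essentially a client library, and as everyone knows, it is open source. Google strongly supports WebRTC, but the community deserves just as much credit."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"WebRTC is very important for audio and video interaction on desktop platforms and in browsers, because in the browser you can run any service, with the browser's security checks in place, without installing an application. Until now, this has been the main way developers used the open source library."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In 2020, however, the browser landscape shifted. Start with Microsoft: it replaced its in-house browser engine with one based on Chromium, and it also became an active contributor to WebRTC. One of Microsoft's contributions is perfect negotiation, which lets the two endpoints negotiate in a more robust way. It also improved screen capture, making it more efficient."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Then there is Safari. Apple continues to improve Safari's WebRTC API. Excitingly, the latest Safari Tech Preview supports VP9, with hardware acceleration, and you can enable it in Safari's “Developer settings”."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Firefox added retransmission and transport-cc, which help it estimate the available bandwidth more accurately and so improve media quality."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Meanwhile, Project Zero, the Google team responsible for product security, helps make WebRTC more secure by hunting for vulnerabilities. This means that if your implementation is not browser-based, it is all the more important to keep your WebRTC library up to date and follow the guidance."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Another exciting development is that cloud gaming went live in 2020, and it relies on WebRTC. Stadia, Google's cloud gaming platform, launched at the end of 2019, but official browser support only arrived in early 2020. It uses VP9 to deliver 4K, HDR video and surround sound, all transported over WebRTC."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A few months ago, NVIDIA released GeForce Now for Chromebook, which also uses WebRTC. More recently, Microsoft and Amazon announced support for browser-based cloud gaming. This has pushed WebRTC from latencies of hundreds of milliseconds down to tens of milliseconds, and opened up entirely new use cases. Most important of all, in 2020 real-time communication (RTC) became essential for everyone. As a result, usage of many web services surged, by anywhere from ten times to several hundred times. People made more voice calls and stayed on them longer, groups grew in both number and size, and more and more communication moved online. So we need richer ways to interact."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"From Google's perspective, our peak capacity demand grew 30-fold in the first two to three months of the pandemic. Even for Google, it took a great deal of effort to make sure the backend and every system could cope with growth on that scale."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Amid these changes, use of WebRTC and real-time communication soared, and people's daily habits changed too. Work no longer happens only at the office; your bedroom and kitchen are workplaces now. Because of “social distancing”, meeting face to face is no longer realistic, and we need other ways to socialize. We can only read someone's intent from their expression on video, which makes high-definition video quality all the more important."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"People collaborate in different ways, partly because they use different devices. At the office you probably had a desktop computer and held meetings in a conference room, and after work you might take a laptop home. Now people use laptops for everything, running applications, video conferencing, and text chat all at once. In this scenario the computer is under very heavy load. We also see schoolchildren using laptops such as Chromebooks, whose performance is comparatively modest. With socializing and learning moved online, the workload on these machines has suddenly grown, so for the WebRTC project this means we need to help WebRTC scale and make sure it runs well."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Second, we need to give web developers and real-time communication developers more flexibility, so that they can build new interactive experiences now. When the pandemic broke out, all of our experiments in Chrome were put on hold, and we focused on one thing: scaling and maintaining the service. But that is far from enough. We need to do better, especially on performance."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Take a guess: if you run a real-time service in any browser that uses WebRTC, which part costs the most performance? Video encoding? Audio encoding? Network adaptation (since you have to allow for packet loss and changing network conditions)? Or rendering?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Let's look at what happens in the browser when you want to show a camera feed on screen. Suppose you have a camera connected through a USB driver. The driver runs and starts processing, and the camera may also perform face detection, brightness adjustment, and so on. The frame then passes through the browser's internals. Chrome and other browsers are multi-process. This helps stability and security: if one component or page crashes, or has a security vulnerability, it is isolated from the components in other sandboxes. But it also means a lot of inter-process communication. A video frame captured from the camera may be in MJPEG format. When it reaches the renderer process for the page where you defined the media stream, the format may be I420. When it moves from the renderer process to the GPU process, which actually draws to the screen, the data has to be of the highest quality, and may be in RGB format. When it goes back into the operating system to be composited on screen, an alpha layer may be needed, and the format changes again. A great many conversion and copy steps are involved along the way. So whether the content comes from a camera or from a remote endpoint, just getting a video frame onto the screen takes a lot of processing time. That is why rendering is the most demanding part of a WebRTC service."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/52\/52a0a20868f217f983138aefcef6e880.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This is also where we have done a lot of optimization. Rendering has become more efficient, and we make sure we do not redraw every time a video frame is updated. If several videos are playing at once, we synchronize them before doing anything else. The Chrome team optimized memory allocation so that every video frame is allocated in the most efficient way. We also improved operating system scheduling on Chrome OS so that video services stay interactive and responsive even under heavy load. Over the next few months we will work on “zero copy” for the path from camera capture to rendering a frame on screen. We want no copies or conversions at all, with all the data kept in graphics memory in the most efficient way."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We are also working on adapting the refresh rate to the video frame rate. When nothing is changing, we do not need a 60 Hz screen refresh rate; we can match the frame rate of the video instead, for example 25 frames per second. Here are some suggestions we think are useful:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"1. Avoid expensive browser extensions, and test in Incognito mode."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Expensive extensions are hard to avoid. They can interfere with your service and slow it down."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2. Keep security software from interfering with the browser"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Antivirus software that performs deep packet inspection or blocks packets uses a lot of CPU."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"3. Measure with Intel Power Gadget"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We suggest using Intel Power Gadget to see how much power your service consumes. It is far more telling than looking at CPU percentage alone."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"4. Fancy video effects cost more performance"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If you decorate your video frames with fancy animations, such as moving dots, they will cost more performance. They may look nice, but they can make video frames stutter before they are rendered on screen."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"5. Match the camera resolution to the sending resolution"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If you capture from a camera, make sure the resolution you request from getUserMedia when opening the camera matches the resolution you actually send. If you open the camera in HD but send VGA, a lot of conversion is needed and many bytes of data are simply thrown away."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"6. Be careful with WebAudio"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"WebAudio may need more CPU than you expect."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"About video codecs"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Video codecs help you build higher-performance services. CPU is not the only resource that matters: if your service depends on the network, the video codec becomes important as well. If you need to scale your business a hundredfold, Google offers free codecs, VP8, VP9, and AV1, and they are available in browsers."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/22\/226e6483f6e6ba4bce09f24238bff9a2.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"VP8 is by far the most widely used of these in browsers today, and every browser supports it. VP9 has also been on the market for many years and keeps improving. It offers bandwidth savings of 30% to 50%, plus high-quality features such as HDR and 4K support. It is widely used inside Google, powering Stadia and other internal services, because it has features VP8 lacks that help you adapt better when switching between high-bandwidth and low-bandwidth connections. Then there is AV1. AV1 is also coming to WebRTC, to several open source implementations, and to browsers. Most browsers can already use it for streaming, and we hope it will be officially enabled next year. In fact, Microsoft has just announced that its operating system will support hardware-accelerated AV1. These performance gains give developers more room to work with."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"WebRTC NV (Next Version)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After releasing WebRTC 1.0, we began working with the community on the next version, called “NV”. It is meant to support new use cases that are impossible or very hard to implement with the current WebRTC API, such as virtual reality. For virtual reality effects, as with the laptop and machine learning examples mentioned earlier, running them through the WebRTC API requires finer control over media processing, for example better control of transport and congestion, and more customization of what the codecs do."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To reach these goals, the idea behind WebRTC NV is not to define an entirely new API. WebRTC already has two APIs, PeerConnection and getUserMedia. We will not redefine them and start again from scratch. Instead, we are exposing the internals of the peer connection through an interface called “HTML streams”, along with an interface for accessing the codecs in the browser. Combined with newer technologies such as WebAssembly and worker threads, this lets you do real-time processing in JavaScript in the browser and within the integrated peer-to-peer connection."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If you look inside WebRTC today, you will see that media arriving from the network is depacketized. Some adaptation for loss and delay happens here, and the media is then reconstructed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the other direction, camera or microphone input has echo removed and goes through a codec such as Opus or VP8. The bitrate is adapted to network conditions, and the result is packetized into RTP packets and sent over the network. In WebRTC NV we want to be able to intercept this pipeline, starting with media frames. We want to be able to tap into media frames on their way from the network to the display, and from the camera and microphone to the network. We want these streams to be easier to manage. We currently have two stream proposals, which are the two APIs I am working on."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/a3\/a3f47412957aced5cc79e9b832203c3c.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The first is Insertable Media Stream. It is available in the current Chrome 86, and is already used by Google services and other external services. You can use it to implement end-to-end encryption, or to attach custom metadata to frames. What you do is declare encoded insertable streams on the PeerConnection, and then create the streams. After that, when you get a video frame from the camera, it is first encoded, for example as VP8, and then you can access it and process it as a stream. You can also encrypt it or tag it with metadata."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The other is the Raw Media Stream API. This is standards work currently under discussion in the standards committee, and some concrete proposals already exist. At Google, we are experimenting with an implementation. The API gives access to raw frames: once a raw frame has been captured from the camera, and before it is video-encoded, you can access the data. You can then process it, for example to implement AR effects. You can also run several filters to remove the background and then apply effects. For example, I might want to set my current video background to a tropical island. It also applies to custom codecs: if a codec you used previously is not compatible with current browsers, you can use this interface to pass the data directly to that codec. The Raw Media Stream API provides a very efficient way to access this raw media."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To summarize: although WebRTC has been published as an official W3C standard, it continues to improve. The new AV1 video codec, which saves up to 50% of bandwidth, is being adopted in WebRTC and in web browsers. Ongoing improvements to the open source code are expected to further reduce latency and raise the quality of video streams. WebRTC NV gathers initiatives to create supplementary APIs that enable new use cases. These include extensions to existing APIs that give more control over existing functionality, such as scalable video coding, as well as APIs that give access to low-level components. The latter give web developers more flexibility to innovate by integrating high-performance custom WebAssembly components. With emerging 5G networks and the demand for more interactive services, we expect to see continued growth in services built on WebRTC over the coming year."}]}]}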
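Suggestion 5 in the article (match the camera resolution to the sending resolution) can be illustrated with a short TypeScript sketch. It is not taken from the talk: the `Resolution` type and the helper names `captureConstraintsFor` and `discardedPixelFraction` are hypothetical. Only the shape of the constraints object (`width`, `height`, `frameRate` with `ideal` values) follows what `navigator.mediaDevices.getUserMedia` accepts. The pixel-count estimate is a rough approximation that ignores aspect-ratio cropping.

```typescript
// Hypothetical helper type; not part of any WebRTC API.
interface Resolution {
  width: number;
  height: number;
}

// Build capture constraints that ask the camera for the same resolution
// we intend to send, so the browser does not capture HD only to downscale it.
function captureConstraintsFor(send: Resolution, frameRate: number) {
  return {
    audio: true,
    video: {
      width: { ideal: send.width },
      height: { ideal: send.height },
      frameRate: { ideal: frameRate },
    },
  };
}

// Rough share of captured pixels that are thrown away when the capture
// resolution is larger than the resolution actually sent.
function discardedPixelFraction(capture: Resolution, send: Resolution): number {
  const captured = capture.width * capture.height;
  const sent = send.width * send.height;
  return captured <= sent ? 0 : 1 - sent / captured;
}

const vga: Resolution = { width: 640, height: 480 };
const hd: Resolution = { width: 1280, height: 720 };

// Constraints for a service that sends VGA at 30 frames per second.
const constraints = captureConstraintsFor(vga, 30);

// Capturing 1280x720 but sending 640x480 discards about two thirds of the pixels.
const wasted = discardedPixelFraction(hd, vga);

// In a browser, the constraints would be passed on like this:
//   const stream = await navigator.mediaDevices.getUserMedia(constraints);
```

Using `ideal` rather than `exact` is a deliberate choice in this sketch: the browser then picks the closest mode the camera supports instead of rejecting the request when the exact size is unavailable.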