Google and Harvard University Publish New Research on Creating Full 360-Degree Neural Scene Videos with NeRF

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A new research collaboration between Google Research and Harvard University proposes a method called “Mip-NeRF 360”. It uses NeRF ("},{"type":"link","attrs":{"href":"https:\/\/www.unite.ai\/tag\/nerf\/","title":null,"type":null},"content":[{"type":"text","text":"Neural Radiance Fields"}]},{"type":"text","text":") to create videos of complete 360-degree neural scenes, moving NeRF a step closer to being freely abstractable in any environment rather than being limited to "},{"type":"link","attrs":{"href":"https:\/\/www.unite.ai\/nerfactor-another-step-to-replacing-cgi\/","title":null,"type":null},"content":[{"type":"text","text":"tabletop models"}]},{"type":"text","text":" or "},{"type":"link","attrs":{"href":"https:\/\/www.unite.ai\/st-nerf-compositing-and-editing-for-video-synthesis\/","title":null,"type":null},"content":[{"type":"text","text":"closed indoor scenes"}]},{"type":"text","text":"."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Unlike most earlier approaches, Mip-NeRF 360 sets limits on how light rays are interpreted and establishes bounded regions of attention. This cuts down otherwise lengthy training times and lets it handle extended backgrounds and “unbounded” elements such as the sky."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The new paper is titled “"},{"type":"link","attrs":{"href":"https:\/\/arxiv.org\/pdf\/2111.12077.pdf","title":null,"type":null},"content":[{"type":"text","text":"Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields"}]},{"type":"text","text":"” and was led by Jon Barron, a senior research scientist at Google Research."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To understand the paper's technical advances, it helps to start with a basic explanation of how NeRF-based image generation works."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"What Is NeRF?"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A NeRF network does not really describe a video. Instead, it stitches a scene together from multiple viewpoints taken from individual photos and video frames, which makes it closer to a fully 3D virtual environment realized by AI. Technically, the scene exists only in the latent space of a "},{"type":"link","attrs":{"href":"https:\/\/www.unite.ai\/what-is-machine-learning\/","title":null,"type":null},"content":[{"type":"text","text":"machine learning"}]},{"type":"text","text":" algorithm, but any number of viewpoints and videos can be extracted from it."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/65\/d5\/65f420c57747yy05308cb8038469b2d5.gif","alt":null,"title":"Figure 1: Multiple camera capture points (left); NeRF takes these capture points and stitches them into a neural scene (right).","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The information in the input photos is trained into a matrix similar to the "},{"type":"link","attrs":{"href":"http:\/\/www.open3d.org\/docs\/release\/python_api\/open3d.geometry.VoxelGrid.html","title":null,"type":null},"content":[{"type":"text","text":"voxel grids"}]},{"type":"text","text":" of a traditional CGI workflow. Every point in 3D space is assigned a value in this matrix, which makes the scene navigable."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/83\/f4\/8377b4caed4e5d866591d6f1319c3df4.jpeg","alt":null,"title":"Figure 2: An example voxel matrix, in which pixel information is stored in three-dimensional space. Pixels are normally represented in two dimensions, as in the pixel grid of a JPEG file. Image source: ResearchGate.","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After computing any interstitial space needed between the photos, the method effectively “ray-traces” the path of every possible pixel of every contributing photo and assigns it a color value and a transparency value. Without the transparency value, the neural matrix would be either completely opaque or completely empty."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A NeRF matrix differs from a CGI-based 3D coordinate space but resembles a voxel grid in that “closed” objects have no interior representation. For example, a drum kit in CGI can be split open to inspect its inside, whereas in NeRF the drum kit effectively ceases to exist once the opacity value of its surface reaches 1."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"A Wider View of a Pixel"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Mip-NeRF 360 extends "},{"type":"link","attrs":{"href":"https:\/\/jonbarron.info\/mipnerf\/","title":null,"type":null},"content":[{"type":"text","text":"research published in March 2021"}]},{"type":"text","text":". That work introduced Mip-NeRF, which brings efficient anti-aliasing to NeRF and avoids excessive supersampling."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"NeRF normally computes only a single path per pixel, which tends to produce the “"},{"type":"link","attrs":{"href":"https:\/\/www.dell.com\/community\/Laptops-General-Read-Only\/Jaggies-on-IE-Web-Graphics\/td-p\/1822273","title":null,"type":null},"content":[{"type":"text","text":"jaggies"}]},{"type":"text","text":"” characteristic of early internet image formats and "},{"type":"link","attrs":{"href":"https:\/\/www.ign.com\/articles\/2000\/07\/01\/ps2-aliased-no-more","title":null,"type":null},"content":[{"type":"text","text":"games systems"}]},{"type":"text","text":". The usual way to remove jagged edges is to sample adjacent pixels and use an averaged representation."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Where traditional NeRF samples only a single path per pixel, Mip-NeRF uses a “conical” catchment area, like a wide-beam flashlight, that gathers enough information about neighboring pixels to deliver low-cost anti-aliasing with improved detail."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/f9\/c1\/f98d3fcfcd94f23b42f292ff1c7aa9c1.jpeg","alt":null,"title":"Figure 3: The “conical” catchment area used by Mip-NeRF is sliced into conical frustums (lower image), which are then blurred to create Gaussian spaces used to calculate the accuracy and aliasing of a pixel. Image source: https:\/\/www.youtube.com\/watch?v=EpH175PY1A0","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The method is a marked improvement over the standard NeRF implementation, as the following figure shows:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/36\/8d\/362217282c11851f257ffbb4c09d9e8d.jpeg","alt":null,"title":"Figure 4: Mip-NeRF (right), published in March 2021. It improves detail through a more comprehensive yet cheaper anti-aliasing pipeline, rather than simply blurring pixels to avoid jagged edges. Image source: https:\/\/jonbarron.info\/mipnerf\/","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Unbounded NeRF"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Mip-NeRF still left three problems unsolved. First, it needs to work in unbounded environments such as the sky, which may contain extremely distant objects. Mip-NeRF 360 addresses this by applying a "},{"type":"link","attrs":{"href":"https:\/\/asmedigitalcollection.asme.org\/fluidsengineering\/article-abstract\/82\/1\/35\/397706\/A-New-Approach-to-Linear-Filtering-and-Prediction?redirectedFrom=fulltext","title":null,"type":null},"content":[{"type":"text","text":"Kalman-style warp"}]},{"type":"text","text":" to the Mip-NeRF Gaussians."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Second, larger scenes demand more processing power and longer training times. To address this, Mip-NeRF 360 uses a small “proposal” "},{"type":"link","attrs":{"href":"https:\/\/www.unite.ai\/what-are-neural-networks\/","title":null,"type":null},"content":[{"type":"text","text":"multi-layer perceptron (MLP)"}]},{"type":"text","text":" to “distill” the scene geometry. The proposal MLP pre-bounds the geometry predicted by the large standard NeRF MLP, which speeds up training by a factor of three."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Third, larger scenes tend to make the discretization of the interpreted geometry ambiguous, which produces the kind of “screen tearing” artifacts that gamers may know well. Mip-NeRF 360 addresses this by introducing a new regularizer for Mip-NeRF ray intervals."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/16\/34\/1691d8a4bd95a647c6b84d9493862334.jpeg","alt":null,"title":"Figure 5: On the right, Mip-NeRF struggles to bound a scene of this scale and produces unwanted artifacts. On the left, the new regularizer optimizes the scene well enough to remove these disturbances.","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Original article:"},{"type":"text","text":" "},{"type":"link","attrs":{"href":"https:\/\/www.unite.ai\/neural-rendering-nerf-takes-a-walk-in-the-fresh-air\/","title":null,"type":null},"content":[{"type":"text","text":"Neural Rendering: NeRF Takes a Walk in the Fresh Air"}]}]}]}
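
As a rough illustration of the color and transparency assignment described in the section on what NeRF is, the sketch below alpha-composites samples along a single ray in plain Python. It follows the standard volume-rendering formulation commonly used with NeRF and is not code from the paper; the function name `composite_ray` and all sample values are hypothetical.

```python
import math


def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along one ray, front to back.

    densities: non-negative volume density at each sample (hypothetical values)
    colors:    (r, g, b) tuple at each sample
    deltas:    length of the ray segment each sample covers
    Returns the accumulated pixel color and the remaining transmittance.
    """
    transmittance = 1.0  # fraction of light not yet absorbed along the ray
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha          # contribution to the pixel
        for k in range(3):
            pixel[k] += weight * rgb[k]
        transmittance *= 1.0 - alpha
    return tuple(pixel), transmittance


# Empty space, then a fully opaque red surface, then a blue sample behind it.
pixel, remaining = composite_ray(
    densities=[0.0, 1e9, 5.0],
    colors=[(0, 1, 0), (1, 0, 0), (0, 0, 1)],
    deltas=[1.0, 1.0, 1.0],
)
print(pixel, remaining)  # (1.0, 0.0, 0.0) 0.0
```

Once the opacity of the red surface reaches 1, the transmittance drops to 0 and the blue sample behind it contributes nothing, which mirrors the point that a closed object has no interior representation.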