Unity Performance Optimization: Draw Calls


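In Unity, each time the engine prepares data and notifies the GPU to draw is called a draw call. This happens object by object, and for each object the cost is not only the GPU rendering itself: having the engine set up the material/shader again is also a very expensive operation. The number of draw calls per frame is therefore a very important performance metric. On iOS, try to keep it within 20 per frame; the value is shown in the editor's Statistics window.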

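Unity has built-in Draw Call Batching. As the name suggests, its main goal is to process multiple objects in a single draw call. As long as objects have the same transform and material, the GPU can process them in exactly the same way, so they can be placed in one draw call. The core of Draw Call Batching is that, after visibility testing, the engine checks the materials of all objects to be drawn, groups those with the same material into a batch, and then combines each batch into a single object (with a unified transform). Multiple objects can then be handled in one draw call (in practice, as one combined object).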
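The grouping step can be sketched in a few lines of Python. This is only an illustration of the idea, not Unity's implementation; the function name `group_into_batches` and the object and material names are made up for the example.

```python
from collections import defaultdict

def group_into_batches(visible_objects):
    """Group (object_name, material_name) pairs by material.

    Each resulting group stands for one batch, i.e. one draw call.
    """
    batches = defaultdict(list)
    for name, material in visible_objects:
        batches[material].append(name)
    return dict(batches)

# Hypothetical list of objects that already passed the visibility test.
scene = [
    ("crate_a", "wood"),
    ("crate_b", "wood"),
    ("barrel", "metal"),
    ("crate_c", "wood"),
]

batches = group_into_batches(scene)
print(len(scene), "objects ->", len(batches), "draw calls")
```

Without batching, the four objects would cost four draw calls; grouped by material they cost two.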

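Draw Call Batching has a drawback, however: it has to combine all objects in a batch, which amounts to creating one object as large as all of them put together, and memory of that size must be allocated. This costs not only memory but also CPU time. Moving objects in particular have to be recombined every frame, so there is a trade-off to weigh, otherwise the cost outweighs the gain. For objects that never move, the combination is done only once and reused from then on, which is far more efficient.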
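A rough way to see the trade-off is to count how many vertices have to be copied into the combined object over time. The sketch below is a simplified cost model with made-up mesh sizes; `vertices_copied` is a hypothetical helper, not an engine function.

```python
def vertices_copied(mesh_vertex_counts, frames, moving):
    """Count vertices copied while combining a batch.

    Moving objects are recombined every frame; static objects only once.
    """
    per_combine = sum(mesh_vertex_counts)
    return per_combine * frames if moving else per_combine

meshes = [100, 200, 50]          # hypothetical vertex counts of three objects
dynamic_cost = vertices_copied(meshes, frames=60, moving=True)
static_cost = vertices_copied(meshes, frames=60, moving=False)
print("dynamic:", dynamic_cost, "static:", static_cost)
```

In both cases the combined object needs room for all 350 vertices; the difference is how often that work is repeated.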

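Unity provides two modes, Dynamic Batching and Static Batching. Dynamic Batching is fully automatic; it needs no intervention and allows none. Movable objects with up to 300 vertices are batched as long as they use the same material. Static Batching requires marking the stationary objects as Static, after which they are batched regardless of size. As noted above, Static Batching is clearly much more efficient than Dynamic Batching, and so, of course, Static Batching is a paid feature...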

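To use Draw Call Batching effectively, first reduce the number of materials used in the scene, that is, share materials wherever possible; materials that differ only in their textures can have those textures combined into one larger texture (known as texture atlasing). Next, mark objects that will not move as Static. You can also combine objects manually with the CombineChildren script (Standard Assets/Scripts/Unity Scripts/CombineChildren), but this script affects visibility testing: the combined objects are always treated as one object, which increases the amount of geometry the GPU has to process, so use it with care.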

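For complex static scenes, you can also consider designing your own occlusion culling algorithm; reducing the number of visible objects also reduces draw calls.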

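In short, the real key is to understand how draw calls and Draw Call Batching work and to design a scheme suited to your scene that minimizes the number of draw calls. The same goes for other areas of optimization.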

Draw Call Batching


To draw an object on the screen, the engine has to issue a draw call to the graphics API (OpenGL ES in the case of iOS). Every single draw call requires a significant amount of work on the part of the graphics API, causing significant performance overhead on the CPU side.



Unity combines a number of objects at runtime and draws them together with a single draw call. This operation is called "batching". The more objects Unity can batch together, the better rendering performance you will get.



Built-in batching support in Unity has a significant benefit over simply combining geometry in the modeling tool (or using the CombineChildren script from the Standard Assets package). Batching in Unity happens after the visibility determination step. The engine does culling on each object individually, and the amount of rendered geometry is going to be the same as without batching. Combining geometry in the modeling tool, on the other hand, prevents efficient culling and results in a much higher amount of geometry being rendered.
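The difference can be illustrated by counting triangles. The sketch below assumes a simplified model in which a pre-combined mesh is culled as a whole, so it is drawn in full whenever any part of it is visible; the triangle counts and function names are hypothetical.

```python
def triangles_with_builtin_batching(objects):
    """Each object is culled individually before batching."""
    return sum(tris for tris, visible in objects if visible)

def triangles_with_precombined_mesh(objects):
    """The merged mesh is one object: visible as a whole or not at all."""
    if any(visible for _, visible in objects):
        return sum(tris for tris, _ in objects)
    return 0

# (triangle_count, is_visible) for four hypothetical props
props = [(500, True), (800, False), (300, False), (400, True)]

batched = triangles_with_builtin_batching(props)
precombined = triangles_with_precombined_mesh(props)
print("built-in batching:", batched, "pre-combined:", precombined)
```

With per-object culling only the two visible props are rendered; the pre-combined mesh sends all four to the GPU.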





Only objects sharing the same material can be batched together. Therefore, if you want to achieve good batching, you need to share as many materials among different objects as possible.



If you have two identical materials which differ only in textures, you can combine those textures into a single big texture, a process often called texture atlasing. Once the textures are in the same atlas, you can use a single material instead.
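When textures move into an atlas, the UV coordinates of each mesh have to be remapped into the sub-rectangle its texture now occupies. A minimal sketch, assuming a square grid atlas with equally sized tiles, no padding, and UVs that stay within [0, 1] (the helper `remap_uv` is hypothetical):

```python
def remap_uv(u, v, tile_x, tile_y, tiles_per_side):
    """Map a UV from the original texture into its tile of a grid atlas.

    tile_x / tile_y are the tile's column and row, counted from 0.
    """
    scale = 1.0 / tiles_per_side
    return (tile_x + u) * scale, (tile_y + v) * scale

# A 2x2 atlas: the original texture's centre, placed in column 1, row 0.
atlas_uv = remap_uv(0.5, 0.5, tile_x=1, tile_y=0, tiles_per_side=2)
print(atlas_uv)
```

Meshes whose UVs tile beyond [0, 1] do not fit this simple remapping and would need separate handling.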



If you need to access shared material properties from scripts, then it is important to note that modifying Renderer.material will create a copy of the material. Instead, you should use Renderer.sharedMaterial to keep the material shared.
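The effect of that copy can be shown with a toy model. The classes below are not the Unity API; `ToyRenderer` is a hypothetical stand-in that only mimics the behaviour described above, where going through `material` gives the renderer a private copy while `shared_material` leaves the material shared.

```python
import copy

class Material:
    def __init__(self, color):
        self.color = color

class ToyRenderer:
    """Toy stand-in for a renderer; NOT Unity code."""

    def __init__(self, shared_material):
        self.shared_material = shared_material
        self._owns_instance = False

    @property
    def material(self):
        # First access clones the shared material for this renderer only.
        if not self._owns_instance:
            self.shared_material = copy.copy(self.shared_material)
            self._owns_instance = True
        return self.shared_material

shared = Material("red")
a, b = ToyRenderer(shared), ToyRenderer(shared)

same_before = a.shared_material is b.shared_material
a.material.color = "blue"          # goes through the copying accessor
same_after = a.shared_material is b.shared_material
print(same_before, same_after, b.shared_material.color)
```

After the write, the two renderers no longer reference the same material object, which is exactly the situation that prevents them from being batched together.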



Dynamic Batching


Unity can automatically batch moving objects into the same draw call if they share the same material.



Dynamic batching is done automatically and does not require any additional effort on your side.





1、      Batching dynamic objects has a certain overhead per vertex, so batching is applied only to meshes containing fewer than 900 vertex attributes in total.



2、      If your shader is using Vertex Position, Normal and single UV, then you can batch up to 300 verts, and if your shader is using Vertex Position, Normal, UV0, UV1 and Tangent, then only 180 verts. Please note: the attribute count limit might be changed in the future.





4、      Don't use scale. Objects with scale (1,1,1) and (2,2,2) won't batch.



5、      Uniformly scaled objects won't be batched with non-uniformly scaled ones.


           Objects with scale (1,1,1) and (1,2,1) won't be batched. On the other hand (1,2,1) and (1,3,1) will be.



6、     Using different material instances will cause batching to fail.



7、     Objects with lightmaps have an additional (hidden) material parameter: offset/scale in lightmap, so lightmapped objects won't be batched (unless they point to the same portions of the lightmap).




8、     Multi-pass shaders will break batching. E.g. almost all Unity shaders support several lights in forward rendering, effectively doing an additional pass for them.



9、     Instances of a prefab automatically use the same mesh and material.
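Several of the rules above reduce to simple checks. The sketch below encodes the vertex attribute budget (900 attributes, so 300 vertices with 3 attributes each and 180 with 5), the material rule, and one reading of the scale rules that is consistent with the three examples given. It treats the budget as inclusive to match "up to 300 verts". The dictionaries and function names are hypothetical, and lightmaps and multi-pass shaders are left out.

```python
VERTEX_ATTRIBUTE_LIMIT = 900  # figure quoted in the list; may change between versions

def max_batchable_vertices(attributes_per_vertex):
    """Largest vertex count that stays within the attribute budget."""
    return VERTEX_ATTRIBUTE_LIMIT // attributes_per_vertex

def scale_key(scale):
    """Assumed reading: uniform scales must match; non-uniform ones share a class."""
    sx, sy, sz = scale
    return ("uniform", sx) if sx == sy == sz else ("non-uniform",)

def can_batch(a, b):
    """a and b are hypothetical dicts: material, scale, vertices, attributes."""
    within_budget = all(
        o["vertices"] <= max_batchable_vertices(o["attributes"]) for o in (a, b)
    )
    return (
        within_budget
        and a["material"] == b["material"]
        and scale_key(a["scale"]) == scale_key(b["scale"])
    )

def obj(scale, material="rock", vertices=120, attributes=3):
    return {"material": material, "scale": scale,
            "vertices": vertices, "attributes": attributes}

print(max_batchable_vertices(3), max_batchable_vertices(5))
print(can_batch(obj((1, 1, 1)), obj((2, 2, 2))))
print(can_batch(obj((1, 2, 1)), obj((1, 3, 1))))
```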



Static Batching



Static batching, on the other hand, allows the engine to reduce draw calls for geometry of any size (provided it does not move and shares the same material). Static batching is significantly more efficient than dynamic batching. You should choose static batching as it will require less CPU power.



In order to take advantage of static batching, you need to explicitly specify that certain objects are static and will not move, rotate or scale in the game. To do so, you can mark objects as static using the Static checkbox in the Inspector.



Using static batching will require additional memory for storing the combined geometry. If several objects shared the same geometry before static batching, then a copy of geometry will be created for each object, either in the Editor or at runtime. This might not always be a good idea - sometimes you will have to sacrifice rendering performance by avoiding static batching for some objects to keep a smaller memory footprint. For example, marking trees as static in a dense forest level can have serious memory impact.
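The memory cost is easy to estimate. The sketch below compares instances that share one mesh with the same instances after static batching, where each gets its own copy of the geometry. The tree count, vertex count, and bytes per vertex are made-up numbers for illustration.

```python
def geometry_bytes(instances, vertices_per_mesh, bytes_per_vertex, static_batched):
    """Estimate geometry memory: one shared mesh, or one copy per instance."""
    copies = instances if static_batched else 1
    return copies * vertices_per_mesh * bytes_per_vertex

# Hypothetical forest: 1000 trees sharing a 500-vertex mesh, 32 bytes per vertex.
shared = geometry_bytes(1000, 500, 32, static_batched=False)
batched = geometry_bytes(1000, 500, 32, static_batched=True)
print("shared mesh:", shared, "bytes; static batched:", batched, "bytes")
```

Under these assumptions the geometry grows from about 16 KB to about 16 MB, which is why a dense forest may be better left unbatched.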



Static batching is only available in Unity iOS Advanced.







Alpha blending

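According to the official Unity documentation, for hardware reasons alpha testing carries a large performance cost on iOS devices, and alpha blending should be used instead where possible. It is also pointed out here that the number of alpha-blended faces on screen at once, and especially the screen area they cover, has a large impact on performance too. The reason is that alpha-blended faces increase overdraw, which hurts low-end devices in particular. That said, if you have not bought the Pro version and so have no Occlusion Culling, there is no point worrying about this; overdraw is unavoidable anyway.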
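Overdraw can be expressed as the average number of times each screen pixel is written. A minimal sketch, assuming the covered area of every drawn layer is known in pixels (the screen size and layer areas are hypothetical):

```python
def overdraw_factor(layer_areas_px, screen_width, screen_height):
    """Average number of times a screen pixel is drawn."""
    return sum(layer_areas_px) / (screen_width * screen_height)

# A 480x320 screen: one opaque full-screen layer plus two half-screen blended layers.
factor = overdraw_factor([153600, 76800, 76800], 480, 320)
print(factor)
```

Each large alpha-blended face adds its full screen area to the total, since blended pixels behind and in front of it still have to be drawn.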

Complex per-pixel shaders

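A per-pixel shader is a fragment shader: as the name implies, it runs for every pixel rendered to the screen. If a per-pixel shader is fairly complex and has many pixels to process, that is, when the faces using it cover a large part of the screen, its performance impact can exceed even that of alpha blending. Complex per-pixel shaders are therefore suitable only for small objects.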


Environment specular maps (Shader Virtual Gloss Per Vertex Additive)

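A specular map usually uses the texture's alpha channel to define how glossy (reflective) the surface is. What sets this shader apart is that it computes specular reflection per vertex, which looks quite good while performing far better than a per-pixel shader. It is well suited to models that cover large areas, such as level environments.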
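The saving comes from where the expensive term is evaluated. The sketch below uses a generic power-based specular term along a single edge; it is not the shader's actual code, and all names and values are assumptions. The per-vertex version evaluates the term twice and interpolates the result, while the per-pixel version evaluates it at every pixel.

```python
def specular(n_dot_h, shininess):
    """Generic specular term: a power of the clamped N.H value."""
    return max(0.0, n_dot_h) ** shininess

def per_vertex_edge(n_dot_h_a, n_dot_h_b, shininess, pixels):
    """Evaluate at the two end vertices, then interpolate the results."""
    a, b = specular(n_dot_h_a, shininess), specular(n_dot_h_b, shininess)
    return [a + (b - a) * i / (pixels - 1) for i in range(pixels)]

def per_pixel_edge(n_dot_h_a, n_dot_h_b, shininess, pixels):
    """Interpolate the input, then evaluate the term at every pixel."""
    return [
        specular(n_dot_h_a + (n_dot_h_b - n_dot_h_a) * i / (pixels - 1), shininess)
        for i in range(pixels)
    ]

vertex_lit = per_vertex_edge(1.0, 0.0, shininess=8, pixels=5)
pixel_lit = per_pixel_edge(1.0, 0.0, shininess=8, pixels=5)
print(vertex_lit)
print(pixel_lit)
```

The two results agree at the vertices and differ in between: the per-vertex highlight is smoother and less accurate, but the costly power is computed once per vertex instead of once per pixel.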

Optimized dynamic character lighting and shadows (Light probes and BRDF Shader)

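Traditional lightmaps cannot support dynamic objects. For this Unity provides light probes: lighting information for dynamic objects is stored ahead of time in proxy objects (the light probes), and at runtime a dynamic object picks up its lighting from the nearest probe.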
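The lookup described above can be sketched as a nearest-neighbour search. This follows the simplified description in the text; an actual engine may blend between several surrounding probes rather than pick a single one. The probe data and the helper `nearest_probe` are hypothetical.

```python
import math

def nearest_probe(position, probes):
    """Return the probe closest to position (Euclidean distance)."""
    return min(probes, key=lambda probe: math.dist(position, probe["position"]))

# Hypothetical baked probes: a position and an RGB light value each.
probes = [
    {"position": (0.0, 0.0, 0.0), "light": (1.0, 0.9, 0.8)},
    {"position": (10.0, 0.0, 0.0), "light": (0.2, 0.2, 0.4)},
]

character_position = (7.0, 0.0, 1.0)
light = nearest_probe(character_position, probes)["light"]
print(light)
```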

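Unity itself also provides a very good-looking character shader optimized for mobile devices. It supports diffuse, specular, and normal maps, and uses a texture generated by a special script to imitate BRDF lighting. The result is comparable to the character lighting in next-gen AAA titles.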

Fog and volumetric light (Shader Blinking Godrays)

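Enabling true fog on mobile devices is essentially not feasible at present. ShadowGun's approach is to simulate fog with simple meshes plus transparent textures (called fog planes). As the player approaches, the fog plane gradually fades out, and at the same time its vertices move away (even fully transparent alpha faces cost a lot of rendering time).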


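  • The vertex alpha value determines whether the vertex may move (in the example, 0 means fixed and 1 means movable).
  • The vertex normal determines the direction of movement.
  • The shader then controls the fade-in/fade-out of the fog plane by computing the distance to the viewer.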
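The three points above can be combined into a small per-vertex function. This is a Python sketch of the logic only, not ShadowGun's shader; the parameters `fade_start`, `fade_end`, and `push` and the linear fade are assumptions made for the example.

```python
import math

def clamp01(x):
    return max(0.0, min(1.0, x))

def fog_vertex(position, normal, vertex_alpha, viewer, fade_start, fade_end, push):
    """Return (new_position, opacity) for one fog-plane vertex.

    Opacity fades linearly from 0 at fade_start to 1 at fade_end.
    Movable vertices (alpha 1) slide along their normal as the plane fades out.
    """
    distance = math.dist(position, viewer)
    opacity = clamp01((distance - fade_start) / (fade_end - fade_start))
    offset = vertex_alpha * (1.0 - opacity) * push
    moved = tuple(p + n * offset for p, n in zip(position, normal))
    return moved, opacity

vertex, normal = (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)
near_movable = fog_vertex(vertex, normal, 1.0, (0.0, 0.0, 2.0), 2.0, 10.0, 3.0)
near_fixed = fog_vertex(vertex, normal, 0.0, (0.0, 0.0, 2.0), 2.0, 10.0, 3.0)
far_movable = fog_vertex(vertex, normal, 1.0, (0.0, 0.0, 20.0), 2.0, 10.0, 3.0)
print(near_movable, near_fixed, far_movable)
```

With the viewer close, the movable vertex is pushed fully along its normal and is transparent; the fixed vertex stays in place; with the viewer far away, the plane is opaque and nothing moves.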


Thick smoke from the crashed plane (Shader Scroll 2 Layers Sine Alpha-blended)


Animated skybox (Shader Scroll 2 Layers Multiplicative)


Fluttering flags and cloth (Shader Lightmap + Wind)



