[Unity3D]總結使用Unity 3D優化遊戲運行性能的經驗

作者:Amir Fassihi

流暢的遊戲玩法來自流暢的幀率,而我們即將推出的動作平臺遊戲《Shadow Blade》已經將在標準iPhone和iPad設備上實現每秒60幀視爲一個重要目標。

以下是我們在緊湊的優化過程中提升遊戲運行性能,並實現目標幀率時需要考慮的事項。

當基本遊戲功能到位時,就要確保遊戲運行表現能夠達標。我們衡量遊戲運行表現的一個基本工具是Unity內置分析器以及Xcode分析工具。使用Unity分析器來分析設備上的運行代碼真是一項寶貴的功能。

我們總結了在這段爲使目標設備幀率穩定在60fps而反覆衡量、調整、再衡量的過程中獲得的相關經驗。

shadow blade(from deadmage.com)

一、與名爲“垃圾回收器”(Garbage Collector,無用單元收集程序,以下簡稱GC)的兇猛怪獸正面交鋒

由於具有C/C++遊戲編程背景,我們並不習慣垃圾回收器的特定行爲。自動清理不再使用的內存,這種做法在剛開始時很好,但很快你就會發現分析器中經常出現CPU負荷尖峯,原因是垃圾回收器正在回收垃圾內存。這對移動設備來說尤其是個大問題。追查內存分配並儘量消除它們成了我們的頭等大事,以下是我們採取的主要操作:

1.移除代碼中的任何字符串連接,因爲這會給GC留下大量垃圾。

2.用簡單的“for”循環代替“foreach”循環。由於某些原因,每個“foreach”循環的每次迭代會生成24字節的垃圾內存。一個簡單的循環迭代10次就可以留下240字節的垃圾內存。

3.更改我們檢查遊戲對象標籤的方法。用“if (go.CompareTag(“Enemy”))”來代替“if (go.tag == “Enemy”)”。訪問對象的tag屬性會分配並拷貝額外內存,如果這種檢查位於內層循環中,後果會非常糟糕。

4.對象庫(object pool)很棒。我們爲所有動態遊戲對象製作並使用對象庫,這樣在關卡運行期間不會動態分配任何東西,不需要的對象都會被回收到庫中。

5.不使用LINQ命令,因爲它們往往會分配中間緩衝區,而這很容易生成垃圾內存。

二、謹慎處理高級腳本和本地引擎C++代碼之間的通信開銷。

所有使用Unity3D編寫的遊戲玩法代碼都是腳本代碼,在我們的項目中是由Mono運行時處理的C#代碼。任何與引擎數據通信的需求,都要從高級腳本語言調用本地引擎代碼。這當然會產生它自己的開銷,因此儘量減少遊戲代碼中的這類調用就成了我們的第二要務。

1.在場景中移動對象需要從腳本代碼調用引擎代碼,因此我們在遊戲玩法代碼中緩存某一對象在一幀內的變換需求,並且每幀只向引擎發送一次請求,以便減少調用開銷。這種模式也用在了其他類似的地方,而不僅限於移動和旋轉對象。

2.將引用本地緩存到元件中會減少每次在一個遊戲對象中使用 “GetComponent” 獲取一個元件引用的需求,這是調用本地引擎代碼的另一個例子。

三、物理效果

1.將物理模擬時間步設置爲儘可能小的值。在我們的項目中,無法將它設置到低於16毫秒。

2.減少角色控制器移動命令的調用。移動角色控制器會同步發生,每次調用都會耗損極大的性能。我們的做法是緩存每幀的移動請求,並且僅運用一次。

3.修改代碼以免依賴“ControllerColliderHit”回調函數。事實證明這些回調函數的處理速度並不快。

4.面對性能更弱的設備,要用skinned mesh代替physics cloth。cloth參數在運行表現中發揮重要作用,如果你肯花些時間找到美學與運行表現之間的平衡點,就可以獲得理想的結果。

5.在物理模擬過程中不要使用ragdolls,只有在必要時才讓它生效。

6.要謹慎評估觸發器的“onInside”回調函數,在我們的項目中,我們儘量在不依賴它們的情況下模擬邏輯。

7.使用層次而不是標籤。我們可以輕鬆爲對象分配層次和標籤,並查詢特定對象,但是涉及碰撞邏輯時,層次至少在運行表現上會更有明顯優勢。更快的物理計算和更少的無用分配內存是使用層次的基本原因。

8.千萬不要使用Mesh碰撞器(mesh collider)。

9.最小化碰撞檢測請求(例如ray casts和sphere checks),儘量從每次檢查中獲得更多信息。

四、讓AI代碼更迅速

我們使用AI敵人來阻攔忍者英雄,並同其過招。以下是與AI性能問題有關的一些建議:

1.AI邏輯(例如能見度檢查等)會生成大量物理查詢。可以將AI更新循環的頻率設置得遠低於圖形更新循環,以減少CPU負荷。

五、最佳性能表現根本就不是來自代碼!

什麼都不發生的時候,性能自然良好。這是我們關閉一切當下不必要之物的基本原則。我們的項目是一個橫向卷軸動作遊戲,所以許多動態關卡物體在場景中不可見時就可以關閉。

1.使用自定義的細節層次(level of detail)方案,在敵人距離較遠時關閉其AI。

2.移動平臺、障礙物及其物理碰撞器在距離較遠時也會關閉。

3.使用Unity內置的“動畫剔除”(animation culling)系統關閉未被渲染對象的動畫。

4.所有關卡內的粒子系統也可以使用同樣的禁用機制。

六、回調函數!那麼空白的回調函數呢?

要儘量減少Unity回調函數。即使是空白的回調函數也存在性能損失。空白的回調函數沒有存在的理由,但在大量代碼重寫和重構的過程中,它們有時會被遺留在代碼庫中。

七、讓美術人員來救場

在程序員抓耳撓腮,絞盡腦汁去想該如何讓每秒運行更多幀時,美術人員總能神奇地派上大用場。

1.讓遊戲對象共享材質,並在Unity中將它們設爲靜態(static),可以讓它們被合併批處理,由此減少的繪製調用是實現良好移動端運行性能的關鍵。

2.紋理地圖集對UI元素來說尤其有用。

3.使用方形、尺寸爲2的冪的紋理,並配合合適的壓縮,是必不可少的步驟。

4.由於這是一款橫向卷軸遊戲,我們的美術人員得以移除所有遠處背景的網格,並將其轉化爲簡單的2D平面。

5.光照圖非常有價值。

6.我們的美術人員分幾輪移除了多餘頂點。

7.使用合理的紋理mip級別是一個好主意(遊戲邦注:要讓不同分辨率的設備呈現良好的幀率時尤其如此)。

8.合併網格是美術人員可以發揮作用的另一個操作。

9.我們的動畫師盡力讓不同角色共享動畫。

10.要找到美學/性能之間的平衡,就免不了許多粒子效果的迭代。減少發射器數量並儘量減少透明度需求也是一大挑戰。

八、要減少內存使用

使用大內存當然會對性能產生負面影響,但在我們的項目中,我們的iPod由於超過內存上限而遭遇了多次崩潰事件。我們的遊戲中最耗內存的是紋理。

1.不同設備要使用不同的紋理大小,尤其是UI和大型背景中的紋理。《Shadow Blade》使用的是通用版本(universal build),但會在啓動時檢測設備尺寸和分辨率,並據此載入不同資產。

2.我們要確保未使用的資產不會載入內存。我們在項目後期才發現,任何僅被一個預製件實例引用、卻從未被實例化的資產,都會完整載入內存。

3.去除網格中的額外多邊形也能實現這一點。

4.我們多次重新設計了一些資產的生命週期管理。例如,調整主菜單資產、關卡結束資產或遊戲音樂的加載/卸載時機。

5.每個關卡都要有根據其動態對象需求而量身定製的特定對象庫,並根據最小內存需求來優化。對象庫可以靈活一點,在開發過程中包含大量對象,但知道遊戲對象需求後就要具體一點。

6.保持聲音文件在內存的壓縮狀態也是必要之舉。

加強遊戲運行性能是一個漫長而具有挑戰性的過程,遊戲開發社區所分享的大量知識,以及Unity提供的出色分析工具爲《Shadow Blade》實現目標運行性能提供了極大幫助。(本文爲遊戲邦/gamerboom.com編譯,拒絕任何不保留版權的轉載,如需轉載請聯繫:遊戲邦

“0 – 60 fps in 14 days!” What we learned trying to optimize our game using Unity3D.

by Amir Fassihi

The following blog post, unless otherwise noted, was written by a member of Gamasutra’s community.

The thoughts and opinions expressed are those of the writer and not Gamasutra or its parent company.

Smooth gameplay is built upon the foundations of a smooth frame rate, and hitting the 60 frames per second target on the standard iPhone and iPad devices was a significant goal during the development of our upcoming action platformer game, Shadow Blade. (http://shadowblade.deadmage.com)

The following is a summary of the things we had to consider and change in the game in order to increase performance and reach the targeted frame rate during the intense optimization sessions.

Once the basic game functionalities were in place, it was time to make sure the game performance would meet its target. Our main tool for measuring the performance was the built-in Unity profiler and the Xcode profiling tools. Being able to profile the running code on the device using the Unity profiler proved to be an invaluable feature.

So here is our summary of what we learned from this intense measuring, tweaking and re-measuring journey, which paid off well in the end and resulted in a fixed 60fps on our target devices.

1 – Head to head with a ferocious monster called the Garbage Collector.

Coming from a C/C++ game programming background, we were not used to the specific behaviors of the garbage collector. Making sure your unused memory is cleaned up automatically for you is nice at first but soon the reality kicks in and you witness regular spikes in your profiler showing the CPU load caused by the garbage collector doing what it is supposed to do, collecting the garbage memory. This proved to be a huge issue specifically for the mobile devices. Chasing down memory allocations and trying to eliminate them became priority number one and here are some of the main actions we took:

Remove any string concatenation in code since this leaves a lot of garbage for the GC to collect.

Replace the “foreach” loops with simple “for” loops. For some reason, every iteration of every “foreach” loop generated 24 bytes of garbage memory. A simple loop iterating 10 times left 240 bytes of memory ready to be collected, which was just unacceptable.

Replace the way we checked for game object tags. Instead of “if (go.tag == “Enemy”)” we used “if (go.CompareTag(“Enemy”))”. Calling the tag property on an object allocates and copies additional memory, and this is really bad if such a check resides in an inner loop.

Object pools are great. We made and used pools for all dynamic game objects so that nothing is ever allocated dynamically in the middle of a level, and everything is recycled back to the pool when not needed.

Not using LINQ commands since they tended to allocate intermediate buffers, food for the GC.
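The pooling idea from the list above can be sketched in plain C#. This is a minimal illustration under stated assumptions, not the pool used in Shadow Blade: the class name `SimplePool` and the `factory` callback are hypothetical, and the sketch deliberately has no Unity dependency so it compiles on its own. In a Unity project the factory would typically instantiate a prefab and `Release` would deactivate the object.

```csharp
using System;
using System.Collections.Generic;

// Minimal fixed-size object pool (hypothetical name). All instances are
// created up front, so Get/Release do not allocate while a level runs.
public sealed class SimplePool<T> where T : class
{
    private readonly Stack<T> _free;

    // 'factory' is an assumed callback that builds one instance.
    public SimplePool(int capacity, Func<T> factory)
    {
        if (factory == null) throw new ArgumentNullException("factory");
        _free = new Stack<T>(capacity);
        for (int i = 0; i < capacity; i++)   // plain 'for', no enumerator involved
        {
            _free.Push(factory());
        }
    }

    // Number of instances currently available.
    public int FreeCount { get { return _free.Count; } }

    // Returns a pooled instance, or null when the pool is exhausted.
    public T Get()
    {
        return _free.Count > 0 ? _free.Pop() : null;
    }

    // Hands an instance back so it can be reused later.
    public void Release(T item)
    {
        if (item == null) throw new ArgumentNullException("item");
        _free.Push(item);
    }
}
```

Returning null on exhaustion instead of growing is a design choice made here to keep runtime allocations at zero; a pool that grows on demand is friendlier during development but hides undersized pools.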

2 – Careful with the communication overhead between high level scripts and native engine C++ code.

All gameplay code written for a game using Unity3D is script code which in our case was C# that was handled using the Mono runtime. Any requirements to communicate with the engine data would require a call into the native engine code from the high level scripting language. This of course has its own overhead and trying to reduce such calls in game code was the second priority.

Moving objects around in the scene requires calls from the script code to the engine code. We ended up caching the transformation requirements for an object during a frame in the gameplay code and sending the request to the engine only once per frame to reduce the call overhead. This pattern was also used in other similar places, not only for moving and rotating objects.

Caching references to components locally eliminates the need to fetch a component reference using the “GetComponent” method on a game object every time, which is another example of a call into the native engine code.
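The "collect requests during the frame, cross into the engine once" pattern described above might look roughly like the following. It is a sketch only: `MoveAccumulator` and the `apply` delegate are hypothetical names, and `System.Numerics.Vector3` stands in for the engine's vector type so the example runs outside Unity. In a real project `apply` would be the single place that writes to the transform or calls the character controller.

```csharp
using System;
using System.Numerics;

// Accumulates movement requests made by gameplay code during one frame
// and forwards their sum to the engine in a single call.
public sealed class MoveAccumulator
{
    private readonly Action<Vector3> _apply; // assumed to wrap the one engine call
    private Vector3 _pending = Vector3.Zero;
    private bool _dirty;

    public MoveAccumulator(Action<Vector3> apply)
    {
        if (apply == null) throw new ArgumentNullException("apply");
        _apply = apply;
    }

    // Cheap, script-side only: no engine call happens here.
    public void Request(Vector3 delta)
    {
        _pending += delta;
        _dirty = true;
    }

    // Call once at the end of the frame; does nothing if no request was made.
    public void Flush()
    {
        if (!_dirty) return;
        _apply(_pending);
        _pending = Vector3.Zero;
        _dirty = false;
    }
}
```

The same shape works for rotation or any other per-frame state that several systems want to modify.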

3 – Physics, Physics and more Physics.

Setting the physics simulation timestep to the minimum possible. For our case we could not set it lower than 16 milliseconds.

Reducing calls to character controller move commands. Moving the character controller happens synchronously and every call can have a significant performance cost. What we did was to cache the movement requests per frame and apply them only once.

Modifying code to not rely on the “ControllerColliderHit” callbacks. It turned out that these callbacks are not handled very quickly.

Replacing the physics cloth with a skinned mesh for the weaker devices. The cloth parameters can also play an important role in performance, and it pays off to spend some time finding the appropriate balance between aesthetics and performance.

Ragdolls were disabled so that they were not part of the physics simulation loop and only enabled when necessary.

“OnInside” callbacks of the triggers need to be assessed carefully and in our case we tried to model the logic without relying on them if possible.

Layers instead of tags! Layers and tags can be assigned to objects easily and used for querying specific objects, however, layers have a definite advantage at least performance wise when it comes to working with collision logic. Quicker physics calculations and less unwanted newly allocated memory are the basic reasons.

Mesh colliders are definitely a no-no.

Minimize collision detection requests like ray casts and sphere checks in general, and try to get as much information as possible from each check.
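One way to see why layers are cheap compared with tags: a Unity layer is a small integer index (0 to 31), and a set of layers is a 32-bit mask, so membership is a single bit test with no string involved. The helper below is an engine-independent sketch of that arithmetic; `LayerMaskSketch` and its method names are made up for illustration and are not Unity API.

```csharp
using System;

// Illustrates the bit arithmetic behind layer masks (hypothetical helper).
public static class LayerMaskSketch
{
    // Builds a mask from layer indices. Intended to be called once at startup,
    // because the 'params' array itself is a small allocation.
    public static int MaskOf(params int[] layers)
    {
        int mask = 0;
        for (int i = 0; i < layers.Length; i++)
        {
            if (layers[i] < 0 || layers[i] > 31)
            {
                throw new ArgumentOutOfRangeException("layers");
            }
            mask |= 1 << layers[i];
        }
        return mask;
    }

    // True when 'layer' is part of 'mask'. No strings, no allocation.
    public static bool Contains(int mask, int layer)
    {
        return (mask & (1 << layer)) != 0;
    }
}
```

A mask built this way has the same layout as the integer masks that Unity's physics queries accept, so filtering by layer can happen inside the query rather than afterwards in script code.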

4 – Let’s make the AI code faster!

We use artificial intelligence for the enemies that try to block our main ninja hero and fight with him. The following topics needed to be covered regarding AI performance issues:

A lot of physical queries are generated from AI logic like visibility checks. The AI update loop could be set to something much lower than the graphics update loop to reduce CPU load.
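Running AI at a lower rate than rendering can be done with a simple time accumulator. The sketch below assumes a hypothetical `ThrottledTicker` class driven by the per-frame delta time; the tick rate is an arbitrary parameter, not a value from the article. In Unity, `Update` would be called from a script's per-frame update with the frame's delta time.

```csharp
using System;

// Runs a callback at a fixed, lower rate than the frame rate.
public sealed class ThrottledTicker
{
    private readonly float _interval; // seconds between ticks
    private readonly Action _tick;    // assumed AI update callback
    private float _elapsed;

    public ThrottledTicker(float ticksPerSecond, Action tick)
    {
        if (ticksPerSecond <= 0f) throw new ArgumentOutOfRangeException("ticksPerSecond");
        if (tick == null) throw new ArgumentNullException("tick");
        _interval = 1f / ticksPerSecond;
        _tick = tick;
    }

    // Call once per rendered frame. Returns true when the callback ran.
    public bool Update(float deltaTime)
    {
        _elapsed += deltaTime;
        if (_elapsed < _interval) return false;

        _elapsed -= _interval;                     // keep the remainder for steady pacing
        if (_elapsed > _interval) _elapsed = 0f;   // after a long stall, skip instead of catching up
        _tick();
        return true;
    }
}
```

Giving each enemy a different starting offset would spread the expensive visibility checks across frames, though that refinement is an addition here and not something the article describes.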

5 – Best performance is achieved from no code at ALL!

When nothing happens, performance is good. This was our base philosophy: turn off anything that is not necessary at the moment. Our game is a side-scroller action game, so a lot of the dynamic level objects can be turned off when they are not visible in the scene.

Enemy AI was turned off when far away using a custom level of detail scheme.

Moving platforms and hazards and their physics colliders were turned off when far away.

The built-in Unity “animation culling” system was used to turn off animations on objects not being rendered.

The same disabling mechanism was used for all in-level particle systems.
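A distance-based on/off switch for a side-scroller can be reduced to a comparison along the scroll axis. The sketch below is one possible shape, with assumptions called out: `DistanceSwitch` is a hypothetical name, the two thresholds are arbitrary, and the gap between them (hysteresis) is an addition of this example so that an object sitting right at the boundary does not flicker on and off. The article only says that far-away objects were turned off.

```csharp
using System;

// Decides whether an object should be active based on its distance
// from the camera along the horizontal scroll axis.
public sealed class DistanceSwitch
{
    private readonly float _activateWithin;   // turn on at or below this distance
    private readonly float _deactivateBeyond; // turn off above this distance

    public bool IsActive { get; private set; }

    public DistanceSwitch(float activateWithin, float deactivateBeyond)
    {
        if (activateWithin < 0f || deactivateBeyond < activateWithin)
        {
            throw new ArgumentException("thresholds must satisfy 0 <= activateWithin <= deactivateBeyond");
        }
        _activateWithin = activateWithin;
        _deactivateBeyond = deactivateBeyond;
    }

    // Re-evaluates the state; the caller enables or disables AI, colliders
    // or particle systems according to the returned value.
    public bool Refresh(float objectX, float cameraX)
    {
        float distance = Math.Abs(objectX - cameraX);
        if (IsActive)
        {
            if (distance > _deactivateBeyond) IsActive = false;
        }
        else
        {
            if (distance <= _activateWithin) IsActive = true;
        }
        return IsActive;
    }
}
```

The check itself is cheap enough to run at a low rate for every dynamic object in a level.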

6 – Callback! How about empty callbacks?

The Unity callbacks needed to be reduced as much as possible. Even the empty callbacks had performance penalties. There is no reason for having empty callbacks, but they sometimes get left in the code base in the middle of heavy rewriting and refactoring.

7 – The mighty Artists to the rescue.

Artists can always magically help out the hair-pulling programmer trying to go for a few more frames per second.

Sharing materials for game objects and making them static in Unity causes them to be batched together and the resulting reduced draw calls are critical for good mobile performance.

Texture atlases helped a lot, especially for the UI elements.

Square, power-of-two textures with proper compression were a must.

Being a side-scroller enabled our artists to remove all far background meshes and convert them to simple 2D planes instead.

Light maps were highly valuable.

Our artists removed extra vertices during a few passes.

Proper texture mip levels were a good decision especially for having a good frame rate on devices with different resolutions.

Combining meshes was another performance friendly action by the artists.

Our animator tried to share animations between different characters if it was possible.

A lot of iterations on the particles were necessary to find the aesthetic/performance balance. Reducing number of emitters and trying to reduce transparency requirements were among the major challenges.

8 – The memory usage needs to be reduced, now!

Using a lot of memory of course has negative performance-related effects, but in our case we also experienced a lot of crashes on iPods due to exceeding memory limits, which was a much more critical problem. The biggest memory consumers in our game were the textures.

Different texture sizes were used for different devices, especially textures used in UI and large backgrounds. Shadow Blade uses a universal build, but different assets get loaded once the device size and resolution are detected upon startup.

We needed to make sure unused assets were not loaded in memory. We found out a little late in the project that any asset that was only referenced by an instance of a prefab and never instantiated was fully loaded in memory.

Stripping out extra polygons from meshes helped.

We needed to re-architect the lifecycle management of some assets a few times, for example by tweaking the load/unload time for the main menu assets, end-of-level assets or game music.

Each level needed to have its specific object pool tailored to its dynamic object requirements and optimized for the least memory needs. Object pools can be flexible and contain a lot of objects during development, however, they need to be specific once the game object requirements are known.

Keeping the sound files compressed in memory was necessary.

Game performance enhancement is a long and challenging journey and we had a fun time experiencing a small part of this voyage. The vast amount of knowledge shared by the game development community and very good profiling tools provided by Unity were what made us reach our performance targets for Shadow Blade.(source:gamasutra

