Unity3D optimization for mobile platforms 移動平臺Unity3D優化

Just like on PCs, mobile platforms like iOS and Android have devices of various levels of performance. You can easily find a phone that's 10x more powerful for rendering than some other phone. There is a fairly easy way to scale across them:

和PC電腦類似,像iOS和安卓這樣的移動平臺有很多不同等級性能的設備。你可以很容易找到一個手機,它的渲染能力是另一個手機的十倍。有一種相當簡單的適配方法:

  1. Make sure it runs okay on baseline configuration 
    確保它可以在基準配置上運行
  2. Use more eye-candy on higher performing configurations:
    在性能更高的配置上使用一些養眼的東西:
    • Resolution 分辨率
    • Post-processing 後處理效果
    • MSAA 多重採樣抗鋸齒
    • Anisotropy 各向異性
    • Shaders 着色器
    • Fx/particles density, on/off 
      特效/粒子密度,開啓/關閉
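
The scaling list above can be sketched as a small startup script. This is only an illustration: the class name DeviceTierSetup and the memory and core-count thresholds are assumptions, not values from the Unity documentation, and a real game would tune them against its own baseline devices.

```csharp
using UnityEngine;

// Hypothetical bootstrap component: picks cheaper or richer settings at startup.
public class DeviceTierSetup : MonoBehaviour
{
    // Assumed threshold in megabytes; tune it against the devices you support.
    public int highEndGraphicsMemoryMB = 512;

    void Awake()
    {
        // Rough device classification (an assumption, not an official heuristic).
        bool highEnd = SystemInfo.graphicsMemorySize >= highEndGraphicsMemoryMB
                       && SystemInfo.processorCount >= 4;

        if (highEnd)
        {
            // Extra eye-candy on the faster tier: 4x MSAA and anisotropic filtering.
            QualitySettings.antiAliasing = 4;
            QualitySettings.anisotropicFiltering = AnisotropicFiltering.Enable;
        }
        else
        {
            // Baseline tier: no MSAA, no aniso, and half the native resolution.
            QualitySettings.antiAliasing = 0;
            QualitySettings.anisotropicFiltering = AnisotropicFiltering.Disable;
            Screen.SetResolution(Screen.width / 2, Screen.height / 2, true);
        }
    }
}
```

Post-processing and particle density can be switched per tier in the same branch, for example by enabling or disabling the relevant components.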

Focus on GPUs
着眼於GPU

Graphics performance is bound by fillrate, pixel complexity and geometric complexity (vertex count). All three can be reduced if you can find a way to cull more renderers. Occlusion culling can help here. Unity automatically culls objects outside the viewing frustum.

充填率、像素以及幾何體的複雜度(頂點數量)綁定了圖形性能。如果你可以找到一個方法來剔除更多的渲染,那麼這三點就可以被降低。遮擋剔除就可以做到這一點。Unity會自動剔除視見平截頭體之外的對象。

On mobiles you're essentially fillrate bound (fillrate = screen pixels * shader complexity * overdraw), and over-complex shaders are the most common cause of problems. So use the mobile shaders that come with Unity, or design your own but make them as simple as possible. If possible, simplify your pixel shaders by moving code to the vertex shader.

在移動設備上,你應該對充填率綁定(充填率=屏幕像素*着色器複雜度*透支)很敏感,而且過度複雜的着色器是最常見的引發問題的起因。因此,請使用Unity自帶的移動平臺着色器或者設計你自己的着色器但是使它們儘可能簡單。如果可能,爲了簡化你的像素着色器,把代碼移動到頂點着色器中。

If reducing the Texture Quality in Quality Settings makes the game run faster, you are probably limited by memory bandwidth. So compress textures, use mipmaps, reduce texture size, etc.

如果在Quality Settings裏降低Texture Quality的值來使得遊戲運行的更流暢,你可能會被內存帶寬所限制。因此請壓縮紋理,使用mipmaps,減少紋理大小等等。
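
The same test can be run on a device without opening the editor. The sketch below is a minimal example assuming the legacy Input manager (where a touch also registers as mouse button 0); the class name is hypothetical. It steps QualitySettings.masterTextureLimit, which newer Unity versions expose under a different name.

```csharp
using UnityEngine;

// Hypothetical probe: tap to lower texture resolution and watch the frame rate.
public class TextureBandwidthProbe : MonoBehaviour
{
    void Update()
    {
        if (Input.GetMouseButtonDown(0))
        {
            // 0 = full size, 1 = half size, 2 = quarter size, then wrap around.
            QualitySettings.masterTextureLimit =
                (QualitySettings.masterTextureLimit + 1) % 3;
            Debug.Log("masterTextureLimit = " + QualitySettings.masterTextureLimit);
        }
    }
}
```

If the frame rate rises noticeably as the limit goes up, memory bandwidth is a likely bottleneck.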

LOD (Level of Detail): make objects simpler, or eliminate them completely, as they move further away. The main goal is to reduce the number of draw calls.

LOD(細節等級)使得對象更加簡單,或是在它們移向遠方時完全消除它們。主要目的都是爲了減少繪製調用的數目。
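
A minimal sketch of setting up two detail levels from script, assuming the object has two child renderers assigned in the Inspector. The class name and the 0.5 / 0.1 transition heights are illustrative assumptions.

```csharp
using UnityEngine;

// Hypothetical setup script: builds a two-level LODGroup at startup.
public class SimpleLodSetup : MonoBehaviour
{
    public Renderer highDetail; // detailed mesh, shown when the object is large on screen
    public Renderer lowDetail;  // simplified mesh, shown when it is small on screen

    void Start()
    {
        LODGroup group = gameObject.AddComponent<LODGroup>();

        LOD[] lods = new LOD[2];
        // LOD 0 is used while the object covers more than 50% of the screen height.
        lods[0] = new LOD(0.5f, new Renderer[] { highDetail });
        // LOD 1 is used down to 10% of the screen height; below that the object is culled.
        lods[1] = new LOD(0.1f, new Renderer[] { lowDetail });

        group.SetLODs(lods);
        group.RecalculateBounds();
    }
}
```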

Good practice 優秀的實踐

Mobile GPUs have huge constraints on how much heat they produce, how much power they use, and how large or noisy they can be. So compared to desktop parts, mobile GPUs have far less bandwidth and lower ALU and texturing performance. The architectures of the GPUs are also tuned to use as little bandwidth and power as possible.

對於移動平臺的GPUs,它們產生了多少熱量、它們使用了多少能量以及它們多大或者多吵都是有很大限制的。因此,和臺式電腦相比,移動平臺GPUs具有更少的帶寬、低下的ALU性能和紋理功能。GPUs的體系結構也同樣被調整爲儘可能少的使用帶寬和能量。

Unity is optimized for OpenGL ES 2.0 and uses the GLSL ES shading language (similar to HLSL). Built-in shaders are most often written in HLSL (also known as Cg), which is cross-compiled into GLSL ES for mobile platforms. You can also write GLSL directly if you want to, but doing that limits you to OpenGL-like platforms (e.g. mobile + Mac) since there are currently no GLSL->HLSL translation tools. When you use float/half/fixed types in HLSL, they end up as highp/mediump/lowp precision qualifiers in GLSL ES.

Unity針對OpenGL ES 2.0進行了優化,它使用GLSL ES(和HLSL類似)着色語言。內置的着色器大部分都是使用HLSL(也被稱爲Cg)編寫的,在移動平臺上會被交叉編譯爲GLSL ES。如果你想,你也可以直接編寫GLSL,但是這樣做會把你限制在類OpenGL的平臺上(例如移動平臺+Mac),因爲目前沒有GLSL到HLSL的轉換工具。當你在HLSL中使用float/half/fixed類型時,它們最終會成爲GLSL ES中的highp/mediump/lowp精度限定符。

Here is the checklist for good practice:

下面的清單列出了一些優秀的實踐:

  1. Keep the number of materials as low as possible. This makes it easier for Unity to batch stuff. 
    保持材質的數目儘可能少。這使得Unity更容易進行批處理。
  2. Use texture atlases (large images containing a collection of sub-images) instead of a number of individual textures. These are faster to load, have fewer state switches, and are batching friendly. 
    使用紋理精靈(一張大貼圖裏包含了很多子貼圖)來代替一系列單獨的小貼圖。它們可以更快地被加載,具有很少的狀態轉換,而且批處理更友好。
  3. Use Renderer.sharedMaterial instead of Renderer.material if using texture atlases and shared materials. 
    如果使用了紋理精靈和共享材質,使用Renderer.sharedMaterial 來代替Renderer.material 。
  4. Forward rendered pixel lights are expensive.
    前向渲染的逐像素燈光代價是昂貴的。
    • Use light mapping instead of realtime lights wherever possible. 
      儘可能使用燈光映射來代替實時燈光。
    • Adjust pixel light count in quality settings. Essentially only the directional light should be per pixel, everything else - per vertex. Certainly this depends on the game. 
      在質量設置中調整像素燈光的數量。只有平行光應該是逐像素的,其他所有都應該是逐頂點的。當然,這取決於遊戲。
  5. Experiment with Render Mode of Lights in the Quality Settings to get the correct priority. 
    反覆調整Quality Settings中的Render Mode of Lights來得到正確的優先級。
  6. Avoid Cutout (alpha test) shaders unless really necessary. 
    避免Cutout(透明度測試)着色器,除非是真的需要。
  7. Keep Transparent (alpha blend) screen coverage to a minimum. 
    保持透明(透明度混合)屏幕覆蓋範圍最小。
  8. Try to avoid situations where multiple lights illuminate any given object. 
    嘗試避免多個燈光照亮任何給定對象的情況。
  9. Try to reduce the overall number of shader passes (Shadows, pixel lights, reflections). 
    嘗試減少着色通道(陰影,像素燈光,反射)的全部數量。
  10. Rendering order is critical. In the general case:
    渲染順序是非常重要的。通常情況下:
    1. fully opaque objects roughly front-to-back. 
      大致從前往後的完全不透明對象。
    2. alpha tested objects roughly front-to-back. 
      大致是從前往後的透明度測試的對象。
    3. skybox. 天空盒子
    4. alpha blended objects (back to front if needed). 
      透明度混合對象(如果需要就從後往前)
  11. Post Processing is expensive on mobiles, use with care. 
    後期處理在移動平臺上是代價昂貴的,請小心使用。
  12. Particles: reduce overdraw, use the simplest possible shaders. 
    粒子系統:降低透支,使用儘可能簡單的着色器。
  13. Double buffer for Meshes modified every frame: 
    對於每一幀都需要修改的網格使用雙緩存:
```csharp
void Update()
{
    // Flip between the two meshes so the one being written this frame
    // is not the one that was submitted last frame.
    bufferMesh = on ? meshA : meshB;
    on = !on;
    bufferMesh.vertices = vertices; // modification to mesh
    meshFilter.sharedMesh = bufferMesh;
}
```
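
Item 3 of the checklist above (Renderer.sharedMaterial versus Renderer.material) can be sketched as follows. The class name and the atlasMaterial field are hypothetical; the point is only that assigning through sharedMaterial keeps every renderer on the same material instance, whereas reading Renderer.material creates a per-renderer copy that batching can no longer merge.

```csharp
using UnityEngine;

// Hypothetical helper: puts every child renderer on one shared atlas material.
public class AtlasMaterialAssigner : MonoBehaviour
{
    // Material whose texture is the atlas; assigned in the Inspector.
    public Material atlasMaterial;

    void Start()
    {
        foreach (Renderer r in GetComponentsInChildren<Renderer>())
        {
            // sharedMaterial assigns the reference without cloning the material,
            // so all of these renderers stay eligible for batching together.
            r.sharedMaterial = atlasMaterial;

            // By contrast, touching r.material here (for example to change a color)
            // would instantiate a private copy of the material for this renderer.
        }
    }
}
```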

Shader optimizations 着色器優化

Checking if you are fillrate-bound is easy: does the game run faster if you decrease the display resolution? If yes, you are limited by fillrate.

檢查你是否是充填率綁定的是容易的:如果你降低顯示分辨率,遊戲是否運行的更流暢?如果是,那麼你就是被充填率限制了。
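
One way to run this check on a device is a small toggle script. This is a sketch assuming the legacy Input manager and the immediate-mode GUI; the class name is hypothetical. Note that vsync or a frame-rate cap can hide the difference, so compare the numbers in a heavy scene.

```csharp
using UnityEngine;

// Hypothetical probe: tap to switch between full and half resolution.
public class FillrateProbe : MonoBehaviour
{
    int fullWidth;
    int fullHeight;
    bool halved;

    void Start()
    {
        // Remember the native resolution so it can be restored later.
        fullWidth = Screen.width;
        fullHeight = Screen.height;
    }

    void Update()
    {
        if (Input.GetMouseButtonDown(0))
        {
            halved = !halved;
            int divisor = halved ? 2 : 1;
            Screen.SetResolution(fullWidth / divisor, fullHeight / divisor, true);
        }
    }

    void OnGUI()
    {
        // Smoothed frame time in milliseconds for a quick before/after comparison.
        string mode = halved ? "half" : "full";
        float ms = Time.smoothDeltaTime * 1000f;
        GUI.Label(new Rect(10, 10, 400, 30), mode + " resolution, frame: " + ms.ToString("F1") + " ms");
    }
}
```

If the frame time drops clearly at half resolution, the game is probably fillrate bound.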

Try reducing shader complexity by the following methods:

嘗試使用下面的方法減小着色器複雜度:

  • Avoid alpha-testing shaders; instead use alpha-blended versions. 
    避免透明度測試着色器;使用透明度混合的版本來代替。
  • Use simple, optimized shader code (such as the 'Mobile' shaders that ship with Unity). 
    使用簡單的、優化的着色器代碼(例如Unity自帶的移動平臺的着色器)。
  • Avoid expensive math functions in shader code (pow, exp, log, cos, sin, tan, etc). Consider using pre-calculated lookup textures instead. 
    避免在着色器代碼裏使用高昂的數學函數(pow, exp, log, cos, sin, tan等等)。考慮使用預計算的查表貼圖來代替。
  • Pick the lowest possible number precision format (float, half, fixed in Cg) for best performance. 
    爲了得到最高性能,選擇最低可能的精度數目格式(Cg中是float, half, fixed)。
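
The "pre-calculated lookup texture" idea from the list above can be sketched on the C# side. The helper below bakes pow(u, exponent) into a one-pixel-high texture that a shader could sample instead of calling pow. The class name is hypothetical, and the shader property used to bind the texture would be your own; nothing here is a built-in Unity helper.

```csharp
using UnityEngine;

// Hypothetical helper: bakes pow(u, exponent) into a 1D lookup texture.
public static class PowLookupTexture
{
    public static Texture2D Build(float exponent, int width = 256)
    {
        // Single-channel texture, no mipmaps; the value is stored in alpha.
        Texture2D lut = new Texture2D(width, 1, TextureFormat.Alpha8, false);
        lut.wrapMode = TextureWrapMode.Clamp;
        lut.filterMode = FilterMode.Bilinear;

        for (int x = 0; x < width; x++)
        {
            // Map the texel index to [0, 1] and store the precomputed power.
            float u = x / (float)(width - 1);
            float v = Mathf.Pow(u, exponent);
            lut.SetPixel(x, 0, new Color(v, v, v, v));
        }

        // Upload to the GPU and release the CPU-side copy to save memory.
        lut.Apply(false, true);
        return lut;
    }
}
```

A shader would then read the alpha channel of this texture at coordinate u. Whether that actually beats the math instruction depends on the GPU and on how bandwidth-bound the scene already is, so measure both variants.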

Focus on CPUs
着眼於CPUs

It is often the case that games are limited by the GPU on pixel processing. So they end up having unused CPU power, especially on multicore mobile CPUs. So it is often sensible to pull some work off the GPU and put it onto the CPU instead (Unity does all of these): mesh skinning, batching of small objects, particle geometry updates.

遊戲在像素處理時被GPU所限制,是非常常見的。它們的CPU能力就沒有被使用,特別是在多核的移動平臺的CPUs。因此,將一些工作從GPU裏拉出來,放到CPU裏進行(Unity做了這些所有的事情)通常是明智的:網格蒙皮,小對象的批處理,粒子幾何體更新。

These should be used with care, not blindly. If you are not bound by draw calls, then batching is actually worse for performance, as it makes culling less efficient and makes more objects affected by lights!

這些應該小心使用,而不是盲目使用。如果你不是被繪製調用所限制,那麼批處理實際上使得性能更加糟糕,因爲它減少了剔除的效率,並使得更多的對象受燈光影響!

Good practice 優秀的實踐

  • Don't use more than a few hundred draw calls per frame on mobiles. 
    在移動設備上,每幀不要使用超過幾百的繪製調用。
  • FindObjectsOfType (and Unity getter properties in general) are very slow, so use them sensibly. 
    FindObjectsOfType(和Unity其他常見的getter屬性)是非常慢的,因此聰明地使用它們。
  • Set the Static property on non-moving objects to allow internal optimizations like static batching. 
    將非移動對象設置爲Static屬性來允許內置優化,例如靜態批處理。
  • Spend lots of CPU cycles to do occlusion culling and better sorting (to take advantage of Early Z-cull). 
    花費大量的CPU循環來進行遮擋剔除和更好的排序(利用Early Z-cull)
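
For the FindObjectsOfType point above, the usual approach is to look things up once and reuse the result. A minimal sketch, assuming the set of Rigidbody objects does not change after startup (the class name is hypothetical):

```csharp
using UnityEngine;

// Hypothetical example: caches slow lookups instead of repeating them every frame.
public class NearestBodyTracker : MonoBehaviour
{
    Rigidbody[] bodies;        // result of the expensive scene search
    Transform cachedTransform; // cached to avoid repeated property lookups

    public Rigidbody Nearest { get; private set; }

    void Start()
    {
        // Search the scene once rather than in every Update.
        bodies = FindObjectsOfType<Rigidbody>();
        cachedTransform = transform;
    }

    void Update()
    {
        Vector3 here = cachedTransform.position;
        float best = float.MaxValue;
        Nearest = null;

        for (int i = 0; i < bodies.Length; i++)
        {
            Rigidbody body = bodies[i];
            if (body == null) continue; // destroyed since the search

            float sqr = (body.position - here).sqrMagnitude;
            if (sqr < best)
            {
                best = sqr;
                Nearest = body;
            }
        }
    }
}
```

If objects are spawned and destroyed at runtime, having them register themselves in a list is a common alternative to searching again.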

Physics 物理

Physics can be CPU heavy. It can be profiled via the Editor profiler. If Physics appears to take too much time on CPU:

物理是非常消耗CPU的。它可以通過編輯器的分析器進行分析。如果物理模擬看起來花費了過多的CPU時間:

  • Tweak Time.fixedDeltaTime (in Project settings -> Time) to be as high as you can get away with. If your game is slow moving, you probably need fewer fixed updates than games with fast action. Fast paced games will need more frequent calculations, and thus fixedDeltaTime will need to be lower or a collision may fail. 
    把Time.fixedDeltaTime (在Project settings -> Time)的值調整爲你可以接受的最高值。如果你的遊戲移動很慢,相對於那麼快速動作的遊戲,你可能需要更小的固定更新。快速步調的遊戲將需要更頻繁的計算,因此 fixedDeltaTime 需要降低,否則碰撞可能會失敗。
  • Lower Physics.solverIterationCount (Physics Manager) as far as the simulation stays stable enough. 
    在模擬仍然足夠穩定的前提下,降低Physics.solverIterationCount(物理管理器)。
  • Use as few Cloth objects as possible. 
    使用儘可能少的Cloth對象。
  • Use Rigidbodies only where necessary. 
    只在必需時使用剛體。
  • Use primitive colliders in preference to mesh colliders. 
    相比於網格碰撞器,優先使用原型碰撞器。
  • Never ever move a static collider (i.e. a collider without a Rigidbody) as it causes a big performance hit.
    永遠不要移動一個靜態碰撞器(即一個沒有剛體的碰撞器),因爲這會導致很大的性能損失。
    • Shows up in Profiler as 'Static Collider.Move' but actual processing is in Physics.Simulate
      在分析器裏顯示爲Static Collider.Move,但是實際上是在Physics.Simulate裏處理的。
    • If necessary, add a Rigidbody and set isKinematic to true. 
      如果必需,添加一個剛體,並選中它的isKinematic 。
  • On Windows you can use NVidia's AgPerfMon profiling tool set to get more details if needed. 
    如果需要,在Windows上你可以使用英偉達的AgPerfMon分析工具集合來得到更多細節
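
Two of the points above, a larger fixed timestep and moving colliders through a kinematic Rigidbody, can be sketched together. The class name, the 30 Hz timestep and the velocity are illustrative assumptions; pick the timestep your own gameplay tolerates.

```csharp
using UnityEngine;

// Hypothetical moving platform: a collider that moves without being a static collider.
[RequireComponent(typeof(Collider))]
public class KinematicPlatform : MonoBehaviour
{
    public Vector3 velocity = new Vector3(1f, 0f, 0f);

    Rigidbody body;

    void Awake()
    {
        // Assumed value for a slow-paced game: 30 physics steps per second.
        // This is a global setting; normally it is set once in Project settings -> Time.
        Time.fixedDeltaTime = 1f / 30f;

        // A collider that moves needs a Rigidbody, otherwise it counts as static.
        body = GetComponent<Rigidbody>();
        if (body == null)
        {
            body = gameObject.AddComponent<Rigidbody>();
        }
        body.isKinematic = true; // driven by script, not by forces
    }

    void FixedUpdate()
    {
        // Move through the Rigidbody so the physics engine tracks the motion.
        body.MovePosition(body.position + velocity * Time.fixedDeltaTime);
    }
}
```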

Android

GPU

These are the popular mobile architectures. Both the hardware vendors and the GPU architectures differ markedly from the 'usual' GPUs of the PC/console space.

下面是一些流行的移動平臺體系架構。它的硬件供應商與PC/遊戲主機領域不同,GPU體系架構也與通常的GPUs非常不同。

  • ImgTec PowerVR SGX - Tile based, deferred: render everything in small tiles (as 16x16), shade only visible pixels 
    ImgTec PowerVR SGX – 基於平鋪的,延遲的:在小單元(例如16*16)裏渲染東西,只對可見像素着色
  • NVIDIA Tegra - Classic: Render everything 
    英偉達圖睿 – 典型的:渲染所有東西
  • Qualcomm Adreno - Tiled: Render everything in tiles, engineered for large tiles (e.g. 256k). Adreno 3xx can switch to traditional rendering. 
    高通Adreno – 平鋪的:在單元裏渲染所有東西,採用大單元(例如256k)。Adreno 3xx可以切換到傳統模式
  • ARM Mali - Tiled: Render everything in tiles, engineered for small tiles (e.g. 16x16) 
    ARM Mali – 平鋪的:在單元裏渲染所有東西,採用小單元(例如16*16)

Spend some time looking into the different rendering approaches and design your game accordingly. Pay special attention to sorting. Define the lowest-end supported devices early in the dev cycle. Test on them with the profiler on as you design your game.

花一些時間來深入瞭解不同的渲染方法,並相應地設計你的遊戲。尤其需要注意排序。在開發過程的前期定義好支持的最低端設備。在設計遊戲的過程中,開着分析器在這些設備上進行測試。

Use platform specific texture compression.

使用平臺特定的紋理壓縮。
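
Texture compression itself is chosen at build time, but it can help to log what a given test device reports. The sketch below is only a diagnostic aid under that assumption (the class name is hypothetical): it prints the GPU name and whether a few common compressed formats are supported.

```csharp
using UnityEngine;

// Hypothetical diagnostic: logs the GPU and a few compressed texture formats.
public class GpuReport : MonoBehaviour
{
    void Start()
    {
        Debug.Log("GPU: " + SystemInfo.graphicsDeviceName
                  + " (" + SystemInfo.graphicsDeviceVendor + ")");

        // ETC1 is the common baseline on Android, PVRTC on PowerVR, DXT on Tegra/desktop.
        Debug.Log("ETC1 supported: " + SystemInfo.SupportsTextureFormat(TextureFormat.ETC_RGB4));
        Debug.Log("PVRTC supported: " + SystemInfo.SupportsTextureFormat(TextureFormat.PVRTC_RGB4));
        Debug.Log("DXT1 supported: " + SystemInfo.SupportsTextureFormat(TextureFormat.DXT1));
    }
}
```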

Further reading 擴展閱讀

Screen resolution 屏幕分辨率

Android version 安卓版本

iOS

GPU

The only architecture to be concerned about is PowerVR (tile based, deferred).

只需要考慮PowerVR體系結構(基於平鋪延遲的)。

  • ImgTec PowerVR SGX. Tile based, deferred: render everything in tiles, shade only visible pixels 
    基於平鋪延遲的:在單元裏渲染所有東西,只對可見的像素着色。
  • ImgTec PowerVR MBX. Tile based, deferred, fixed function - pre iPhone 4/iPad 1 devices 
    基於平鋪延遲的,固定功能管線的 - iPhone 4/iPad 1之前的設備

This means: 這意味着:

  • Mipmaps are not so necessary. 
    Mipmaps不是那麼必需的。
  • Antialiasing and aniso are cheap enough; on iPad 3 they are not needed in some cases. 
    抗鋸齒和各向異性過濾的代價足夠低,在iPad 3上某些情況下並不需要。

And cons: 以及缺點:

  • If vertex data per frame (number of vertices * storage required after vertex shader) exceeds the internal buffers allocated by the driver, the scene has to be 'split', which costs performance. The driver might allocate a larger buffer after this point, or you might need to reduce your vertex count. This becomes apparent on iPad 2 (iOS 4.3) at around 100 thousand vertices with quite complex shaders. 
    如果每幀的頂點數據(頂點數量 * 頂點着色器之後所需的存儲空間)超過了驅動分配的內部緩存,場景將不得不被「拆分」,這將消耗性能。在這點之後,驅動可能會分配一個更大的緩存,或者你可能需要降低你的頂點數量。在iPad 2 (iOS 4.3)上,使用相當複雜的着色器時,這一問題在大約100,000個頂點時變得明顯。
  • TBDR needs more transistors allocated for the tiling and deferred parts, leaving conceptually fewer transistors for 'raw performance'. It's very hard (i.e. practically impossible) to get GPU timing for a draw call on TBDR, making profiling hard. 
    TBDR需要爲分塊和延遲部分分配更多的晶體管,理論上留給原生性能的晶體管就更少。在TBDR上得到GPU的一個繪製調用時間是非常困難的(實際上幾乎不可能),這使得分析變得困難。

Further reading 擴展閱讀

Screen resolution 屏幕分辨率

iOS version (iOS版本 )

Dynamic Objects 動態對象

Asset Bundles 資源包

  • Asset Bundles are cached on a device to a certain limit 
    在某種程度內資源包可以被緩存在設備上
  • Create using the Editor API 
    使用編輯器API來創建
  • Load 加載
    • Using WWW API: WWW.LoadFromCacheOrDownload 
      使用WWW API:WWW.LoadFromCacheOrDownload
    • As a resource: AssetBundle.CreateFromMemory or AssetBundle.CreateFromFile 
      作爲資源:AssetBundle.CreateFromMemory或AssetBundle.CreateFromFile
  • Unload 卸載
    • AssetBundle.Unload
      • There is an option to unload the bundle, but keep the loaded asset from it 
        你可以選擇來卸載一個包,但是保留從它加載的資源
      • Also can kill all the loaded assets even if they're referenced in the scene 
        也可以關閉所有加載的資源,甚至當它們已經在場景中被引用時
    • Resources.UnloadUnusedAssets
      • Unloads all assets no longer referenced in the scene. So remember to kill references to the assets you don't need. 
        卸載所有的不在場景中被引用的資源。因此,請記住要關閉你不需要的資源的引用。
      • Public and static variables are never garbage collected. 
        公有的和靜態的變量是永遠不會被垃圾回收的。
    • Resources.UnloadAsset
      • Unloads a specific asset from memory. It can be reloaded from disk if needed. 
        從內存中卸載一個特定的資源。如果需要,它可以再次從硬盤上加載進來。
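
The load and unload steps listed above can be sketched as one coroutine. This assumes a Unity version in which both the WWW class and AssetBundle.LoadAsset are available (older releases named the latter Load); the class name, URL, version number and asset name are placeholders, not real resources.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical loader: downloads (or reads from cache) a bundle, uses one asset, unloads.
public class BundleLoaderExample : MonoBehaviour
{
    public string bundleUrl = "http://example.com/bundles/level1.unity3d"; // placeholder
    public int version = 1;                                                // placeholder
    public string assetName = "Enemy";                                     // placeholder

    IEnumerator Start()
    {
        using (WWW www = WWW.LoadFromCacheOrDownload(bundleUrl, version))
        {
            yield return www; // wait for the download or the cache read to finish

            if (!string.IsNullOrEmpty(www.error))
            {
                Debug.LogError("Bundle download failed: " + www.error);
                yield break;
            }

            AssetBundle bundle = www.assetBundle;
            GameObject prefab = bundle.LoadAsset<GameObject>(assetName);
            if (prefab != null)
            {
                Instantiate(prefab);
            }

            // false: free the bundle's own data but keep assets already loaded from it.
            // Passing true would also destroy those assets, even if the scene uses them.
            bundle.Unload(false);
        }

        // Later, once nothing references the loaded assets any more:
        // Resources.UnloadUnusedAssets();
    }
}
```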

Is there a limit on the number of AssetBundles that can be downloaded at the same time on iOS? (e.g. can we safely download more than 10 AssetBundles at the same time, or every frame?)
在iOS上,同一時間下載的AssetBundle的數目有限制嗎?(例如,我們可以在同一時間(或者每一幀)安全地下載超過10個AssetBundle嗎?)

Downloads are implemented via an async API provided by the OS, so the OS decides how many threads need to be created for downloads. When launching multiple concurrent downloads, keep in mind the total bandwidth the device can support and the amount of free memory. Each concurrent download allocates its own temporary buffer, so be careful not to run out of memory.

通過OS提供的異步API實現加載,因此OS決定加載時需要創建多少線程。當開啓多個併發的加載時,你應該時刻記住設備支持的所有帶寬以及空餘內存的數量。每一個併發的加載分配它自己臨時的緩存,因此在這裏你應該小心不要超出內存。
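
One way to act on this advice is to cap the number of downloads in flight. The queue below is a sketch under assumptions: the class name, the default limit of 2 and the fixed bundle version are illustrative, and it relies on the same WWW API mentioned above.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using UnityEngine;

// Hypothetical downloader: never runs more than maxConcurrent downloads at once.
public class ThrottledBundleDownloader : MonoBehaviour
{
    public int maxConcurrent = 2; // assumed limit; tune against memory and bandwidth

    // Raised for every bundle that finished downloading successfully.
    public event Action<AssetBundle> BundleLoaded;

    readonly Queue<string> pending = new Queue<string>();
    int active;

    public void Enqueue(string url)
    {
        pending.Enqueue(url);
    }

    void Update()
    {
        // Start queued downloads only while there is a free slot.
        // StartCoroutine runs Download up to its first yield, so 'active'
        // is already incremented when the loop condition is checked again.
        while (active < maxConcurrent && pending.Count > 0)
        {
            StartCoroutine(Download(pending.Dequeue()));
        }
    }

    IEnumerator Download(string url)
    {
        active++;
        using (WWW www = WWW.LoadFromCacheOrDownload(url, 1))
        {
            yield return www;

            if (!string.IsNullOrEmpty(www.error))
            {
                Debug.LogWarning("Download failed for " + url + ": " + www.error);
            }
            else if (BundleLoaded != null)
            {
                BundleLoaded(www.assetBundle);
            }
        }
        active--; // free the slot for the next queued download
    }
}
```

Each slot holds one temporary download buffer, so the cap bounds peak memory use as well as bandwidth.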

Resources 資源

  • Assets need to be recognized by Unity to be placed in a build. 
    資源需要被Unity識別,才能被放入構建中。
  • Add the .bytes file extension to any raw bytes you want Unity to recognize as binary data. 
    給任何你想讓Unity識別爲二進制數據的原始字節文件添加.bytes擴展名。
  • Add the .txt file extension to any text files you want Unity to recognize as a text asset. 
    給任何你想讓Unity識別爲文本資源的文本文件添加.txt擴展名。
  • Resources are converted to a platform format at build time. 
    在構建的時候,資源將會被轉換爲特定的平臺格式。
  • Resources.Load()
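
A minimal sketch of loading a .bytes file through Resources.Load, assuming a file at Assets/Resources/level1.bytes. The file name and the class name are hypothetical.

```csharp
using UnityEngine;

// Hypothetical loader: reads raw binary data that was imported as a TextAsset.
public class RawDataLoader : MonoBehaviour
{
    void Start()
    {
        // The path is relative to a Resources folder and has no file extension.
        TextAsset asset = Resources.Load<TextAsset>("level1");
        if (asset == null)
        {
            Debug.LogWarning("level1.bytes was not found in any Resources folder.");
            return;
        }

        byte[] data = asset.bytes; // for a .txt file, use asset.text instead
        Debug.Log("Loaded " + data.Length + " bytes.");

        // Release the asset once the data has been copied or parsed.
        Resources.UnloadAsset(asset);
    }
}
```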

Silly issues checklist 不應該做的事情的清單

  • Textures without proper compression 沒有經過合適壓縮的紋理
    • Different solutions for different cases, but be sure to compress textures unless you're sure you should not. 
      不同的情況下有不同的解決方案,但是確保壓縮你的紋理,除非你肯定你不應該壓縮。
    • ETC/RGBA16 - default for Android 
      安卓的默認模式
      • but can tweak depending on the GPU vendor 
        但是會根據GPU供應商而改變
      • best approach is to use ETC where possible 
        最好的方法是儘可能使用ETC
      • alpha textures can use two ETC files with one channel being for alpha 
        透明紋理可以使用兩個ETC文件,其中一個通道用於透明度
    • PVRTC - default for iOS (iOS的默認模式)
      • good for most cases 在大多數情況是好的
  • Textures having Get/Set pixels enabled - this doubles the memory footprint; uncheck it unless Get/Set is needed 
    開啓了Get/Set像素的紋理 – 內存佔用會加倍,除非需要,否則不要選中Get/Set
  • Textures loaded from JPEG/PNGs at runtime will be uncompressed 
    在運行時從JPEG/PNG加載的紋理將不會被壓縮
  • Big mp3 files marked as decompress on load 
    被標記爲加載時解壓(decompress on load)的大型mp3文件
  • Additive scene loading 附加的場景加載
  • Unused Assets that remain uncleaned in memory 內存中保留了沒有被清理的未使用的Assets
    • Static fields 靜態字段
    • not unloaded asset bundles 未卸載的資源包
  • If it randomly crashes, try on a devkit or a device with 2 GB memory (like iPad 3). 
    如果它隨機崩潰,嘗試在一個開發機(devkit)或一個具有2GB內存(例如iPad 3)的設備上運行。

Sometimes there's nothing in the console, just a random crash.

有時控制檯裏沒有發生任何狀況,而僅僅是一個隨機崩潰。

  • Fast script call and stripping may lead to random crashes on iOS. Try without them. 
    在iOS上,快速腳本調用和代碼剝離可能會導致崩潰。儘量不要使用它們。