Translating thousands of gameobjects

Discussion in 'Scripting' started by SukhvinderSingh, Jun 12, 2017.

  1. SukhvinderSingh

    SukhvinderSingh

    Joined:
    Jul 18, 2016
    Posts:
    55
    I have written a script that spawns thousands of game objects at a specified frequency and translates them towards the player (they are destroyed after going into a destroy zone).

    The code to translate objects is this:
    Code (CSharp):
    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine;

    public class ObjectMove : MonoBehaviour {

        public float speed = 20f;
        public bool rot;
        public float DegPerSecond;
        float dir = -1;

        // Use this for initialization
        void Start () {

        }

        // Update is called once per frame
        void Update () {
            transform.Translate (new Vector3 (0, 0, speed * dir * Time.deltaTime), Space.World);
            if (rot)
                transform.Rotate (0, Time.deltaTime * DegPerSecond, 0, Space.Self);
        }
    }

    Now actually this script is attached to a prefab and my spawner script spawns thousands of prefabs in the scene. This means since the prefabs already have this code attached, the moment they are instantiated, they start moving towards the player.

    The problem I am getting is high CPU usage by this script. I ran a deep profile and found that 70% of the CPU is utilized by this script. Also, time in ms used is pretty large (22 ms) by the update function of this script alone.

    Is there any alternative to moving (translate and rotate) thousands of instantiated prefabs in the scene with less CPU overhead?
     
  2. ShilohGames

    ShilohGames

    Joined:
    Mar 24, 2014
    Posts:
    3,015
    Yes, the best alternative is an instancing pool, which is similar to an object pool but does not have all of those game objects. With an instancing pool, you use an array of a custom struct to track where your objects would be if they were gameobjects, but you don't have the overhead of gameobjects. Then you set up your instancing pool to translate your array of structs into an array of Matrix4x4, so you can feed that information into the DrawMeshInstanced API method. That will reduce the draw calls to one call per DrawMeshInstanced call instead of one draw call per game object. This is how I handle all of the lasers in my space game.
     
  3. SukhvinderSingh

    SukhvinderSingh

    Joined:
    Jul 18, 2016
    Posts:
    55
    I am not getting a large number of draw calls; the problem is with the script itself. This script is using most of the CPU and reducing performance, as the profiler suggests. I want to translate objects, and translation alone is taking a huge amount of CPU.
     
  4. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,553
    Can you post a screenshot of a deep profile window?
     
  5. GarBenjamin

    GarBenjamin

    Joined:
    Dec 26, 2013
    Posts:
    7,441
    Well, how many thousands are you talking about? I'd guess that yes, updating say 10,000 objects this way using the per-object Update callback might well produce a performance impact. You can test using a manager. Kind of a one ring to rule them all.

    A single manager receives the callback to its Update method and it in turn loops through all objects translating and rotating them as needed. Should be much faster.

    But yeah best to probably post the profiler shots so people can see if they can solve it as is.
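
    For reference, a minimal sketch of the single-manager idea described above. The class name MoveManager and its Register/Unregister methods are made up for illustration, and the sketch assumes every object shares one speed and one rotation rate (in the original script those are per-object fields). Objects would need to unregister before being destroyed, otherwise the loop would touch a destroyed Transform.

    ```csharp
    using System.Collections.Generic;
    using UnityEngine;

    // Hypothetical manager: Unity calls Update once here instead of once per object.
    public class MoveManager : MonoBehaviour
    {
        public float speed = 20f;        // units per second, as in ObjectMove
        public float degPerSecond = 0f;  // set above 0 to also spin the objects

        // Transforms are stored once so they are not looked up per object per frame.
        private readonly List<Transform> movers = new List<Transform>();

        public void Register(Transform t)   { movers.Add(t); }
        public void Unregister(Transform t) { movers.Remove(t); }

        void Update()
        {
            float dt = Time.deltaTime;
            Vector3 step = new Vector3(0f, 0f, -speed * dt); // same direction as dir = -1
            float angle = degPerSecond * dt;

            for (int i = 0; i < movers.Count; i++)
            {
                Transform t = movers[i];
                t.Translate(step, Space.World);
                if (angle != 0f)
                    t.Rotate(0f, angle, 0f, Space.Self);
            }
        }
    }
    ```

    The spawner would call Register right after Instantiate, and the prefab itself would no longer need an Update method.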
     
  6. SukhvinderSingh

    SukhvinderSingh

    Joined:
    Jul 18, 2016
    Posts:
    55
  7. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    6,492
    I think there was a Unite video about the cost of having many Transforms in a scene; I bet the hit is from the transforms.
    I'll post the video if I can track back the sequence.
     
  8. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    6,492
    at 28:40
     
  9. ShilohGames

    ShilohGames

    Joined:
    Mar 24, 2014
    Posts:
    3,015
    Using an instancing pool like I described will drastically reduce your CPU and memory usage in addition to reducing draw calls. If you want to deliver hundreds of frames per second with thousands of projectiles onscreen, then you will eventually want to develop an instancing pool.

    If you want to try to use an object pool instead of an instancing pool, there is one more trick you can try. Rename the Update function in the script that is attached to all of those gameobjects. For example, change Update to ManualUpdate. That way Unity no longer calls into the C# code once per object, so that per-call overhead goes away. Then, after you have done that, add code to your object pool to call ManualUpdate on each of those gameobjects during the object pool's Update function. NOTE: This will help reduce the CPU usage, but not nearly as much as an instancing pool will.

    Also, if you want to use an object pool, don't destroy and recreate objects. Just disable/enable pooled objects. That will save you CPU time as well.
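
    A rough sketch of that object-pool variant, assuming hypothetical class names PooledMover and MoverPool and a prefab assigned in the Inspector. It is not code from any project mentioned in this thread; it only illustrates the two points above: the pool drives ManualUpdate itself, and objects are disabled and re-enabled rather than destroyed and re-created.

    ```csharp
    using UnityEngine;

    // Hypothetical pooled object: same movement as ObjectMove, but Unity never calls it directly.
    public class PooledMover : MonoBehaviour
    {
        public float speed = 20f;

        public void ManualUpdate(float dt)
        {
            transform.Translate(0f, 0f, -speed * dt, Space.World);
        }
    }

    // Hypothetical pool: creates everything up front, then only toggles the active state.
    public class MoverPool : MonoBehaviour
    {
        public PooledMover prefab;  // assumed to be assigned in the Inspector
        public int size = 1000;

        private PooledMover[] items;

        void Awake()
        {
            items = new PooledMover[size];
            for (int i = 0; i < size; i++)
            {
                items[i] = Instantiate(prefab, transform);
                items[i].gameObject.SetActive(false);
            }
        }

        // Reuse the first inactive object instead of calling Instantiate again.
        public PooledMover Spawn(Vector3 position)
        {
            for (int i = 0; i < items.Length; i++)
            {
                if (!items[i].gameObject.activeSelf)
                {
                    items[i].transform.position = position;
                    items[i].gameObject.SetActive(true);
                    return items[i];
                }
            }
            return null; // pool exhausted
        }

        // Call this from the destroy zone instead of Destroy().
        public void Despawn(PooledMover item)
        {
            item.gameObject.SetActive(false);
        }

        // The only Update that Unity calls for the whole pool.
        void Update()
        {
            float dt = Time.deltaTime;
            for (int i = 0; i < items.Length; i++)
            {
                if (items[i].gameObject.activeSelf)
                    items[i].ManualUpdate(dt);
            }
        }
    }
    ```

    In a real project each class would normally live in its own file named after the class.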
     
  10. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,553
  11. neoshaman

    neoshaman

    Joined:
    Feb 11, 2011
    Posts:
    6,492
    This video is also useful
     
  12. SukhvinderSingh

    SukhvinderSingh

    Joined:
    Jul 18, 2016
    Posts:
    55
    Thanks :)
     
  13. SiliconDroid

    SiliconDroid

    Joined:
    Feb 20, 2017
    Posts:
    302
    RE Optimizing the code you posted.

    Having an Update override per object has overhead that becomes significant in the thousands, certainly on weaker platforms.

    As @ShilohGames mentions: Cheaper if you store all objects in array or list (or child to some root object) and then enumerate all of them calling a DoUpdate func for each.

    Also don't do any new alloc within any per frame function scope if you can help it.

    For temp vars that exist for the life of the function, you can declare them at class scope; then at runtime they're ready to use without any alloc. Also, as we're on a single thread, you can declare them static (one instance per process) to preserve memory.

    Code (CSharp):
    static Vector3 DoUpdate_v3 = new Vector3(0, 0, 0);
    void DoUpdate()
    {
        DoUpdate_v3.z = speed * dir * Time.deltaTime;
        transform.Translate(DoUpdate_v3, Space.World);
        if (rot)
        {
            transform.Rotate(0, Time.deltaTime * DegPerSecond, 0, Space.Self);
        }
    }

    In this case you don't need the vector, though:

    Code (CSharp):
    void DoUpdate()
    {
        transform.Translate(0, 0, speed * dir * Time.deltaTime, Space.World);
        if (rot)
        {
            transform.Rotate(0, Time.deltaTime * DegPerSecond, 0, Space.Self);
        }
    }

    I struggle on mobile VR to have a lot of stuff happening while holding 60 FPS. One trick I also use is modulo clocking. Simple case: on odd frames clock odd array indexes, on even frames clock even ones. This gives a 30 Hz clock rate, which is acceptable for a lot of stuff even if it's visibly moving or animated.

    Also, when you clock like this you can have your DoUpdate func return a bool; false means "I don't need clocking again", and the clock manager can remove it from the list. Useful for things you want to keep pooled quietly when not being used. Yes, this can be done with coroutines, but again, they have noticeable overhead and GC alloc.

    To implement a clock manager, I would advise making all your scripts inherit MY_MonoBehavior, where MY_MonoBehavior is your own wrapper around (inherits) MonoBehaviour to which you add a virtual function "DoUpdate". You override that DoUpdate in the final use case derived class. This way you can keep references to all active clockables in a List<MY_MonoBehavior> and call all their DoUpdates quickly.
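
    A possible shape for such a clock manager, as a sketch only. ClockedBehaviour stands in for the MY_MonoBehavior wrapper, and ClockManager and ClockedMover are hypothetical names. The sketch assumes that doubling Time.deltaTime is a good enough approximation for items that are ticked every second frame.

    ```csharp
    using System.Collections.Generic;
    using UnityEngine;

    // Hypothetical base class, standing in for the MY_MonoBehavior wrapper.
    public class ClockedBehaviour : MonoBehaviour
    {
        // Return false to be dropped from the clock list.
        public virtual bool DoUpdate(float dt) { return false; }
    }

    // Example derived class: moves like ObjectMove until it is disabled.
    public class ClockedMover : ClockedBehaviour
    {
        public float speed = 20f;

        public override bool DoUpdate(float dt)
        {
            transform.Translate(0f, 0f, -speed * dt, Space.World);
            return gameObject.activeSelf; // stop being clocked once pooled/disabled
        }
    }

    public class ClockManager : MonoBehaviour
    {
        private readonly List<ClockedBehaviour> clocked = new List<ClockedBehaviour>();

        public void Add(ClockedBehaviour c) { clocked.Add(c); }

        void Update()
        {
            // Each item is ticked on every second frame, so it is given roughly
            // two frames' worth of time (an approximation, not an exact measure).
            float dt = Time.deltaTime * 2f;
            int parity = Time.frameCount & 1; // 0 on even frames, 1 on odd frames

            // Walk backwards so removing an item never skips an unvisited index.
            for (int i = clocked.Count - 1; i >= 0; i--)
            {
                if ((i & 1) != parity)
                    continue; // this index belongs to the other frame

                if (!clocked[i].DoUpdate(dt))
                {
                    // Swap-remove: O(1), but the item moved into slot i may be
                    // ticked one frame early or late the next time around.
                    int last = clocked.Count - 1;
                    clocked[i] = clocked[last];
                    clocked.RemoveAt(last);
                }
            }
        }
    }
    ```

    An item that returns false has to be passed to Add again when it is reused, for example when it is taken back out of a pool.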
     
    Last edited: Jun 14, 2017
  14. Brathnann

    Brathnann

    Joined:
    Aug 12, 2014
    Posts:
    7,186
  15. Korindian

    Korindian

    Joined:
    Jun 25, 2013
    Posts:
    584
    @ShilohGames Thanks for posting the instanced pool solution. I'm curious about how you manage projectiles which are not in use... since they're in an array, are you still iterating over non-active projectiles but not doing any logic in the manual update through a bool check? Thanks.
     
  16. ShilohGames

    ShilohGames

    Joined:
    Mar 24, 2014
    Posts:
    3,015
    In a standard object pool, the pooled objects are disabled when they are not in use. In an instanced pool, there are no objects to disable. There is an array of a custom struct that holds the information about where the projectiles are, which direction they are moving, whether the projectile is active, etc. But the array of the custom struct is not tied to any individual game objects in the scene. The array is literally just data, and the instanced pool class has to act on that data every frame (move stuff, check for collisions, etc) and then pass the information into an array of Matrix4x4 to send into the DrawMeshInstanced API method every frame.

    For example, here is a code snippet from a singleton class that manages one type of instanced pool. The complete class has another couple hundred lines of code. I am posting this snippet to help explain the data model within the class.

    Code (CSharp):
    public struct LaserProjectile
    {
        public Vector3 currentPosition;
        public Vector3 oldPosition;
        public Quaternion rotation;
        public Vector3 movementVector;
        public bool active;
        public GameObject FiredBy;
        public float timeToExpire;
    } // end struct

    private LaserProjectile[] ProjectileArray;
    private int projectileArraySize = 1500;

    private bool boolReceiveShadows = false;
    private MaterialPropertyBlock materialProperty;
    private Matrix4x4[] MatrixArray;
    private int matrixSize = 500;

    The Update function in the instanced pool handles the changes to the data, then populates and submits the draw information through DrawMeshInstanced. In the code example I posted above, the projectile array is longer than the Matrix4x4 array, so the Update function needs to do multiple submissions to the drawing API. For performance reasons, the same Matrix4x4 array is re-used each time. Only items marked as active in the projectile array are submitted to the drawing API.

    Each type of projectile gets its own instanced pool, so things like damage per projectile and linear speed are defined at the class level instead of per projectile. In my current game, there are dozens of instanced pools set up this way and then attached to a single object in the scene.

    And there is a public function called FireWeapon that lets ships and turrets add another projectile to the array. The added projectile is placed in the first inactive item within the custom struct array (ProjectileArray in my case).
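
    To make the data flow concrete, here is a stripped-down sketch of what such a pool's Update and FireWeapon could look like. It is not the actual class from the game: the name LaserPoolSketch, the field names, and the FireWeapon parameters are assumptions, and collision checks are left out. It assumes the mesh and material are assigned in the Inspector and that the material has GPU instancing enabled.

    ```csharp
    using UnityEngine;
    using UnityEngine.Rendering;

    // Hypothetical instanced pool: plain data in an array, no GameObject per projectile.
    public class LaserPoolSketch : MonoBehaviour
    {
        struct Projectile
        {
            public Vector3 position;
            public Quaternion rotation;
            public Vector3 velocity;
            public float timeToExpire;
            public bool active;
        }

        public Mesh mesh;          // assumed to be assigned in the Inspector
        public Material material;  // assumed to have "Enable GPU Instancing" ticked

        const int PoolSize = 1500;
        const int BatchSize = 500; // one DrawMeshInstanced call accepts at most 1023 instances

        readonly Projectile[] projectiles = new Projectile[PoolSize];
        readonly Matrix4x4[] matrices = new Matrix4x4[BatchSize];

        // Places a new projectile in the first inactive slot; returns false if the pool is full.
        public bool FireWeapon(Vector3 position, Quaternion rotation, float speed, float lifetime)
        {
            for (int i = 0; i < projectiles.Length; i++)
            {
                if (projectiles[i].active)
                    continue;

                projectiles[i].position = position;
                projectiles[i].rotation = rotation;
                projectiles[i].velocity = rotation * Vector3.forward * speed;
                projectiles[i].timeToExpire = Time.time + lifetime;
                projectiles[i].active = true;
                return true;
            }
            return false;
        }

        void Update()
        {
            float dt = Time.deltaTime;
            int count = 0;

            for (int i = 0; i < projectiles.Length; i++)
            {
                if (!projectiles[i].active)
                    continue;

                if (Time.time >= projectiles[i].timeToExpire)
                {
                    projectiles[i].active = false;
                    continue;
                }

                projectiles[i].position += projectiles[i].velocity * dt;
                matrices[count++] = Matrix4x4.TRS(
                    projectiles[i].position, projectiles[i].rotation, Vector3.one);

                // The matrix array is shorter than the pool, so submit whenever it fills up.
                if (count == BatchSize)
                {
                    Submit(count);
                    count = 0;
                }
            }

            if (count > 0)
                Submit(count);
        }

        void Submit(int count)
        {
            Graphics.DrawMeshInstanced(mesh, 0, material, matrices, count,
                null, ShadowCastingMode.Off, false);
        }
    }
    ```

    Because FireWeapon scans linearly for a free slot, firing costs more as the pool fills; keeping a list of free indexes would avoid that, but it is left out here to keep the sketch short.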
     
    Last edited: Jul 6, 2017
  17. Korindian

    Korindian

    Joined:
    Jun 25, 2013
    Posts:
    584
    @ShilohGames Thanks for the detailed explanation. I don't think I would have ever conceived of pooling a lot of projectiles in this way, so I appreciate you sharing this.
     
  18. benbenmushi

    benbenmushi

    Joined:
    Jan 29, 2013
    Posts:
    34
    this.transform actually calls GetComponent<Transform>() so you could also store your own transform in a script variable at Start() and use it in your Update()
     
  19. Baste

    Baste

    Joined:
    Jan 24, 2013
    Posts:
    6,294
    Nope. It doesn't call GetComponent<Transform>() at all. It does ask for the transform from the C++ engine, though, so caching it is faster. Decompiled, it looks like this:

    Code (csharp):
    public extern Transform transform
    {
        [GeneratedByOldBindingsGenerator]
        [MethodImpl(MethodImplOptions.InternalCall)]
        get;
    }

    A final optimization: from 5.6 and forward, Unity's got a Transform.SetPositionAndRotation method. It's faster than setting the position and the rotation separately, as each of those triggers a recalculation of the same Matrix4x4. Of course, if you're doing the smart thing @ShilohGames is recommending, it won't be relevant.
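
    As a sketch of both points together (caching the Transform and writing position and rotation in one call), the original script could be reworked roughly like this. The class name ObjectMoveCached is made up, and how much this actually gains would need to be checked in the profiler.

    ```csharp
    using UnityEngine;

    // Hypothetical rework of ObjectMove; requires Unity 5.6 or newer.
    public class ObjectMoveCached : MonoBehaviour
    {
        public float speed = 20f;
        public bool rot;
        public float DegPerSecond;

        private Transform cachedTransform;

        void Awake()
        {
            // Fetch the Transform from the engine once instead of on every access.
            cachedTransform = transform;
        }

        void Update()
        {
            float dt = Time.deltaTime;

            // Move along world -Z, matching dir = -1 with Space.World in the original.
            Vector3 position = cachedTransform.position;
            position.z -= speed * dt;

            // Spin around the object's own Y axis, matching Rotate(..., Space.Self).
            Quaternion rotation = cachedTransform.rotation;
            if (rot)
                rotation *= Quaternion.Euler(0f, DegPerSecond * dt, 0f);

            // One write instead of separate position and rotation assignments.
            cachedTransform.SetPositionAndRotation(position, rotation);
        }
    }
    ```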
     
  20. SiliconDroid

    SiliconDroid

    Joined:
    Feb 20, 2017
    Posts:
    302
    Thank you @ShilohGames,
    Discovered DrawMeshInstanced and now understand the concept of using it thanks to your post.

    I will try throwing some stuff around on mobile using this, see how it goes, I'm guessing for small things it's going to be fast, bigger things may cause lots of overdraw because no Z sorting:

    Wonder if the Matrix4x4[] can be managed in script to reduce overdraw and do frustum culling. I'm guessing instance render order is up from [0] through the array?

    EDIT: Disputed Space is EPIC! DrawMeshInstanced is giving you a crazy amount of weapon action!!
     
    Last edited: Jul 6, 2017
  21. ShilohGames

    ShilohGames

    Joined:
    Mar 24, 2014
    Posts:
    3,015
    Thank you. Yeah, DrawMeshInstanced is a real game changer.

    You could manually implement culling into your instancing pool if you run into a situation where you have so many units that the cost of setting up the culling is less than the cost of rendering all of the meshes. In my case, I have support for up to 4 players in local split screen, so culling would be very tricky. My projectiles are rendered to all cameras at once instead of manually to each camera. I don't do any manual culling within the instanced pools, and the frame rates are great, but there could be situations where manually culling some units would be advantageous.
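
    For a single camera, manual culling could be sketched along these lines. This is an illustration only, not something used in the pools described above: InstanceCulling and FillVisible are hypothetical names, each instance is approximated by a cube of side 2 * radius, and the split-screen case would need the test repeated per camera.

    ```csharp
    using UnityEngine;

    // Hypothetical helper: keeps only the instances whose bounds touch the camera frustum.
    public static class InstanceCulling
    {
        // Writes matrices for visible positions into 'output' and returns how many were written.
        public static int FillVisible(Camera cam, Vector3[] positions, int positionCount,
                                      float radius, Matrix4x4[] output)
        {
            // Note: this overload allocates a new Plane[] on every call.
            Plane[] planes = GeometryUtility.CalculateFrustumPlanes(cam);
            Vector3 size = Vector3.one * (radius * 2f);
            int written = 0;

            for (int i = 0; i < positionCount && written < output.Length; i++)
            {
                Bounds bounds = new Bounds(positions[i], size);
                if (GeometryUtility.TestPlanesAABB(planes, bounds))
                {
                    output[written++] = Matrix4x4.TRS(
                        positions[i], Quaternion.identity, Vector3.one);
                }
            }
            return written;
        }
    }
    ```

    The returned count would then be passed as the count argument of DrawMeshInstanced. Whether the per-instance test is cheaper than simply drawing everything depends on the instance count and mesh cost, as noted above.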
     
    Last edited: Mar 25, 2019