Unity Community



  1. Location
    Paris
    Posts
    3,730

    The struct / static typing performance power-up

    Hey,

    Just wanted to share a nice performance increase I've been getting:

    I'm currently reworking my code to optimize the hell out of it.

    2 things I'm doing so far:

    • grouping all my class data into struct blocks.


    For example, before I had:

    Code:  
    public class MyClass : MonoBehaviour {

        string _objectName;
        int _objectNumber;
        AnotherClass _objectClass;

    }

    Now I have:
    Code:  
    public struct _objectStruct {

        public string _name;
        public int _number;
        public AnotherClass _class;

    }

    public class MyClass : MonoBehaviour {

        private _objectStruct _object;

    }

    I'm doing this for everything that can be grouped into a struct, with a separate struct for each job. It makes things much clearer to read.
    And according to MSDN, it also performs better (I'll illustrate this later).
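
One thing worth keeping in mind when grouping fields like this: a C# struct is a value type, so assigning it or passing it to a method copies every field. Here is a minimal sketch of that behaviour in plain C# (no Unity needed). `ObjectData`, `StructCopyDemo` and the empty `AnotherClass` are hypothetical stand-ins for the post's types, not the author's actual code.

```csharp
using System;

// Hypothetical stand-in for the post's AnotherClass.
public class AnotherClass { }

// Same shape as the post's _objectStruct (names are assumptions).
public struct ObjectData
{
    public string Name;
    public int Number;
    public AnotherClass Class;
}

public static class StructCopyDemo
{
    // Receives a copy: the change does not reach the caller.
    public static void RenumberCopy(ObjectData data) { data.Number = 99; }

    // Receives a reference to the caller's struct: the change is visible.
    public static void RenumberInPlace(ref ObjectData data) { data.Number = 99; }

    public static void Main()
    {
        var a = new ObjectData { Name = "Player1", Number = 1, Class = new AnotherClass() };

        var b = a;      // copies all three fields
        b.Number = 2;   // does not touch a
        Console.WriteLine(a.Number);   // 1

        RenumberCopy(a);
        Console.WriteLine(a.Number);   // 1

        RenumberInPlace(ref a);
        Console.WriteLine(a.Number);   // 99

        // Reference-typed fields are copied as references, so both structs
        // still point at the same AnotherClass object.
        Console.WriteLine(ReferenceEquals(a.Class, b.Class));   // True
    }
}
```

So if a struct needs to be modified by another method, pass it with `ref` (or keep it as a field and modify it in place), otherwise the change lands on a copy.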

    • replacing all my class instances with a single static Superclass.


    Before I had:

    Code:  
    public class PositionManager : MonoBehaviour {
        // yadda yadda
    }

    public class ActionManager : MonoBehaviour {
        // yadda yadda
    }

    public class MainClass : MonoBehaviour {

        private ActionManager _actionManager;
        private PositionManager _positionManager;

        void Awake() {
            // Each script looks up and stores its own references.
            _actionManager = (ActionManager) gameObject.GetComponent(typeof(ActionManager));
            _positionManager = (PositionManager) gameObject.GetComponent(typeof(PositionManager));
        }

        void Main() {
            // stuff using _actionManager & _positionManager
        }

    }

    public class OtherClass : MonoBehaviour {

        private PositionManager _positionManager;

        void Awake() {
            _positionManager = (PositionManager) gameObject.GetComponent(typeof(PositionManager));
        }

        void Main() {
            // stuff using _positionManager
        }

    }

    And this kind of code was repeated as many times as there were characters on screen.

    Now I have:

    Code:  
    public class Superclass : MonoBehaviour {

        public static PositionManager[] _positionManagers;
        public static ActionManager[] _actionManagers;

        private static string[] _charNames;

        void Awake() {

            int _numberOfCharacters = 2;
            _charNames = new string[]{"Player1", "Player2"};

            _positionManagers = new PositionManager[_numberOfCharacters];
            _actionManagers = new ActionManager[_numberOfCharacters];

        }

        public static void Create_ActionManager(int _char) {
            _actionManagers[_char] = GameObject.Find(_charNames[_char]).AddComponent(typeof(ActionManager)) as ActionManager;
        }

        public static ActionManager Get_ActionManager(int _char) {
            return _actionManagers[_char];
        }

        public static void Create_PositionManager(int _char) {
            _positionManagers[_char] = GameObject.Find(_charNames[_char]).AddComponent(typeof(PositionManager)) as PositionManager;
        }

        public static PositionManager Get_PositionManager(int _char) {
            return _positionManagers[_char];
        }

    }

    // skipping ActionManager & PositionManager, they're the same as in the "before" example

    public class MainClass : MonoBehaviour {

        void Main() {

            int _char = _someCharNumber;

            Superclass.Create_ActionManager(_char);
            Superclass.Create_PositionManager(_char);

            // stuff using Superclass.Get_ActionManager(_char) & Superclass.Get_PositionManager(_char)

        }

    }


    In short, I'm dramatically reducing the number of instances in memory by moving all the storage and lookups into a single class, which is accessible from anywhere because it is static (= global).
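
The same idea can be tried outside Unity. Below is a minimal sketch of a static registry in plain C#; `Registry`, `Init` and the two tiny manager classes are hypothetical stand-ins, and plain `new` replaces the post's `GameObject.Find(...).AddComponent(...)` calls. Note that in this sketch each character still gets its own manager objects; what goes away are the per-script reference fields and lookups.

```csharp
using System;

// Hypothetical stand-ins for the post's MonoBehaviour managers.
public class PositionManager { public int X; }
public class ActionManager { public string Current = "idle"; }

// One static registry, indexed by character number.
public static class Registry
{
    private static PositionManager[] _positionManagers = new PositionManager[0];
    private static ActionManager[] _actionManagers = new ActionManager[0];

    // Creates one manager of each kind per character.
    public static void Init(int numberOfCharacters)
    {
        _positionManagers = new PositionManager[numberOfCharacters];
        _actionManagers = new ActionManager[numberOfCharacters];
        for (int i = 0; i < numberOfCharacters; i++)
        {
            _positionManagers[i] = new PositionManager();
            _actionManagers[i] = new ActionManager();
        }
    }

    public static PositionManager GetPositionManager(int character)
    {
        return _positionManagers[character];
    }

    public static ActionManager GetActionManager(int character)
    {
        return _actionManagers[character];
    }
}

public static class RegistryDemo
{
    public static void Main()
    {
        Registry.Init(2);

        // Any class can reach any character's managers without holding its own reference.
        Registry.GetPositionManager(0).X = 3;
        Registry.GetActionManager(1).Current = "punch";

        Console.WriteLine(Registry.GetPositionManager(0).X);        // 3
        Console.WriteLine(Registry.GetActionManager(1).Current);    // punch
    }
}
```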


    I'm not done converting all my code yet, but before this struct / static conversion I was hitting a frametime of 19ms to 23ms on a certain stage of my game (see attached picture).

    Now I'm hitting 17.2ms to 22ms.

    iPhone Unity internal profiler stats:
    cpu-player> min: 5.2 max: 15.8 avg: 10.2
    cpu-ogles-drv> min: 3.2 max: 9.7 avg: 3.9
    cpu-present> min: 1.0 max: 13.7 avg: 2.8
    frametime> min: 15.3 max: 33.2 avg: 17.2
    draw-call #> min: 6 max: 6 avg: 6 | batched: 7
    tris #> min: 10164 max: 10164 avg: 10164 | batched: 392
    verts #> min: 5198 max: 5198 avg: 5198 | batched: 264
    player-detail> physx: 0.3 animation: 1.4 culling 0.3 skinning: 5.5 batching: 0.1 render: 2.4 fixed-update-count: 0 .. 2
    mono-scripts> update: 0.1 fixedUpdate: 0.0 coroutines: 0.1
    mono-memory> used heap: 700416 allocated heap: 1417216 max number of collections: 0 collection total duration: 0.0

    (All tests on 3GS)

    20ms is an average of 50 FPS.

    17.2ms is about 58 FPS.
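
For reference, the conversion behind these figures is FPS = 1000 / frametime in ms. A tiny sketch, where `FrameTimeDemo` and `ToFps` are just illustrative names:

```csharp
using System;

public static class FrameTimeDemo
{
    // FPS is the number of frames that fit into 1000 ms.
    public static double ToFps(double frameTimeMs)
    {
        return 1000.0 / frameTimeMs;
    }

    public static void Main()
    {
        // Frame times quoted in the post, in milliseconds.
        foreach (double ms in new[] { 23.0, 20.0, 17.2 })
        {
            Console.WriteLine(ms + " ms -> " + Math.Round(ToFps(ms), 1) + " FPS");
        }
    }
}
```

With these inputs, 20 ms gives exactly 50 FPS and 17.2 ms gives roughly 58 FPS.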

    That makes me very optimistic about the final FPS, thanks to this kind of code architecture optimization.


  2. Location
    japan
    Posts
    989
    Looks interesting.

    I guess I need to play with structs too ^^

    Thanks for the tip, n0mad.


  3. Location
    Perth, WA
    Posts
    146
    You know that this:

    mono-scripts> update: 0.1 fixedUpdate: 0.0 coroutines: 0.1
    Is the time spent in your scripts? i.e. what you are optimizing?

    Most of your time is spent skinning your models and general Unity engine work (sending verts to the graphics chip, scene management). You'll get better improvements by trying to reduce draw calls or vertex count of characters.
    Spinfast & Staring Man Games
    Baseball '09
    Cricket '09
    Pools of Blood


  4. Location
    Paris
    Posts
    3,730
    Actually, this is the profile for the already optimized code.

    Also, it is taken from the fastest profile of all. Since the profiler gives no min/max/average for time spent in scripts, you can't see the key values that would show how much faster this architecture is.

    The struct optimization comes from better memory access. The static one comes from far fewer instances (less memory eaten) and far fewer null checks (every time you access an instance, there is a null check).

    Finally, with so many graphical features and details, 6 draw calls is the strict minimum I can reach. 10k tris total can't go any lower either if I don't want my chars to look like Virtua Fighter cubes and my background to be pointless.

    I'm judging the performance gain by watching the frametime rather than the other details, as frametime is the final ... frame time
    (even if it depends on more parameters than just code time).

    Lastly, I can really feel an overall performance boost. Plus there's an important feature: direct access to any property of any fighter from anywhere (no more getters/setters/endless parameter passing between classes).

    Actually, this is not the first time someone has posted about struct/static optimization in these forums.
    I've seen it several times here since I joined (last year).


    Edit:

    Here's an old profiler output I found from the very same level:

    iPhone Unity internal profiler stats:
    cpu-player> min: 8.1 max: 13.5 avg: 10.8
    cpu-ogles-drv> min: 2.9 max: 5.3 avg: 3.4
    cpu-present> min: 0.5 max: 34.0 avg: 15.7
    frametime> min: 16.5 max: 49.2 avg: 32.2
    draw-call #> min: 7 max: 7 avg: 7 | batched: 0
    tris #> min: 9134 max: 9134 avg: 9134 | batched: 0
    verts #> min: 4488 max: 4488 avg: 4488 | batched: 0
    player-detail> physx: 0.4 animation: 1.7 culling 0.3 skinning: 3.7 batching: 0.0 render: 2.1 fixed-update-count: 0 .. 2
    mono-scripts> update: 2.0 fixedUpdate: 0.3 coroutines: 0.2
    mono-memory> used heap: 430080 allocated heap: 1429504 max number of collections: 0 collection total duration: 0.0
    (from this thread)

    In this old version (already iUnity 1.5), there was much less code, and far fewer graphical features/manipulations.


    Before :

    mono-scripts> update: 2.0 fixedUpdate: 0.3 coroutines: 0.2

    Now :

    mono-scripts> update: 0.1 fixedUpdate: 0.0 coroutines: 0.1


    Speaks for itself.
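
To put a single number on each of those two lines, here is a small sketch that sums the values of a `mono-scripts>` line. `MonoScriptsDemo` and `TotalMs` are hypothetical helper names, and the parsing simply assumes the space-separated layout shown above. Keep in mind that the two profiles come from different builds (the old one had much less code), so this is a rough comparison only.

```csharp
using System;
using System.Globalization;

public static class MonoScriptsDemo
{
    // Sums every numeric token in a line such as
    // "mono-scripts> update: 2.0 fixedUpdate: 0.3 coroutines: 0.2".
    public static double TotalMs(string line)
    {
        double total = 0.0;
        foreach (string token in line.Split(' '))
        {
            double value;
            // Labels like "update:" fail to parse and are skipped.
            if (double.TryParse(token, NumberStyles.Float, CultureInfo.InvariantCulture, out value))
            {
                total += value;
            }
        }
        return total;
    }

    public static void Main()
    {
        double before = TotalMs("mono-scripts> update: 2.0 fixedUpdate: 0.3 coroutines: 0.2");
        double now = TotalMs("mono-scripts> update: 0.1 fixedUpdate: 0.0 coroutines: 0.1");

        Console.WriteLine("before: " + before.ToString("F1", CultureInfo.InvariantCulture) + " ms"); // before: 2.5 ms
        Console.WriteLine("now:    " + now.ToString("F1", CultureInfo.InvariantCulture) + " ms");    // now:    0.2 ms
    }
}
```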

    One last point: I'm not putting that much horsepower into code, as you can see. Essentially, it is hit tests, movement, combos, HUD display updates, and dynamic background animation.

    I guess the upgrade would be much more visible in more complicated types of games.


  5. Location
    San Francisco & Caribbean
    Posts
    158
    What you are experiencing here is a performance increase due to data locality. By using native arrays and structs, you are tightly packing all your data into one big chunk of contiguous memory, increasing your chances of hitting your L1 cache.

    .NET (Mono) is a beast, and the chance of anything being in cache is almost a miracle. But the profiler shows that there is indeed performance to gain.
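
A minimal sketch of the layout difference, in plain C#. An array of structs stores its elements inline, one after another, while an array of class instances stores references to objects allocated separately. `FighterState`, `FighterObject` and `LocalityDemo` are hypothetical names, and the sketch only shows the two layouts; it makes no claim about how much faster one is on a given device.

```csharp
using System;

// Value type: an array of these holds X, Y and Health inline, back to back.
public struct FighterState
{
    public float X;
    public float Y;
    public int Health;
}

// Reference type: an array of these holds references; each object is a separate heap allocation.
public class FighterObject
{
    public float X;
    public float Y;
    public int Health;
}

public static class LocalityDemo
{
    public static int TotalHealth(FighterState[] fighters)
    {
        int total = 0;
        for (int i = 0; i < fighters.Length; i++)
        {
            total += fighters[i].Health;   // walks one contiguous block
        }
        return total;
    }

    public static int TotalHealth(FighterObject[] fighters)
    {
        int total = 0;
        for (int i = 0; i < fighters.Length; i++)
        {
            total += fighters[i].Health;   // follows one reference per element
        }
        return total;
    }

    public static void Main()
    {
        var packed = new FighterState[1000];
        var scattered = new FighterObject[1000];
        for (int i = 0; i < 1000; i++)
        {
            packed[i].Health = 100;                               // element already exists in the array
            scattered[i] = new FighterObject { Health = 100 };    // one allocation per element
        }

        Console.WriteLine(TotalHealth(packed));      // 100000
        Console.WriteLine(TotalHealth(scattered));   // 100000
    }
}
```

One caveat: a struct field of a reference type (a `string` or a class, as in the first post's struct) only stores the reference inline; the data it points to still lives elsewhere on the heap.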
    rozgo - developer of
    Rawbots - robots crafting sandbox game
    http://rawbots.net
    twitter.com/rozgo


  6. Location
    Paris
    Posts
    3,730
    That's very interesting.

    Since I'd guess L1 isn't very big, would this perf gain only work for small code engines?

    (like platform/fighting games, but not RTS/RPG games, for example)


  7. Location
    San Francisco & Caribbean
    Posts
    158
    This, in theory, could work with any game, as long as you can find the best algorithm to access your data (like spatial algorithms).

    For instance, a very simple case: imagine having a huge array. For every 20 consecutive elements you have an entry in another array that says whether this bucket (20 elements) is dirty and needs computing; otherwise you skip it (all 20). From a list of a thousand elements you'll be able to skip most of it. You will get a few cache misses, but most data will be just a prefetch away.
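
The bucket idea above could be sketched like this in plain C#. `DirtyBucketDemo`, `MarkDirty` and `Update` are hypothetical names, and the `+= 1.0f` is only a placeholder for whatever per-element work the game really does.

```csharp
using System;

public static class DirtyBucketDemo
{
    public const int BucketSize = 20;

    // Flags the bucket that contains the given element as needing work.
    public static void MarkDirty(bool[] dirty, int index)
    {
        dirty[index / BucketSize] = true;
    }

    // Recomputes only the elements of dirty buckets and returns how many were touched.
    public static int Update(float[] values, bool[] dirty)
    {
        int touched = 0;
        for (int b = 0; b < dirty.Length; b++)
        {
            if (!dirty[b]) continue;   // skip all 20 elements at once

            int start = b * BucketSize;
            int end = Math.Min(start + BucketSize, values.Length);
            for (int i = start; i < end; i++)
            {
                values[i] += 1.0f;     // placeholder for the real computation
                touched++;
            }
            dirty[b] = false;          // bucket is clean again
        }
        return touched;
    }

    public static void Main()
    {
        var values = new float[1000];
        var dirty = new bool[(values.Length + BucketSize - 1) / BucketSize];   // 50 buckets

        MarkDirty(dirty, 5);
        MarkDirty(dirty, 17);    // same bucket as element 5
        MarkDirty(dirty, 990);

        Console.WriteLine(Update(values, dirty));   // 40
        Console.WriteLine(Update(values, dirty));   // 0
    }
}
```

Here two dirty buckets out of fifty mean 40 elements are processed and 960 are skipped; the second pass touches nothing because every flag was cleared.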

    Core engineers are good at this kind of stuff. If you can ask the right questions, they will have a good data structure and spatial algorithm for you.

    I learned that it's very important to understand your data and access behavior in order to get a good solution from a core engineer (or Wikipedia). Those dudes are grumpy most of the time.

    EDIT: Look at Google for the best example that it's not the size of your data but the power of your spatial algorithm. Every time I think the iPhone can't handle something, I just think: if Google can search the world in milliseconds, I can squeeze in a bit more data. (Not really a symmetric comparison, but what the heck)


  8. Location
    Paris
    Posts
    3,730
    Yeah, that's a good comparison.

    Now you've made my SuperGeek side grow even bigger.
    I just love gamemaking.

    I can't imagine what a pain it must be for AAA console / PC games to think the way you described, with millions of lines of code.


  9. Posts
    1,082
    Interesting stuff. I'll look into using structs myself.

    I may be completely missing the point here, but I don't get how calling a function like Superclass.Get_PositionManager(_char) can be any faster than having this stored in class properties. If you had an Update function in your MainClass and it needed to access the PositionManager, would it call these functions each time? Or would you have the Update method in your Superclass?
    UniTile - 2d Tile-based Map Editor for Unity - now also available in the Asset Store.

    My games :
    - ReRave (for iOS and Arcade)
    - Robin Hood - Archer of the Woods (published by Chillingo/Clickgamer)
    - Pig Shot (published by Nexx Studio)
    - Captain Ludwig
    - Pinkvasion
    - Deep
    - iPigeon


  10. Location
    Paris
    Posts
    3,730
    I practically don't use Update() at all (99% coroutines only), but anyway:

    Declaring something static keeps it in memory from the beginning to the end of the process. A static class can't be instantiated, which means you can only access it by calling it directly through its class: ClassName.MyObjectFunction();

    When you create an instance, not only does it use more memory (sorry, I can't find a proper illustrative link anymore), but you have to multiply that by the number of classes using it. So in the end, this can take quite some memory.
    Plus, every time you access an instance, the runtime does a null check so it can throw a NullReferenceException if the instance has not been created.

    Static classes are always there, so calling them can never throw a NullReferenceException for a missing instance. They therefore don't need the null check, and don't need multiple copies as they are unique.
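
A minimal sketch of that difference in plain C#. `InstanceManager`, `StaticManager` and `NullCheckDemo` are hypothetical names. It only shows the behaviour (an instance call on a missing object throws, a static call has no instance that could be missing) and does not measure what the check costs.

```csharp
using System;

public class InstanceManager
{
    public int Value = 42;
    public int GetValue() { return Value; }
}

public static class StaticManager
{
    private static int _value = 42;
    public static int GetValue() { return _value; }
}

public static class NullCheckDemo
{
    // Calls an instance method and reports what happened.
    public static string TryInstance(InstanceManager manager)
    {
        try
        {
            return manager.GetValue().ToString();
        }
        catch (NullReferenceException)
        {
            return "NullReferenceException";
        }
    }

    public static void Main()
    {
        InstanceManager missing = null;

        Console.WriteLine(TryInstance(missing));                 // NullReferenceException
        Console.WriteLine(TryInstance(new InstanceManager()));   // 42
        Console.WriteLine(StaticManager.GetValue());             // 42
    }
}
```

A caveat: a static field that holds a reference can still be null. In the Superclass example from the first post, the arrays are null until Awake() has run, and each slot is null until the matching Create_ method has been called.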


    Each time you want to access these static objects from elsewhere, you have to type the whole path (Superclass._method(_param)).

    This can get visually heavy when you have tons of operations to perform. A simple way to avoid that is to create shortcut methods:

    MyClass _myFunction(_params) {
        return Superclass._myClass._myFunction(_params);
    }

    This keeps it as visually clear as an instance call: you just write the function name and the parentheses.

    Before :

    myClass._myFunction(_params);

    After :

    _myFunction(_params);
