
Any iPad performance optimization wisdom beyond combine everything/fillrate sucks?

Discussion in 'iOS and tvOS' started by Angelo, Nov 11, 2010.

  1. Angelo

    Angelo

    Joined:
    Sep 23, 2009
    Posts:
    6
    Hey everyone, I'm working on a space shmup for iPad, and I'm trying to get more out of it. More bullets, enemies etc.
    I'm looking for any wisdom on optimization, and I'll list here what I have setup so far.

    I've used SpriteManager to combine most objects. I'm down to five draw calls, one for the game objects, one for the planet in the background, one for the enemy mothership and one for all the sprite based text in the game.

    The main atlas is PVRTC compressed (rgb 4bits) and the shader is a tweaked additive shader that doesn't take a tint color.

    Each object, bullet etc. has its own GameObject, behaviour script and sphere trigger/kinematic rigid body.
    Would managing them all from one gameobject/script offer any speed increase?

    Here's a screencast of the game in the Unity editor, http://screenr.com/Pok
    (the pauses are the video lagging not the game)
    And a screenshot of the same level on the iPad:

    That Venus level runs 28-50 fps on the iPad, depending on what's going on. I really need the game to be more stable than that.

    Anything might help, thanks.
     
  2. hippocoder

    hippocoder

    Digital Ape

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    Big textures also slow things down via bandwidth, not just fillrate.

    Draw calls are not all that bad on the ipad or iphone. You can have 10-15 draw calls with no change in fps.

    Do you use render to texture? That is bad.

    Do you use GUI in any form? That's bad!

    Fillrate: now fillrate is an interesting beast. If you have 100 objects spread around the screen not overlapping, it will be several times faster to render than if they all overlap. It becomes increasingly more expensive the more alpha things share the same pixel, exponentially even.

    But I noticed you said you had a gameobject with its own scripts for every single item. I think this is the problem, or part of the problem. If you're doing bullets you want one gameobject handling them all.

    Also, let's look at the garbage collector. How much work is it doing at the moment? I suspect that's one of the things that can go bad. Can you describe the nature of your slowdown? Is it chuggy? Does it happen on and off? Or is it all just dog slow? More info can help us diagnose.

    Do a quick test, just draw them on screen with no script attached in a typical manner (ie not all overlapping one place, but spread out as you would expect in the game).
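
    As a rough illustration of the "one object handling all the bullets" idea, here is a minimal, engine-agnostic sketch in C++. It is not Unity API: `Bullet`, `BulletManager` and the screen half-extents are hypothetical names made up for the example. The point is only that bullets become plain data in one array, updated by a single loop per frame instead of one script call per bullet.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical plain-data bullet: no per-bullet object or script.
struct Bullet {
    float x, y;    // position
    float vx, vy;  // velocity in units per second
};

// Hypothetical manager that owns every bullet and updates them in one loop.
class BulletManager {
public:
    BulletManager(float halfWidth, float halfHeight)
        : halfW_(halfWidth), halfH_(halfHeight) {}

    void spawn(float x, float y, float vx, float vy) {
        bullets_.push_back(Bullet{x, y, vx, vy});
    }

    // One call per frame moves all bullets and drops the ones off screen.
    void update(float dt) {
        std::size_t i = 0;
        while (i < bullets_.size()) {
            Bullet& b = bullets_[i];
            b.x += b.vx * dt;
            b.y += b.vy * dt;
            const bool offScreen = b.x < -halfW_ || b.x > halfW_ ||
                                   b.y < -halfH_ || b.y > halfH_;
            if (offScreen) {
                // Swap-and-pop: O(1) removal, bullet order does not matter.
                bullets_[i] = bullets_.back();
                bullets_.pop_back();
            } else {
                ++i;
            }
        }
    }

    std::size_t count() const { return bullets_.size(); }

private:
    std::vector<Bullet> bullets_;
    float halfW_, halfH_;
};
```

    In a Unity project the same shape would live in a single script's Update, with the array feeding whatever sprite batching is already in place. Collision checks are left out of this sketch.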
     
    Last edited: Nov 11, 2010
  3. Angelo

    Angelo

    Joined:
    Sep 23, 2009
    Posts:
    6
    Hey quick reply, thanks!

    I'm not using render to texture, I don't use GUI except for that fps counter.

    Okay I'll try managing the bullets from one GameObject.

    The game's speed seems to be directly related to how many things are on screen, so it just feels slower but not chuggy/choppy. It does, however, skip every 9-10 seconds now that you mention it. I have a 13 second audio loop playing, maybe it has something to do with that?

    I'll do that test and post the results.

    update:
    Looks like managing them from one GameObject will help, hopefully that will do the trick. I'll post an update when I implement that.
     
    Last edited: Nov 12, 2010
  4. warmi

    warmi

    Joined:
    Aug 12, 2010
    Posts:
    18
    Nope, once you have blending on, it is irrelevant how your objects overlap. Blending is applied to the framebuffer and costs the same regardless of your alpha values. Completely transparent parts of your sprites are just as expensive as everything else, so minimize each sprite's pixel coverage (tighter sprite boundaries).
     
  5. marjan

    marjan

    Joined:
    Jun 6, 2009
    Posts:
    563
    How do you build?

    The fastest setting in Unity 3 should be:
    + Armv6 + Armv7
    + strip bytecode
    + fast but no exceptions

    Inside the resulting AppController.mm you should adjust some values:
    #define USE_DISPLAY_LINK_IF_AVAILABLE 1
    #define kFPS 60.0 (well, for testing I use 60 or higher)
    #define kAccelerometerFrequency 1.0 (this value is the frequency at which the accelerometer is used; if you don't use it at all, set it to 1)
    and one more define that toggles OpenGL ES 2.0 or 1.1. Set that to 0.

    There are some other values you can tweak; see the documentation on that.
     
  6. Alexey

    Alexey

    Unity Technologies

    Joined:
    May 10, 2010
    Posts:
    1,624
    and yes, can you paste here some internal profiler readings? Just to play safe and optimize where it is needed 8).
     
  7. mudloop

    mudloop

    Joined:
    May 3, 2009
    Posts:
    1,107
    Sorry for hijacking the thread, but I have a question about this:
    In my new menu-system, I use one 2048x2048 image for everything on iPad and retina devices (1024x1024 version on older devices). I made one big image so it's easy to skin and so it all batches - and I noticed it is a bit slow on my iPad but not on my iPod Touch 4G which I thought had similar hardware, just a different resolution. Would splitting it up into 4 1024x1024 images help a lot? It's quite a bit of work (remapping coordinates etc) so I'd like to know if it makes much difference before I do this.
    Here's a web demo (with temporary graphics I found online), in 480x320 resolution : http://mudloop.com/menu_system/
     
  8. hippocoder

    hippocoder

    Digital Ape

    Joined:
    Apr 11, 2010
    Posts:
    29,723

    No, no :) It won't help at all. It doesn't care whether it's on another texture or not; the cost is all in having to read that many pixels. I don't think you can do much more to speed it up except use 16-bit textures if possible in that particular instance. It might help if you are generating mipmaps too.
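
    To put rough numbers on the "16-bit textures" suggestion, here is a small sketch that computes the storage size of a texture's base level. `textureBytes` is a hypothetical helper, and storage size is only a proxy for the bandwidth actually used per frame, which also depends on how many texels get sampled.

```cpp
#include <cstdint>

// Bytes needed for the base level of a texture at a given bits-per-pixel.
// A full mipmap chain would add roughly one third on top of this.
std::uint64_t textureBytes(std::uint32_t width, std::uint32_t height,
                           std::uint32_t bitsPerPixel) {
    return static_cast<std::uint64_t>(width) * height * bitsPerPixel / 8;
}
```

    For the 2048x2048 atlas discussed here, this gives 16 MiB at 32 bits per pixel, 8 MiB at 16 bits, and 2 MiB at the 4 bits per pixel of the PVRTC format mentioned in the first post. Four 1024x1024 textures at 32 bits add up to the same 16 MiB as the single large one.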

    The problem is that the iPhone 4G has the same fillrate as the 3GS and you're on retina, so I doubt it's the actual texture size causing the problem; you're just drawing more with the same hardware. Although... you could try the 1024 texture on the 4G just to see if it's a texture fetch issue.

    You misread what I said. That's not what I meant at all.

    If you have 100 sprites with alpha 0 spread around the screen, it will be faster than 100 sprites with alpha 0 in the same place.

    That is because the hardware will re-sample every single pixel they overlap.
     
    Last edited: Nov 12, 2010
  9. mudloop

    mudloop

    Joined:
    May 3, 2009
    Posts:
    1,107
    Thanks for your response. But on my 4G it works smoothly (60fps), just not on my iPad, which I think is weird - the iPad doesn't have that many more pixels than the iPod Touch 4G. It's not a huge performance loss though, so I might be able to get it back with some other optimizations.
     
  10. hippocoder

    hippocoder

    Digital Ape

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    I read a lot about splitting textures too, I just don't understand it. The ipad and iphone 4 are not worlds apart in hardware. If you could find out why, that would be super cool :)
     
  11. Dreamora

    Dreamora

    Joined:
    Apr 5, 2008
    Posts:
    26,601
    It's unfortunately not that weird, smag.

    You can redraw the screen 3-4 times per frame pixelwise on the iPhone 4 / iPod touch 4G (4x more pixels than the 3GS) but only 2-3 times on the iPad (about 5.1 times more pixels than the 3GS). So if you run fine on the 4th gen, you might still be running close to the edge on the iPad and hit the limit from time to time, in which case performance drops significantly. Keep in mind that, unlike the draw call limit, which is a "relative limitation", the fillrate limit is absolute: hit the limit and you are "killed", not "hit the limit and it degrades". You have a fixed number of pixels you can render per second, which gives a fixed number of pixels you can render per frame: the pixels per second divided by the desired FPS.
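
    The per-frame budget described above is just a division, sketched below. The function names are hypothetical, and the fill rate has to be supplied by the caller: no real device figure is assumed here. The screen sizes in the note after the code are the published resolutions (1024x768 for the iPad, 960x640 for the iPhone 4).

```cpp
// Pixels the GPU can shade in one frame, given a fill rate in pixels/second.
double pixelsPerFrame(double fillRatePixelsPerSecond, double targetFps) {
    return fillRatePixelsPerSecond / targetFps;
}

// How many times the whole screen can be redrawn within that budget.
double fullScreenRedraws(double fillRatePixelsPerSecond, double targetFps,
                         int screenWidth, int screenHeight) {
    const double screenPixels =
        static_cast<double>(screenWidth) * screenHeight;
    return pixelsPerFrame(fillRatePixelsPerSecond, targetFps) / screenPixels;
}
```

    With the same fill rate and target FPS, the 1024x768 screen gets 614400 / 786432, or about 0.78 times, the redraws of the 960x640 screen. Halving the target from 60 to 30 FPS doubles the budget on either device.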

    To overcome overdraw kills like the one seen here, you will not be able to avoid OpenGL ES 2.0 with pixel shaders that combine the layers prior to rendering them to a RenderTexture, and/or the commonly used optimized meshes that represent your object in a tighter form for situations where you use a lot of alpha to "omit parts from rendering", i.e. cut holes with alpha. In such a case the mesh should cut that part out physically as well. You can afford thousands of wasted polygons on 4th-generation devices, and especially on the iPad, but you cannot afford wasted blended pixels :)

    Also, ensure that you only blend stuff that needs blending. Don't blend background objects, for example. If something is always the farthest-back thing, fill the texture with black instead of alpha if the backdrop is black, and disable blending. That saves you considerable amounts of fillrate.
     
    Last edited: Nov 12, 2010
  12. warmi

    warmi

    Joined:
    Aug 12, 2010
    Posts:
    18
    Fillrate problems are very hard to tackle.
    As far as 2D rendering goes, besides using compressed images (which generally is not possible with pixel art), the only thing you can do is move away from rectangular sprites and use non-rectangular, tightly bound sprites.

    This works well for sprites with irregular shapes where the majority of the sprite is solid and only the outline is transparent.
    Here is an example:
    http://www.warmi.net/tmp/sprite2.png

    The idea is to split your sprite into two parts, solid and transparent, and define your own vertex coordinates for both parts.
    http://www.warmi.net/tmp/sprite1.png (the white part is completely solid while the gray part is transparent)

    It complicates rendering a bit and trades fillrate for vertex processing time, but if you have fillrate problems, this sort of approach can double your framerate.
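
    One way to judge whether a tight outline is worth the extra vertices is to compare its area with the rectangle a plain quad would cover. The sketch below does that with the shoelace formula. `Point`, `polygonArea`, `boundingBoxArea` and `fillSaved` are hypothetical names for the example, and the outline is assumed to be a simple (non-self-intersecting) polygon with its vertices listed in order.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { float x, y; };

// Area of a simple polygon via the shoelace formula (vertices in order).
float polygonArea(const std::vector<Point>& poly) {
    float twiceArea = 0.0f;
    for (std::size_t i = 0; i < poly.size(); ++i) {
        const Point& a = poly[i];
        const Point& b = poly[(i + 1) % poly.size()];
        twiceArea += a.x * b.y - b.x * a.y;
    }
    return std::fabs(twiceArea) * 0.5f;
}

// Area of the axis-aligned rectangle a plain quad sprite would cover.
float boundingBoxArea(const std::vector<Point>& poly) {
    if (poly.empty()) return 0.0f;
    float minX = poly[0].x, maxX = poly[0].x;
    float minY = poly[0].y, maxY = poly[0].y;
    for (const Point& p : poly) {
        minX = std::min(minX, p.x);
        maxX = std::max(maxX, p.x);
        minY = std::min(minY, p.y);
        maxY = std::max(maxY, p.y);
    }
    return (maxX - minX) * (maxY - minY);
}

// Fraction of the quad's pixels that the tight outline avoids touching.
float fillSaved(const std::vector<Point>& poly) {
    const float box = boundingBoxArea(poly);
    return box > 0.0f ? 1.0f - polygonArea(poly) / box : 0.0f;
}
```

    A diamond-shaped outline, for example, covers half of its bounding rectangle, so it would skip about half of the blended pixels a full quad pays for.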
     
  13. hippocoder

    hippocoder

    Digital Ape

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    Nice tips Warmi. I agree, good tips.

    Special attention must be paid to UI elements in 3D games. It becomes really easy to just waste huge amounts of fill rate on the user interface part.

    I notice in the above demo movie we see big overlays in the middle - I assume these are large transparent quads being overlaid in the same place? This is a "worst-case" scenario for the iPad. If you could turn any big images which have a lot of alpha into thin rotated quads which don't burn fill-rate on invisible pixels (alpha 0 on the texture is still drawn), you would probably solve this issue right away.

    I see at least 4 large quads in the middle of the screen; these take up a quarter of the screen. Quads drawn on top of each other do not mean 4x the fillrate; it means over 8x the fillrate, as the topmost quad will sample the pixels of the ones under it, and so on for the one under that (you get the idea). Either split all those images up into polygon shapes or find a way to combine them into one quad and just animate it via a sprite strip.

    To sum up: the ipad is excellent at drawing masses of polys with no transparency, but quite bad at drawing any transparent polys which are over other transparent polys.

    Please test your project without the 2 purple things and white rune in the middle, to see if that was the case.
     
    Last edited: Nov 12, 2010
  14. Angelo

    Angelo

    Joined:
    Sep 23, 2009
    Posts:
    6
    Okay wow, very interesting!

    @marjan
    I am building armv6, stripping disabled(don't have pro) and slow and safe.
    Changing it to armv6-armv7 / fast but no exceptions does help a bit, the same scene runs consistently at 40-45fps
    thanks, is there a general rule about using fast but no exceptions? It sounds unsafe, but what does it really mean?

    These are my settings:
    #define USE_OPENGLES20_IF_AVAILABLE 1
    #define USE_DISPLAY_LINK_IF_AVAILABLE 1

    //#define FALLBACK_LOOP_TYPE NSTIMER_BASED_LOOP
    #define FALLBACK_LOOP_TYPE THREAD_BASED_LOOP
    //#define FALLBACK_LOOP_TYPE EVENT_PUMP_BASED_LOOP

    #define ENABLE_INTERNAL_PROFILER 1
    #define ENABLE_BLOCK_ON_GPU_PROFILER 0
    #define BLOCK_ON_GPU_EACH_NTH_FRAME 4
    #define INCLUDE_OPENGLES_IN_RENDER_TIME 0

    // --- CONSTANTS ----------------------------------------------------------------
    //

    #if FALLBACK_LOOP_TYPE == NSTIMER_BASED_LOOP
    #define kThrottleFPS 2.0
    #endif

    #if FALLBACK_LOOP_TYPE == EVENT_PUMP_BASED_LOOP
    #define kMillisecondsPerFrameToProcessEvents 3
    #endif

    #define kFPS 60.0
    #define kAccelerometerFrequency 0.0

    @Alexey
    Here are some frames from the internal profiler while I play the same venus level:
    iPhone Unity internal profiler stats:
    cpu-player> min: 17.4 max: 27.8 avg: 23.0
    cpu-ogles-drv> min: 0.4 max: 2.2 avg: 0.5
    cpu-waits-gpu> min: 0.1 max: 2.4 avg: 0.3
    cpu-present> min: 0.3 max: 0.7 avg: 0.3
    frametime> min: 18.8 max: 29.3 avg: 24.5
    draw-call #> min: 5 max: 5 avg: 5 | batched: 0
    tris #> min: 1514 max: 1514 avg: 1514 | batched: 0
    verts #> min: 3028 max: 3028 avg: 3028 | batched: 0
    player-detail> physx: 4.0 animation: 0.0 culling 0.0 skinning: 0.0 batching: 0.0 render: 2.2 fixed-update-count: 2 .. 3
    mono-scripts> update: 14.8 fixedUpdate: 0.0 coroutines: 1.0
    mono-memory> used heap: 1159168 allocated heap: 1536000 max number of collections: 0 collection total duration: 0.0
    ----------------------------------------
    iPhone Unity internal profiler stats:
    cpu-player> min: 15.8 max: 33.0 avg: 24.8
    cpu-ogles-drv> min: 0.4 max: 2.8 avg: 0.6
    cpu-waits-gpu> min: 0.1 max: 0.4 avg: 0.2
    cpu-present> min: 0.3 max: 1.4 avg: 0.4
    frametime> min: 19.6 max: 35.6 avg: 26.5
    draw-call #> min: 5 max: 5 avg: 5 | batched: 0
    tris #> min: 1514 max: 1514 avg: 1514 | batched: 0
    verts #> min: 3028 max: 3028 avg: 3028 | batched: 0
    player-detail> physx: 4.8 animation: 0.0 culling 0.0 skinning: 0.0 batching: 0.0 render: 2.3 fixed-update-count: 2 .. 4
    mono-scripts> update: 15.4 fixedUpdate: 0.0 coroutines: 1.1
    mono-memory> used heap: 1216512 allocated heap: 1536000 max number of collections: 0 collection total duration: 0.0
    ----------------------------------------
    iPhone Unity internal profiler stats:
    cpu-player> min: 20.7 max: 36.1 avg: 26.7
    cpu-ogles-drv> min: 0.4 max: 0.5 avg: 0.4
    cpu-waits-gpu> min: 0.1 max: 0.3 avg: 0.2
    cpu-present> min: 0.3 max: 1.6 avg: 0.4
    frametime> min: 22.2 max: 38.1 avg: 28.3
    draw-call #> min: 5 max: 5 avg: 5 | batched: 0
    tris #> min: 1514 max: 1514 avg: 1514 | batched: 0
    verts #> min: 3028 max: 3028 avg: 3028 | batched: 0
    player-detail> physx: 4.8 animation: 0.0 culling 0.0 skinning: 0.0 batching: 0.0 render: 2.2 fixed-update-count: 2 .. 4
    mono-scripts> update: 16.8 fixedUpdate: 0.0 coroutines: 1.2
    mono-memory> used heap: 1302528 allocated heap: 1536000 max number of collections: 0 collection total duration: 0.0
    ----------------------------------------
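
    For reading dumps like the ones above, a small sketch that pulls the min/max/avg triple out of one profiler line and converts a frame time to FPS. `Stat`, `parseStat` and `fpsFromFrameMs` are hypothetical helpers written for this example, and the line format is assumed to match the log lines shown in this post.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

struct Stat {
    float min, max, avg;
    bool ok;  // true only if all three numbers were read
};

// Parse a line such as "frametime> min: 18.8 max: 29.3 avg: 24.5".
// Everything before '>' is treated as the label and skipped.
Stat parseStat(const std::string& line) {
    Stat s{0.0f, 0.0f, 0.0f, false};
    const std::size_t gt = line.find('>');
    if (gt == std::string::npos) return s;
    const int n = std::sscanf(line.c_str() + gt + 1,
                              " min: %f max: %f avg: %f",
                              &s.min, &s.max, &s.avg);
    s.ok = (n == 3);
    return s;
}

// Convert a frame time in milliseconds to frames per second.
float fpsFromFrameMs(float ms) { return 1000.0f / ms; }
```

    Applied to the first frametime line above (min 18.8, max 29.3, avg 24.5), this works out to roughly 53 fps at best, 34 fps at worst and about 41 fps on average.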


    Which is faster, a multiply blend mode or an alpha based transparency? Either way I gather that opaque shaders are the way to go for speed on the iPad. So I will convert the background planet and possibly the alien mothership in the center to an opaque shader.


    @hippocoder
    Unfortunately, the center is already rotated quads:


    Running the game without the planet, and without the center art didn't offer too much benefit. However, after destroying the turrets the fps increased to ~60fps with a moderate number of enemies on the screen. Later, when more enemies were on screen, the fps dropped again.
    I am tinting some of the sprites with SpriteManager, which I believe sets the vertex color. Does this slow the rendering much?
     
  15. mudloop

    mudloop

    Joined:
    May 3, 2009
    Posts:
    1,107
  16. hippocoder

    hippocoder

    Digital Ape

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    I can't speak for unity but they didn't affect performance at all in C++ / native iphone development, but then I didn't use any lights at all, and just used them to colour things. Try out smag's suggestions though and report back, as I would be interested to see how you got on.

    Did you try removing all traces of GUI? I hear stories that GUI is a nightmarish hog for iOS.
     
  17. Angelo

    Angelo

    Joined:
    Sep 23, 2009
    Posts:
    6
    Cool, thanks Smag.
    I took the shader Jessy made and edited it a little so it is additive. It helped, and the game rested around 40fps with safe and slow/armv6 and stayed close to 50fps with the fast but no exceptions/ armv6+armv7. Removing the frame counter gui element actually seemed to help, but if it did it was only a few frames. (I logged the fps with Debug.Log instead)

    I also turned off blending in the planet shader, which I think helped a lot.

    I'm still going to batch the handling of the enemy bullets, so I'll post if that helps.

    Here's the modified additive shader:
    I borrowed a lot from unity's additive shader, and removed alpha.
    If anyone knows how to make it faster, please let me know!
     
  18. Alexey

    Alexey

    Unity Technologies

    Joined:
    May 10, 2010
    Posts:
    1,624
    OK, first on the stability front:
    and your avg frametime is 25-30 ms. Until you do something about it, define kFPS as 30; it will be smoother.
    Also,
    while
    means that you are most likely bound by gpu - your Update time alone prevents you from running at 60 fps (16 ms per frame)
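
    The 16 ms figure is simply 1000 ms divided by 60. A minimal sketch of that budget check follows, using hypothetical helper names. Adding the script, physics and render times together is a simplification made for the example; the profiler's own cpu-player figure is the authoritative total.

```cpp
// Milliseconds available per frame at a target frame rate.
double frameBudgetMs(double targetFps) { return 1000.0 / targetFps; }

// True if the summed CPU costs (in milliseconds) fit inside the frame budget.
bool fitsBudget(double scriptsMs, double physicsMs, double renderMs,
                double targetFps) {
    return scriptsMs + physicsMs + renderMs <= frameBudgetMs(targetFps);
}
```

    With the first profiler sample posted earlier (update 14.8, physx 4.0, render 2.2), the sum is 21.0 ms. That is over the 16.7 ms budget for 60 fps but well inside the 33.3 ms budget for 30 fps.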
     
  19. Angelo

    Angelo

    Joined:
    Sep 23, 2009
    Posts:
    6
    Thanks Alexey, I did notice much more stability when I set kFPS to 30. I'll try to take some load off of my update step and see where that leaves me.
     
  20. longshot

    longshot

    Joined:
    Sep 24, 2009
    Posts:
    71
    I'm also doing a shmup style game on the iPad.

    Currently, most of the art content in the game is meshes + normal map + spec map, and I have a full-screen animated image deformation going on in the background. Without doing much optimizing at all, the game runs at a constant 30 fps.

    A few notes -

    1. Avoid doing anything with sprites on the iPad, if you can avoid it at all. The iPad only has fill rate issues if you are using transparency. Therefore, if you converted your ships to meshes made out of lines, I can almost guarantee you will be able to fill the screen with ridiculous amounts of enemies without it affecting the framerate. The same thing goes for bullets: try to think of a way to have them made out of fully opaque meshes (like having spinning diamonds, octagon-shaped bullets instead of circles, etc.).

    2. The built-in shaders are great for prototyping, but are made for general-purpose use. Unity can't possibly know what people are going for, so they can't cut corners in their shaders. I've been able to get large performance increases just by doing a naive rewrite of shaders, and I am no shader optimization wizard.

    3. Fragment processing on iOS devices tends to be slow. For the full-screen image deformation effect I have going in the background, when I optimized it down to six instructions instead of 12, it nearly doubled the frame rate. Also, whereas there is little point in using lookup tables on desktop GPUs, since they can generally do the math faster than the lookup, the same is not the case on iOS. If you are doing expensive operations on the iPad's GPU, like multiple trig functions, you might want to investigate encoding a lookup table to a texture.

    4. It's been mentioned already, but the iPad locks rendering to the refresh rate, so your game is either going to run at 30fps or 60fps. My game often runs at 60fps, but the jump from 60 to 30fps is jarring, so I lock it at 30fps.
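
    The 60-or-30 behaviour in point 4 follows from each frame having to wait for the next display refresh. Here is a simplified model of that, assuming strict vsync and a hypothetical function name. Real frame pacing can differ, and under this model even slower frames would land on lower steps such as 20 fps.

```cpp
#include <cmath>

// Effective frame rate when every frame must wait for the next refresh.
// A frame that overruns one refresh interval occupies two, and so on.
double vsyncedFps(double frameTimeMs, double refreshHz) {
    const double intervalMs = 1000.0 / refreshHz;
    const double intervals = std::ceil(frameTimeMs / intervalMs);
    return refreshHz / (intervals < 1.0 ? 1.0 : intervals);
}
```

    On a 60 Hz display, a 10 ms frame is shown at 60 fps, while a 24.5 ms frame (the average from the profiler dump earlier in the thread) needs two refresh intervals and is shown at 30 fps.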

    To conclude, the main thing affecting your frame rate is your art assets. No matter how much optimization you do, you are going to hit a wall with sprites on the iPad. If you really want blistering fast frame rates with room to spare, I would recommend redoing your art assets as fully opaque meshes.
     
  21. Dreamora

    Dreamora

    Joined:
    Apr 5, 2008
    Posts:
    26,601
    1. That's actually incorrect. Fillrate issues happen with any kind of blending. Transparency is just the "most impacting form" because it also triggers additional calculation overhead, but this does not make multiply blending and the like any less disastrous for fillrate.

    2. Fully agreed :) though that's often a consequence of people using the desktop shaders rather than the iPhone ones; the desktop ones are total overkill (there are a few threads with optimized ShaderLab shaders for iOS).

    4. Good decision :)

    And the conclusion is 100% spot on (and on the iPad a viable way to go, as the CPU has the power to pull it off rather easily).

    thanks for sharing your experience on the matter :)
     
  22. marjan

    marjan

    Joined:
    Jun 6, 2009
    Posts:
    563
    Well, this thread is messed up quite a bit.

    Anyway, for your last question to me and your settings:

    1. Fast but no exceptions will usually only gain you 1 or 2 FPS. Your scripts will run a little faster, but cannot handle exceptions. This means your app might crash if you don't make sure an exception cannot happen. Toggling this off (slow and safe) enables exception handling, which means that in some cases a script exception will just be ignored and the program will try to continue.

    2. I didn't say you should use OpenGL ES 2.0. If these are really your settings, I wonder how you got to 45 FPS.

    So, try to toggle this off and check if that makes some difference by using:

    #define USE_OPENGLES20_IF_AVAILABLE 0


    And last but not least, setting kFPS to something very high is not necessary for a final build. I just do that to check the FPS inside a game with a small FPS-calculating script. If you cap the framerate at 30, you cannot see how fast the game would really run, or where and when framerate drops occur.

    Of course you can also look into some profiler logs, but I don't like that too much.
     
  23. Angelo

    Angelo

    Joined:
    Sep 23, 2009
    Posts:
    6
    Okay, I did some tests.

    @Longshot Thanks for your input!
    I do have a modified shader that has helped with the game's framerate, so to test the opaque shader idea I turned blending off in that shader. It didn't seem to help much, but I'm guessing that's because the fillrate is bad all around (or, like Alexey said, my update step is too long to permit 60fps). Your suggestion sounds good, and is also in line with what warmi suggested: building meshes that conform to the outlines of my sprites. If I go down that road, I'll post the results.

    @marjan
    I set USE_OPENGLES20_IF_AVAILABLE to 0 and I didn't see any difference, so I'm guessing that my problem isn't gpu bound at this point. I'm probably going to cap the game at 30fps, but the ideal is to have it run at 60fps since it makes the game feel more responsive.
     
  24. Dreamora

    Dreamora

    Joined:
    Apr 5, 2008
    Posts:
    26,601
    60 FPS on the iPad is nearly impossible anyway. Targeting 30 FPS on iPad / 4th gen is what you should likely do, at least if you intend to have stable FPS.
     
  25. mudloop

    mudloop

    Joined:
    May 3, 2009
    Posts:
    1,107
    Well, I'm currently getting 60fps on both my iPod Touch 4G and iPad, but there isn't too much going on yet, so I expect that to change. But it is possible for simple games I guess.

    Is there any way we could set the fps independently for iPad and other devices in a universal app? Don't know much about (obj) C or xcode, but kFPS seems to be a constant - but if it's not used outside of AppController.mm, guess it could be changed into a variable, right?
     
    Last edited: Nov 18, 2010
  26. Dreamora

    Dreamora

    Joined:
    Apr 5, 2008
    Posts:
    26,601
    kFPS is indeed a define. You could try replacing it with a global variable instead and see if it still works; I've never tried that. But in this case you could do so.
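
    A minimal sketch of the "variable instead of a define" idea, in plain C++ and untested against Unity's generated AppController.mm. `DeviceClass`, `chooseTargetFps` and `frameIntervalSeconds` are hypothetical names made up for the example, and how the real file consumes kFPS is not shown in this thread. The real project would also have to ask the OS which device it is running on.

```cpp
// Hypothetical device classes; a real project would query the OS instead.
enum class DeviceClass { Phone, Tablet };

// Stands in for the kFPS constant: a value chosen at startup, not a #define.
double chooseTargetFps(DeviceClass device) {
    return device == DeviceClass::Tablet ? 30.0 : 60.0;
}

// Seconds between frames for whatever loop or timer drives rendering.
double frameIntervalSeconds(double targetFps) { return 1.0 / targetFps; }
```

    The value would be computed once at launch and then used wherever the constant was used before. The 30/60 split is only the example discussed in this thread, not a recommendation for every game.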
     
  27. mudloop

    mudloop

    Joined:
    May 3, 2009
    Posts:
    1,107
    Great, if at one point my framerate drops below 60 on iPad but not on other devices, I'll give that a go. Thanks!
     
  28. Moonjump

    Moonjump

    Joined:
    Apr 15, 2010
    Posts:
    2,572
    I'm getting 60 fps on iPhone 3G, iPhone 4 and iPad for my arcade shoot-em-up.

    I don't have any transparency on my in-game objects, and no lights, so I could use the Texture Only shader from the Unify Wiki: http://www.unifycommunity.com/wiki/index.php?title=Texture_Only

    I only have 1 GUI Text for my UI while playing. I removed the color fields from the TexturedFont shader in the wiki to make that even faster. If you look at screenshots of my game you will see icons for ship lives below the score. They are part of the same GUI Text. I replaced a font character with the icon to do that.

    Changing from OnGUI and changing the shaders both made significant differences to my frame rate.
     
  29. marjan

    marjan

    Joined:
    Jun 6, 2009
    Posts:
    563
    I recently ran a test: I built a Mini Cooper model out of a lot of single objects, 35000 tris altogether (well, Maya says 45000, but in Unity 3 it becomes fewer).
    So I put that in a scene: one light, no materials, one script to rotate around the model. I got an amazing 60 FPS or more.
    I set up some materials, only diffuse and diffuse specular without textures: still 60.
    I added a skydome to the camera: FPS down to 50. I added transparency to the windows (without textures): still 45.

    I added the anti-alias hack to AppController found here in the forum: FPS down to 25.
     
    Last edited: Nov 20, 2010
  30. Dreamora

    Dreamora

    Joined:
    Apr 5, 2008
    Posts:
    26,601
    Antialiasing is hefty, yeah, not only on iOS devices but generally anywhere. It's just that other GPUs are no longer so limited that it kills them ;)
     
  31. renman3000

    renman3000

    Joined:
    Nov 7, 2011
    Posts:
    6,699
    I have a very related question.

    A. What exactly is a draw call?
    B. Should I use Sprite Manager or some type of atlas builder?
    C. Is there a way I can monitor stats on the device? Stats being the option available when running in Unity to see performance.
    D. And if I want sprite animations + atlas building is Sprite Manager 2 the way to go?
     
    Last edited: Feb 29, 2012