SpeedTree performance issues

Discussion in 'General Discussion' started by Pecek, Jun 11, 2015.

  1. Pecek

    Pecek

    Joined:
    May 27, 2013
    Posts:
    187
    After about two weeks of trying to get everything working properly, I'm out of ideas. Has anyone managed to use ST without extremely low performance? I disabled wind and smooth LOD, and I even modified the prefab: for the sake of testing I deleted everything but the billboard, as most of the trees in my scene are billboards anyway. That leaves only a parent with a LOD Group component and a child with Transform, Billboard Renderer, Tree, and material components. I also tried placing the billboards as plain game objects, which got rid of the LOD Group component and the parent. My CPU usage is through the roof: 22 ms, while rendering takes about 2.7-3 ms. That is without any kind of game logic; basically I have a camera and 11.2k billboards, and they are batched together. I made a build, but didn't notice much of a performance improvement. I have no idea what's going on. Why is the billboard renderer so slow? Am I missing something?
     
  2. angrypenguin

    angrypenguin

    Joined:
    Dec 29, 2011
    Posts:
    15,620
    We can't help much without more info. Things that would help:
    - The device you're testing on. For in-Editor tests, the specs of the computer.
    - A screenshot of the scene.
    - Version of Unity.
    - Any relevant options/settings you've tried.
     
  3. Pecek

    @angrypenguin
    Sure, sorry. I'm only testing on PC (FX-6300, GTX 660, 16 GB). I tried to take a screenshot, but nothing can really be seen: as I wrote in my original post, I made a scene just for testing billboard performance, so there is nothing but 11k trees and a single camera pretty far away from them (far clipping plane set to 5000). I'm using 5.1, but I had the same issue with earlier releases as well. It may also be worth mentioning that I tried it on another machine (FX-6100, AMD 4830, 4 GB) and the performance was pretty much the same.
     
  4. jimmikaelkael

    jimmikaelkael

    Joined:
    Apr 27, 2015
    Posts:
    796
    I have a similar problem with SpeedTree trees eating up the draw call count. I'm far from 11K trees, but it seems to me that they are poorly batched. I don't know if it makes a difference, but I'm using the deferred rendering path.
     
  5. Pecek

    As far as I know, what you're experiencing may just be a limitation of dynamic batching (static batching doesn't work on SpeedTree trees). My main concern is billboard rendering, although I also get a few hundred draw calls when I use SpeedTree properly (not just the billboards) and the camera is close to the trees. In the old system I can easily place something like 150k trees without any performance issues (actually a lot more, but I don't see the point of filling every inch with trees). That's why I'm surprised by SpeedTree's billboard performance, as the billboards are batched properly: in the scene mentioned above the statistics say 11,199 saved by batching, and that seems correct. When I first tried it with default settings (smooth LOD on) I got much worse fps.
     
  6. N1warhead

    N1warhead

    Joined:
    Mar 12, 2014
    Posts:
    3,884
    I was having major performance hits because I was using Water4. If you're using Water4 ANYWHERE in your scene, try unticking it; once I did, all my SpeedTree trees and everything else went back to normal. Heck, with Water4 my 8-core CPU was hitting 200 ms!

    It worked both ways, though: I could keep the water and disable the trees and it would run flawlessly, or disable the water and keep the trees and it would again run flawlessly.

    So I guess SpeedTree and Water4 don't mix.

    (Not sure if you're using it, if so try not using it and see what happens).
     
  7. Deleted User

    Deleted User

    Guest

    Well, 11K trees is a ridiculous amount. You need to stream them in at a certain viewing distance and at least use some sort of occlusion culling, which is Umbra in Unity.

    Yes of course you can have a million trees "in a scene" but not in view..! Even in AAA games, you'd never have a tenth of that in view at any one time..

    Did you try adjusting the LODs?

    Deferred rendering can have an impact with Transparent / AC materials also..

    No wonder..!
     
    Last edited by a moderator: Jun 11, 2015
  8. Pecek

    @N1warhead
    Nothing in the scene, just trees. :) I guess Water4 has real time reflection, that might answer the mystery behind the performance troubles. :D

    @ShadowK
    Well, I do wonder. :) In the old system I get this when I place 123k trees. Please note that the drawing distance for terrain trees is still locked at 2000 and some of the trees are offscreen here, but still, that's at least 10 times more trees with 10 times less CPU usage, not to mention the 16 terrains below them.
    old_trees.jpg

    And I get this when SpeedTree trees are placed as individual game objects. It's even worse when I place them on a terrain, because then I have to add a LODGroup component to them, which of course needs pretty significant CPU power, since all of them can be seen in this scene and, as you said, 11k is a lot.
    speedtree_object.jpg
    You can clearly see the difference: something must be going on with the new SpeedTree billboard system. The trees on their own perform well, I think, but when I have a lot of them, no matter what I do, they use more resources than they should, IMO.

    Btw look at my "tree" setup, it's just a billboard, nothing more.
    speedtree_steup.jpg
    I tried every rendering path from legacy vertex lit to deferred; it gained about 2-3 fps, which I wouldn't call a solution.

    In my opinion we should be able to place trees like this without eating up all the resources. But we can't, not even when all of them are just billboards, and that is a problem (unless I'm doing something completely wrong, in which case please tell me).
    http://i.imgur.com/83Qzf.jpg
     
  9. Deleted User


    @Pecek

    A few things about SpeedTree. Firstly, the poly / tris count is higher, as the trees generally look much better, so that's a massive factor: 11K x 5,500 without LODs is around 60 million tris, and even at a low LOD (2.5K) it's still 27 and a half million tris.
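
    As a rough check of the arithmetic in the paragraph above, here is a minimal Python sketch. The per-tree figures (5,500 and 2,500 tris) and the 11K tree count are the ones quoted in this post; the function name is made up for illustration, and the sketch assumes every tree renders at the same detail level, which a real scene with LODs and billboards would not do.

```python
def total_tris(tree_count: int, tris_per_tree: int) -> int:
    """Total triangles if every tree renders at the same detail level (an assumption)."""
    return tree_count * tris_per_tree


TREES = 11_000  # "11K" trees, as quoted in the post

full_detail = total_tris(TREES, 5_500)  # no LODs
low_lod = total_tris(TREES, 2_500)      # low LOD

print(f"full detail: {full_detail / 1e6:.1f}M tris")  # 60.5M
print(f"low LOD:     {low_lod / 1e6:.1f}M tris")      # 27.5M
```

    These are upper bounds for the mesh case only; a scene made purely of billboards has far fewer tris per tree.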

    But of course not all the trees are going to be LOD3 or higher, they'll be billboards at long distance. Now firstly you're using a different shader and secondly you've not included the profiler output as to what's causing the difference.

    What types of batching are you using, have you used Umbra?

    One thing I noticed about the LOD system is that there's a culling overhead applied for LOD transitions, which can get a bit hairy when there's a mass of geometry to transition. That could also factor in.
     
  10. Pecek

    @ShadowK
    This is the built-in ST billboard shader. The tree is the European White Birch Desktop from the Desktop Tree Package; I changed nothing except getting rid of everything but the billboard, so there is no LOD going on in this scene at all, as nothing has a LOD Group component.
    I marked them as static, but as far as I know they can't be static batched (and I didn't see any difference in performance either). If the statistics window is correct, I have way more tris when using the old tree system (211k vs 44.5k) but still 9 times the fps, so I don't think the problem is the tris count. I might be wrong though; I've used Unity mostly for 2D, so if I should look elsewhere, please tell me.

    Old tree system
    oldtree_profiler.jpg
    vs speedtree
    speedtree_profiler.jpg
     
  11. Deleted User


    @Pecek

    As I said, most of your issues aren't down to SpeedTree; according to your profiler they seem to be down to culling and lighting.

    I'd need to see deeper down to find out what..

    Have you tried running umbra yet?
     
  12. Pecek

    @ShadowK
    I ran the exact same test but replaced the billboards with cubes (Transform, Mesh Filter, Mesh Renderer with everything disabled, and the default material). Now I have 17 ms CPU and 6.5 ms GPU usage. To me this clearly says something is wrong with the SpeedTree billboards, since billboards should be a lot faster than anything else. Also, if I mark the cubes as static they are even faster (CPU 14.5 ms, GPU 1.1 ms). That is actually quite amazing; I didn't expect to see that.
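
    To put the millisecond figures in this thread side by side, here is a small Python sketch that converts them to approximate frame rates. It assumes the frame is bound by whichever of CPU and GPU is slower, which is a simplification and not something the stats window guarantees. The 22 ms / 3 ms pair is taken from the first post (using the top of the quoted 2.7-3 ms range); the function name is hypothetical.

```python
def fps_from_ms(cpu_ms: float, gpu_ms: float) -> float:
    """Approximate fps, assuming the slower of CPU and GPU bounds the frame."""
    return 1000.0 / max(cpu_ms, gpu_ms)


# (CPU ms, GPU ms) as reported in the thread
measurements = {
    "SpeedTree billboards": (22.0, 3.0),
    "cubes": (17.0, 6.5),
    "static cubes": (14.5, 1.1),
}

for label, (cpu_ms, gpu_ms) in measurements.items():
    print(f"{label}: ~{fps_from_ms(cpu_ms, gpu_ms):.0f} fps")
```

    Under that assumption all three cases are CPU bound, which matches the point that the cost is not on the rendering side.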

    My original post is about why SpeedTree billboards are so slow compared to the old tree billboards or the particle system's billboards, and what I can do about it. Occlusion culling wouldn't help in this case, as all of the trees can be seen. In fact it performs worse, and even the manual mentions that you shouldn't use OC in a forest scene because it would use way too much memory (at least the trees shouldn't be occluders).

    I made a package if you would like to take a look (8.2k billboard trees, using the one that comes free with Unity). I would be more than happy to hear your thoughts on it. Please note that I put it together in a hurry, so there might be some settings I forgot to set, which could lead to obvious performance issues.

    https://www.dropbox.com/s/87yzapsisvpxvqb/SpeedTree Billboard Test.unitypackage?dl=0
     
  13. chingwa

    chingwa

    Joined:
    Dec 4, 2009
    Posts:
    3,790
    Part of the reason SpeedTree billboards are slower than the normal Unity tree billboards is the billboard fade feature, which has a really detrimental impact on performance. If you go to each individual tree's settings you can set the "Fade Out Width" to 0, which should help somewhat with the final performance of the billboards, especially since you have so many.

    Of course, they will now simply pop in and out once they reach the visibility threshold. You may also want to experiment with the "culled" position, raising it to maybe 3 or 4%. By messing with the settings you can definitely eke out some extra performance over the defaults.
     
  14. netvortex_dc

    netvortex_dc

    Joined:
    Jan 13, 2014
    Posts:
    126
    SpeedTree is broken in Unity, but I can recommend using it in Unreal4. Performance is much better there, and the trees also look a lot nicer.
     
  15. larsbertram1

    larsbertram1

    Joined:
    Oct 7, 2008
    Posts:
    6,902
    hi there,
    you may try to get rid of the "tree" component on your billboard and see if this speeds up rendering.
     
    Last edited: Jun 13, 2015
  16. Pecek

    I made another test with about the same number of trees, but instead of placing one at a time I changed Conifer_Desktop in the modeler so that it now places 10 trees. I left everything at default (so smooth LOD is enabled); look at the result.
    speedtree_cluster.jpg
    ..but of course this way they can't be batched, so we have traded one problem for another, and the tris count in this scene can climb to almost 3m. This "solution" might still be good for background stuff: the billboards have the same triangle count, so they can be batched, and even better than before, since one billboard now covers 10 trees. It's also harder to work with, but the performance gain is huge. For open world games, maybe the best workaround would be to use the cluster version for far away terrains and the normal one for the closer ones.
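
    A quick Python sketch of why clustering helps with per-object overhead: grouping 10 trees into one placed object cuts the number of objects the engine has to cull and submit by roughly a factor of 10. The 11.2k figure is the tree count from the first post, used here only as an example since this test had "about the same" amount; the function name is made up for illustration.

```python
import math


def clustered_object_count(tree_count: int, trees_per_cluster: int) -> int:
    """Number of placed objects needed when trees are grouped into clusters."""
    return math.ceil(tree_count / trees_per_cluster)


single = clustered_object_count(11_200, 1)    # one object per tree
clustered = clustered_object_count(11_200, 10)  # ten trees per object

print(f"individual trees: {single} objects")
print(f"10-tree clusters: {clustered} objects")
```

    The triangle count of the visible trees is unchanged by this; only the number of separately managed objects drops.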

    ..but damn, why do we have to do something like this? I was so hyped for SpeedTree; the technology is simply amazing, and I really hope they will give it more love in the future (until then this solution looks okay most of the time, unless you want to chop those trees, or placing them on )

    @chingwa
    In the scene above the billboards aren't using the fade effect at all. Unless I missed something essential: can you set the fade effect for billboards without using the LOD Group component? If so, where can I find those settings? I really did nothing in the past 2 weeks but try every possible setting I could come up with, and I didn't find anything related to fading besides the LOD Group.

    @larsbertram1
    Thanks for the tip, I will! :)

    Edit:
    It made a pretty big difference! But it's still too slow for large landscapes, unfortunately. Of course you won't need such a dense forest, but a terrain a few km across can use up this many trees pretty quickly, and sometimes they can be (or at least should be) seen.
    Here are the results:
    With the Tree component:
    billboard_w_tree.png
    Without it:
    billboard_wo_tree.png
     
    Last edited: Jun 12, 2015
  17. AlanMattano

    AlanMattano

    Joined:
    Aug 22, 2013
    Posts:
    1,501
    Just to let you know, billboards with a transparent alpha texture and bump mapping are sometimes slower than building the actual geometry and using a lit shader without bump mapping or transparency. This has been shown with grass, where the mesh is not very complex.
     
    Last edited: Feb 21, 2020