Unity Community



  1. Location
    Paris
    Posts
    3,730

    AlphaTest discouraged in OpenGL ES 1.1, removed in ES 2.0

    Hello,

    I've recently run into a performance optimization problem that could concern everyone here: alpha blending.

    Basically, it's just about rendering overlapping semi-transparent textured objects properly, with the right depth sorting.

    In practice, if we don't want pixels with zero alpha to hide what's behind them, we have to use AlphaTest in one pass and Blending in a second pass.

    But unfortunately, I just found that AlphaTest is a huge performance hit. In fact, this two-pass trick practically cuts the global FPS in half.

    I started a thread in the ShaderLab forums about it, feel free to join.

    Cheers


    edit : changed the title to fit the overall problem


  2. Location
    Paris
    Posts
    3,730
    Update: I found a way to make things as fast as with no AlphaTest, in a single pass.

    Just split all the translucent faces into separate objects in your 3D tool.

    This way Unity will be able to apply batching, and you can use a single-pass shader with only Blending and no ZWrite, because by splitting the faces you have already done the Z sorting manually.

    The advantage is that it frees some processing power in your shader for other texture combines or extra passes (or nothing, if you want absolute speed).

    The only thing you have to watch is your overall batching cost if there are too many split faces. You can monitor it with the Profiler.

    One optimization is to group faces by depth level (unless your mesh has to be viewable from any side).

    This works great for me: it pushed my frame time from 34 (AlphaTest, two passes) down to 24, without losing any visual feature.
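    For illustration, the per-object sorting Unity then performs on those split faces can be sketched outside the engine (plain Python, with made-up names and data; this is not Unity API code):

```python
# Sketch of what the transparent render queue does once the faces are
# separate objects: sort them back-to-front by distance to the camera,
# then draw in that order so blending composites correctly.
# All names and coordinates here are illustrative.

def sort_back_to_front(objects, camera_pos):
    """Return objects ordered far-to-near from the camera."""
    def sq_dist(obj):
        return sum((c - p) ** 2 for c, p in zip(obj["center"], camera_pos))
    return sorted(objects, key=sq_dist, reverse=True)

quads = [
    {"name": "island_near", "center": (0.0, 0.0, 2.0)},
    {"name": "island_far",  "center": (0.0, 0.0, 9.0)},
    {"name": "island_mid",  "center": (0.0, 0.0, 5.0)},
]

for q in sort_back_to_front(quads, camera_pos=(0.0, 0.0, 0.0)):
    print(q["name"])  # island_far, island_mid, island_near
```

    Per-object sorting like this is cheap; it's per-triangle sorting that the engine avoids, which is why splitting the faces yourself works.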


  3. Location
    Paris
    Posts
    3,730
    A pretty useful summary of every possible sorting approach.

    As we can see, it's an endless pain ...


  4. Location
    Paris
    Posts
    3,730
    Correction: you can use ZWrite with this method. I don't understand why it doesn't make zero-alpha pixels kill the background behind them, but it doesn't.

    Putting all these experiments together, I conclude that Unity could offer an optional polygon-sorting parameter that splits all the translucent triangles into virtual GameObjects and then batches them back together.

    The procedure I followed here, and the results it gives, show that this would be at least 1.5 times faster than AlphaTest.

    It would save a lot of time for heavy alpha users, since we would no longer be forced to split our translucent objects manually in the 3D tool.

    Alpha quads are the best way to create nice visual FX without using tons of polygons. For example, in a level I made, I used quads to display 2D textures of different islands in the background. It would have been much slower with real 3D models.



    Speed is the sinews of war on iPhone, so alpha tricks are not optional for many of us.

    It would be awesome if iPhone Unity had better support for alpha blending optimization, especially since the iPhone rendering pipeline is designed not for AlphaTest but for Blending (see the Apple articles I posted in the ShaderLab thread).

    I read some vague things about BSP (binary space partitioning) trees and their efficiency for fast Z sorting.

    Here is an interesting topic about different sorting algorithms.


  5. Location
    Paris
    Posts
    3,730
    I'm surprised nobody else seems affected by this alpha translucency performance problem ...

  6. Unity tech writer

    Location
    Blackpool, United Kingdom
    Posts
    8,696
    This is good work, n0mad, but you've actually got a link in your first post that redirects people to the ShaderLab thread and you are getting some response there.
    I'm wired to the world... that's how I... know... everything...


  7. Location
    Paris
    Posts
    3,730
    Hello andee, and thank you for your attention ^^

    Yes, the ShaderLab thread has one reply (thanks to ShaderLab's ultra-helpful Daniel Brauer).

    Across the two discussions, here and at ShaderLab, the only thing settled is the answer to why we can't have proper Z sorting with Alpha Blending (translucency, in other words).

    But the main problem, still unresolved and undiscussed, and more deeply rooted, is that with how Unity sorts depth, the overall alpha management philosophy pushes us toward AlphaTest instead of Alpha Blending. It's a fact, and it's written in the Unity docs.

    But AlphaTest is clearly discouraged on iPhone, as Apple has stated (sorry for the underline, I'm not feeling quite understood on this problem). So I see this as more of a Unity-side issue than a coder-side one.
    What I'm presenting here is a dirty workaround to avoid AlphaTest, but it's still dirty ...

    It would be good to know Unity's position on this, as alpha is a very critical feature for performance and polygon saving.
    Aras advises in several threads not to use a lot of translucent objects, like other game studios do. But sometimes we can't!

    If we want to achieve high frame rates with nice visuals and the fewest possible triangles, we are forced to rely heavily on translucency. And I'm not even talking about 2D games, and how vital Alpha Blending is for them.

    The only (admittedly naive) solution I found would be sorting depth per polygon, not per object, since splitting polygons manually does work. But I'm not an expert in this, so I don't even know what I'm talking about.

    Just to make it clear: to display proper translucency with proper depth in Unity, we currently have two solutions:
    - use AlphaTest and see the global FPS cut in half;
    - split every translucent polygon into a separate object in the 3D tool, and let Unity's ZWrite and auto-batching do the rest. But what a mess when your scene has dozens and dozens of translucent objects!

    Neither of those solutions is viable long term.

    It would be really good to have a solid acknowledgment of this problem from Unity staff, and some idea of how Unity could manage it in the future.


    Thank you again for your attention, it helps


  8. Location
    Paris
    Posts
    3,730
    The face-splitting method I mentioned above finally seems to give completely random results ... Some huge levels handle translucent objects' depth properly, and some others don't, even though they share the same shader ...

    So let me ask the ultimate question, because I'm actually in a serious need of answer.

    Fact: Apple recommends against using AlphaTest.

    ---> How can we correctly display overlapping translucent objects ?

    Do you guys have tips, or tricks ? Or does nobody ever use translucency on several depth levels ?


  9. Location
    Paris
    Posts
    3,730
    It's been a week since I posted the problem, and still no answer from the Unity guys ... Maybe I should delete this thread and recreate it in the Unity Support forum for more visibility?


    Seriously, I feel like nobody cares at all. Not about my case, but about the whole alpha depth mess.
    Sorry for the forum spamming, but even though I sent a bug report 3 days ago that would probably resolve my problem, I don't think I'm the only one concerned by this. Performance and translucent sprites are not optional on iPhone. So I just need some clarification.

    Just a "yes" or "no" to a simple question: "Is it possible to display proper translucent overlapping objects without AlphaTest? If yes, how?"

    A reminder of Apple's recommendation concerning ES 1.1:

    Quote Originally Posted by Apple
    Avoid Alpha Test and Discard

    If your application uses an alpha test in OpenGL ES 1.1 or the discard instruction in an OpenGL ES 2.0 fragment shader, some hardware depth-buffer optimizations must be disabled. In particular, this may require a fragment’s color to be calculated completely before being discarded.

    An alternative to using alpha test or discard to kill pixels is to use alpha blending with alpha forced to zero. This can be implemented by looking up an alpha value in a texture. This effectively eliminates any contribution to the framebuffer color while retaining the Z-buffer optimizations. This does change the value stored in the depth buffer.

    If you need to use alpha testing or a discard instruction, you should draw these objects separately in the scene after processing any geometry that does not require it. Place the discard instruction early in the fragment shader to avoid performing calculations whose results are unused.
    Now just look at Khronos statement about AlphaTest in ES 2.0 :

    Quote Originally Posted by Khronos
    One of the first (and toughest) decisions we made for OpenGL ES 2.0 was to break backward compatibility with ES 1.0 and 1.1. We decided to interpret the “avoid redundancy” rule to mean that anything that can be done in a shader should be removed from the fixed-functionality pipeline. That means that transformation, lighting, texturing, and fog calculation have been removed from the API. We even removed alpha test, since you can perform it in a fragment shader using discard.
    So first Apple says we shouldn't use AlphaTest in OpenGL ES 1.1, and now Khronos has simply removed it!

    We seriously need an alternative. A clean one, I mean, not something that "sometimes works, sometimes doesn't". Right now, using Blending is exactly that unreliable.
    And the discard fragment instruction can't be the answer, since Cg is not available for OpenGL ES 1.1!

    Please, check a box :

    [ ] You should pay more attention to forums, as we already explained how to solve that problem.
    [ ] Yes, we are aware of this problem, and working on an accessible solution.
    [ ] We are too busy for the next decade and won't be able to fulfill this request.
    [ ] Go on your own.
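    For what it's worth, Apple's suggested alternative quoted above (alpha blending with alpha forced to zero) is easy to verify with plain arithmetic; this sketch is ordinary Python, not Unity or GL code:

```python
# With "SrcAlpha, OneMinusSrcAlpha" blending, a source texel whose alpha
# is 0 contributes nothing to the framebuffer color, which is why Apple
# suggests it as a replacement for an alpha-test kill.

def blend(dst, src):
    """dst is an opaque (r, g, b); src is (r, g, b, a)."""
    r, g, b, a = src
    return tuple(a * s + (1.0 - a) * d for s, d in zip((r, g, b), dst))

background = (0.2, 0.4, 0.6)
dead_texel = (1.0, 0.0, 0.0, 0.0)  # alpha forced to zero

print(blend(background, dead_texel))  # (0.2, 0.4, 0.6): background intact
```

    (As the Apple quote warns, the depth buffer is still written for those pixels, unlike with a true alpha-test kill.)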




  10. Location
    Southampton UK
    Posts
    369
    Nomad,

    just saw this thread and I know very little about shaders. Our currently released game has a huge amount of sprites (we use SpriteManager heavily) and this shader:

    Code:
    Shader "SpriteMan"
    {
        Properties
        {
            _Color ("Color Tint", Color) = (1,1,1,1)
            _MainTex ("Color (RGB) Alpha (A)", 2D) = "white" {}
        }

        Category
        {
            ZWrite Off
            Cull Back
            Fog { Mode Off }
            Blend SrcAlpha OneMinusSrcAlpha
            Tags { "Queue" = "Transparent" }

            SubShader
            {
                Pass
                {
                    SetTexture [_MainTex]
                    {
                        ConstantColor [_Color]
                        Combine Texture * Constant
                    }
                }
            }
        }
    }

    The performance we get is great, but the one area we have not looked at is the shader, and I am sure it can be optimised. One of the main problems we had was Z fighting, and to overcome this we just extended the distance between the sprites (it's an overhead ortho camera). This fixed things, but now, reading your thread, I am wondering if our shader also has two passes and, if so, how we can get it down to one pass and eliminate the Z fighting.

    Maybe this is quite different from what you are discussing here, I am not sure
    Steve


  11. Location
    Paris
    Posts
    3,730
    Hello imparare, thank you for your participation

    The shader you put here has 1 pass, yes.

    But actually, the Z-fighting won't be resolved, because you disabled ZWrite. And it's an endless hole: if you enable it, you will have artifacts, with some sprites masking others even where they are transparent.

    Actually, for 2D games it's not really a problem, because there are very few artifacts with ortho cams.
    But it's not safe either, as there could still be artifacts ...

    These artifacts are dramatically exposed with 3D cams. For example, in one of my scenes I have a 2D sprite composed of 6 moving quads. It's just a waitress walking from left to right in front of the camera, perfectly aligned on its Z axis.

    But depending on which part of the camera viewport she is in (left or right), her right arm is sometimes drawn in front of the body, sometimes behind ...

    Overall it's completely random.

    For precise performance numbers: I had a frame time of 33 with two-pass AlphaTest, and 24 with one-pass Blending.


  12. Location
    Southampton UK
    Posts
    369
    Even with ZWrite On, my understanding was that if you have a big enough distance between the objects, that would resolve the issue?

    thanks for the reply
    Steve


  13. Location
    Paris
    Posts
    3,730
    Yes, maybe. I didn't really dig into this approach, as it's not really "acceptable": some 3D objects are made of several translucent polygons and therefore can't maintain that distance.

    But in 2D, it could do the trick, yes.
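    The 2D spacing trick amounts to giving each sprite layer its own z offset, large enough for the depth buffer to keep the layers apart; a trivial sketch (plain Python, values purely illustrative):

```python
# Assign each 2D sprite layer a z offset so that, with ZWrite On, the
# depth buffer can always tell the layers apart and z-fighting cannot
# occur. The spacing value is illustrative; it must exceed the depth
# precision available at the sprites' distance from the camera.

LAYER_SPACING = 0.5

def layer_z(layer_index, near_z=1.0):
    """Z position for a sprite on the given layer (0 = closest)."""
    return near_z + layer_index * LAYER_SPACING

print([layer_z(i) for i in range(4)])  # [1.0, 1.5, 2.0, 2.5]
```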


  14. Posts
    1,361
    I read n0mad. I care, too.
    Jamie T. Bentley
    Technical Director
    The Binary Mill
    Facebook - Twitter


  15. Location
    Paris
    Posts
    3,730

  16. ReJ

    Location
    Unity Technologies
    Posts
    355
    Quote Originally Posted by n0mad
    Just a "yes" or "no" to a simple question : "Is it possible to display proper translucent overlapping objects without alphaTest ? If yes, how ?"
    NO. It is an inherent problem of depth-buffer based rasterizers. There is no generic solution to that. Period.

    Quote Originally Posted by n0mad
    Additional cost of AlphaTest is PowerVR architecture specific. It has no additional cost on other platforms.
    Alpha Test is NOT deprecated in ES1.1 by any means. AlphaTest was removed from ES2.0 as redundant because discard in shader provides exactly the same functionality and virtually the same performance.

    Quote Originally Posted by n0mad
    We need an alternative, seriously. A clean one I mean, not something that "sometimes works, sometimes doesn't".
    As I said, there is no generic solution, only rules of thumb:
    • in 3D: split your transparent objects into convex shapes; let the engine sort them, or use renderQueue if you know better
    • in 3D/2D (when you know exactly from which side your object will always be visible): sort the triangles manually in the Editor or just after loading
    • in 2D: you must sort your sprites yourself (usually you just arrange them in layers to minimize sorting), or use a particle system, forcing the engine to sort them for you.



    Automatic per-polygon sorting is NOT an option for an engine due to excessive CPU cost, if implemented in a generic way.
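    ReJ's second rule (sort the triangles once, just after loading, when the viewing side is known) could be sketched like this; plain Python stand-ins, not the Unity Mesh API:

```python
# One-time back-to-front reorder of a triangle list for a fixed view
# direction. vertices is a list of (x, y, z); triangles is a list of
# vertex-index triples, as in a typical mesh index buffer.

def sort_triangles(vertices, triangles, view_dir):
    """Order triangles far-to-near along view_dir (painter's algorithm)."""
    def depth(tri):
        # average projection of the triangle's vertices onto view_dir
        return sum(
            sum(c * d for c, d in zip(vertices[i], view_dir)) for i in tri
        ) / 3.0
    return sorted(triangles, key=depth, reverse=True)

vertices = [(0, 0, 1), (1, 0, 1), (0, 1, 1),   # near triangle
            (0, 0, 5), (1, 0, 5), (0, 1, 5)]   # far triangle
triangles = [(0, 1, 2), (3, 4, 5)]

print(sort_triangles(vertices, triangles, (0, 0, 1)))  # far triangle first
```

    Done once at load time, this costs nothing per frame, which is exactly why it only works when the viewing side never changes.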


  17. Location
    Southampton UK
    Posts
    369
    ReJ said:

    in 2D - you must sort your sprites yourself (usually you just arrange them in layers to minimize sorting) or use particle system forcing engine to sort them for you.
    What is a particle system forcing engine ? (did a search but nothing found)

    thanks
    Steve


  18. Location
    Paris
    Posts
    3,730
    Good morning ReJ,

    thank you for your reply, I really appreciate.
    I changed the word "deprecated" in the title to make sure nobody reads bad info.
    By "deprecated" I meant "we don't want you to use it, and it will be deleted in the next build".

    Quote Originally Posted by ReJ
    NO. It is an inherent problem of depth-buffer based rasterizers. There is no generic solution to that. Period.
    This is really bad news ...
    I may sound cocky, but I'm really shocked that in 2010 there is still no stable implementation of such a critical feature as alpha translucency in any 3D engine.

    I mean, basically, alpha is just "take the pixel under the one I'm rendering, and mix it with my alpha". And if the real problem is that primitives are culling each other, well ... I can't understand why there couldn't be an easy solution, like "take into account pixels that are already rendered when I render a new one".

    In pure 2D engines, like Flash, there is no problem at all, even with hundreds of overlapping translucent clips. They sort nicely. I don't see how adding a perspective parameter to this would mess the whole thing up.

    Quote Originally Posted by ReJ
    Additional cost of AlphaTest is PowerVR architecture specific. It has no additional cost on other platforms.
    Which could explain why Khronos doesn't care about that precise problem.

    But still, iPhone devs who use Alpha like me are really stuck in the mud with this.

    Quote Originally Posted by ReJ
    As I said there is no generic solution only rules of thumb:
    * in 3D - split your transparent objects into convex shapes - allow engine to sort them or use renderQueue if you know better
    OK, but isn't renderQueue set per shader?
    That would mean we can't use it properly, as we would be forced to put a different shader on each translucent triangle.

    On the other hand, I tried to split objects into convex shapes, but there are still depth artifacts (submitted a bug report for that). So it is not safe.

    Quote Originally Posted by ReJ
    * in 3D/2D (when you know exactly from what side your object will be always visible) - sort triangles manually in the Editor or just after the loading
    Sorry, what type of Editor sorting do you mean ?
    (this 3D/2D would be my case, yes)



    Quote Originally Posted by ReJ
    Automatic per-polygon sorting is NOT an option for an engine due to excessive CPU cost, if implemented in a generic way.
    I understand. Would there be a way to do this automation manually (no pun intended)? By script?


    Also, an important discovery :
    I did a lot of research on the net about this problem, and I found numerous devs advising the use of the glTexEnv() function to mask portions of primitives. This function uses a blend mode too, so I believe it would be much faster than AlphaTest. The principle is to blend the primitive with an alpha, turning it into a masked one, and then apply the final texture on top.


    I didn't find this function in the GL class. How could this be achieved?


    And one last question: how could I write to the Z buffer manually, instead of letting Unity do the job? This could resolve many problems, I think.



    Thank you ReJ for taking time to answer.


  19. Location
    Paris
    Posts
    3,730
    Up


    Sidenote: I found a way to maximize optimization with AlphaTest, by separating alpha objects and opaque objects into two distinct batched meshes in my 3D tool.
    I then give them two materials: one for alpha, with AlphaTest and the Transparent render queue, and another with neither AlphaTest nor Blend and the Geometry render queue (the most optimized one, according to the docs).

    I'm pretty satisfied with the final FPS, as long as there aren't too many translucent triangles in my scene.

    But I still don't believe this is the most optimized method.

    The questions in the previous post still stand, then


  20. Location
    The Netherlands
    Posts
    1,023
    Quote Originally Posted by n0mad
    I mean basically ... Alpha is just "take the pixel under the one I'm rendering, and mix it with my alpha". And if the real problem is that primitives are culling each other, well ... I can't understand how there couldn't be an easy solution, like "take into account pixels that are already rendered when I render a new one".
    It's not this simple.

    Imagine two windows: a red one (1,0,0,.4) and a darker green one (0,1,0,.9) on a black background. If we use the most common blending mode "SRC_ALPHA, ONE_MINUS_SRC_ALPHA" the result will be:
    (0,0,0) blend (1,0,0,.4) = .4*(1,0,0) + .6 * (0,0,0) = (.4,0,0)
    (.4,0,0) blend (0,1,0,.9) = .9*(0,1,0) + .1 * (.4,0,0) = (.04,0.9,0) (bright green)

    However, if the green window happens to be provided to the graphics card first:

    (0,0,0) blend (0,1,0,.9) = .9*(0,1,0) + .1*(0,0,0) = (0,.9,0)
    (0,.9,0) blend (1,0,0,.4) = .4*(1,0,0) + .6*(0,.9,0) = (.4,0.54,0) (dark yellow/brown-ish)

    These two colors differ significantly, simply because we rendered the two windows in different orders. This has nothing to do with the Z buffer: it is the rendering order that determines the final color value of a pixel. To get a correct final blended color, you need to make sure that every single translucent triangle is rendered back to front (and that every pair of intersecting translucent triangles is split into non-intersecting triangles).
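    The arithmetic above can be checked directly (plain Python, same SRC_ALPHA, ONE_MINUS_SRC_ALPHA blend):

```python
# Reproducing the two draw orders from the window example above.

def blend(dst, src):
    """dst is opaque (r, g, b); src is (r, g, b, a)."""
    r, g, b, a = src
    return tuple(round(a * s + (1.0 - a) * d, 4) for s, d in zip((r, g, b), dst))

black = (0.0, 0.0, 0.0)
red   = (1.0, 0.0, 0.0, 0.4)
green = (0.0, 1.0, 0.0, 0.9)

print(blend(blend(black, red), green))  # (0.04, 0.9, 0.0)  bright green
print(blend(blend(black, green), red))  # (0.4, 0.54, 0.0)  dark yellow/brown
```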

    This limitation is rooted in the graphics hardware, which is why you will not find a feasible, generic solution to this problem anywhere.

    This wiki entry explains more thoroughly:
    http://www.opengl.org/wiki/Transparency_Sorting

