
Fast and cheap way of doing blur

Discussion in 'Shaders' started by MaT227, Aug 18, 2015.

  1. MaT227

    MaT227

    Joined:
    Jul 3, 2012
    Posts:
    628
    Hey everyone,

    I am wondering how the blurring of the reflection probes used in the Standard shader is done.

    I don't think they are using a blur pass; maybe the mipmaps. But if the mipmaps are used, how do they get rid of the pixelated effect of the mips?
    Thanks a lot!
     
  2. MaT227

    MaT227

    Joined:
    Jul 3, 2012
    Posts:
    628
    Here is an example of Unity's blurring. There's no pixelated effect, but if you look at the shader code, they are switching between mips.



    From what I've tested, the results are quite pixelated on my side. Is there a way to attenuate this? How?

    To finish and extend the question: is there any other fast way to do a simple blur? Thanks a lot.
     
  3. jistyles

    jistyles

    Joined:
    Nov 6, 2013
    Posts:
    34
    As you may know, in PBR, to simulate how the chaotic effect of micro-facet/surface roughness effectively enlarges the sampled region of the reflective lobe, we convolve the high-res reflection data to get a close approximate match to these lobe results at different points along the power scale.

    Because this convolution results in visually lower-frequency maps (eg, stuff looks blurry 'cause we sample a wider area), these data points can be stored off as mipmaps without losing detail.

    tl;dr: the reflection mip maps are not normal mip maps. They represent highly processed pre-computed data. This data represents progressively wider sampled reflective lobes with energy conservation applied, so in general they look blurrier than regular mipmaps, and high energy bright bits stay brighter for longer as you go down the mip chain.
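
    To make the "not normal mip maps" point concrete: this isn't Unity's actual code (its real version lives in the built-in cginc includes and does some extra roughness remapping), but a minimal sketch of the lookup looks something like the shader below. The shader name, the _EnvCube/_Roughness properties and the NUM_MIP_STEPS constant are my own placeholders. The key detail for your "pixelated" question is that the cubemap uses trilinear filtering, so the hardware blends between the two nearest pre-convolved mips instead of snapping between them.

    Shader "Hidden/RoughnessMipSample"
    {
        Properties
        {
            _EnvCube ("Pre-convolved cubemap", CUBE) = "" {}
            _Roughness ("Roughness", Range(0, 1)) = 0.5
        }
        SubShader
        {
            Pass
            {
                CGPROGRAM
                #pragma vertex vert
                #pragma fragment frag
                #pragma target 3.0
                #include "UnityCG.cginc"

                samplerCUBE _EnvCube;   // pre-convolved per mip, trilinear filtering
                half _Roughness;        // 0 = mirror, 1 = fully rough
                #define NUM_MIP_STEPS 6 // assumption: number of usable mip levels

                struct v2f
                {
                    float4 pos  : SV_POSITION;
                    float3 refl : TEXCOORD0;
                };

                v2f vert (appdata_base v)
                {
                    v2f o;
                    o.pos = mul(UNITY_MATRIX_MVP, v.vertex);
                    float3 worldPos = mul(_Object2World, v.vertex).xyz;
                    // fine for uniform scale; this is a sketch, not production code
                    float3 worldNormal = normalize(mul((float3x3)_Object2World, v.normal));
                    o.refl = reflect(worldPos - _WorldSpaceCameraPos, worldNormal);
                    return o;
                }

                fixed4 frag (v2f i) : SV_Target
                {
                    // rougher surface -> higher (blurrier) pre-convolved mip;
                    // trilinear filtering blends between the two nearest mips,
                    // which is what hides the hard "mip switch"
                    half mip = _Roughness * NUM_MIP_STEPS;
                    return texCUBElod(_EnvCube, half4(normalize(i.refl), mip));
                }
                ENDCG
            }
        }
    }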

    Further reading:
    https://seblagarde.wordpress.com/2012/06/10/amd-cubemapgen-for-physically-based-rendering/
    http://www.codinglabs.net/article_physically_based_rendering.aspx
    http://dontnormalize.me/tag/pbr/


    In reply to your core question of whether there is any other fast way to do a simple blur: it depends on what you want to blur, and what you consider fast vs. simple.
     
  4. MaT227

    MaT227

    Joined:
    Jul 3, 2012
    Posts:
    628
    @jistyles Thank you for your precise answer and for the links.

    I thought about mips because, in the Standard shader code, I discovered that roughness is handled by switching between the mips.
    I also thought that there was a heavy convolution process for pre-calculated probes, but I didn't know there was that kind of computation for realtime probes.

    Anyway, my question was maybe a bit removed from my current use of blur, as I would like to implement a fast way of blurring for an image effect.
    I discovered that I could generate mips for my RenderTexture and then play with the mips and the bias to blur it. This leads to my other question about optimization.
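
    For reference, the mip/bias version I'm experimenting with is something like this sketch (it assumes the RenderTexture was created with useMipMap enabled and trilinear filtering; _BlurMip is just my own parameter name):

    Shader "Hidden/MipBiasBlur"
    {
        Properties { _MainTex ("Texture", 2D) = "white" {} }
        SubShader
        {
            Cull Off ZWrite Off ZTest Always
            Pass
            {
                CGPROGRAM
                #pragma vertex vert_img
                #pragma fragment frag
                #pragma target 3.0
                #include "UnityCG.cginc"

                sampler2D _MainTex;   // mipmapped RenderTexture, trilinear filtering
                float _BlurMip;       // higher = blurrier

                fixed4 frag (v2f_img i) : SV_Target
                {
                    // pick an explicit mip; trilinear filtering blends between
                    // the two nearest levels so the result isn't blocky
                    return tex2Dlod(_MainTex, float4(i.uv, 0, _BlurMip));
                }
                ENDCG
            }
        }
    }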

    If you compare these types of blur, which might be the most optimized for an image effect?
    • Mips with trilinear and bilinear filtering on a downsampled RenderTexture.
    • The Blur image effect from the standard package.
    • The optimized Blur ("Blur Optimized") from the standard package.
    • Box filtering.
    • Or any other solution.

    Thanks a lot.
     
  5. jistyles

    jistyles

    Joined:
    Nov 6, 2013
    Posts:
    34
    Bit of a warning up front, kernel filtering is a BIG subject :)

    Unfortunately, generating mip maps can be quite heavy lifting for this use -- it's ALWAYS better to do your own limited downsampling mapped to your own use case (for instance, you may be able to get away with a single pass which downsamples to 1/8th resolution with 6 texture reads, instead of mipmapping always decimating down the whole power-of-two chain step by step 'til it reaches 1x1 pixel, with 4 read equivalents each).

    If it's all about blur performance, for an image effect, then it's all about reducing the total texture reads across all pixels (or, if on a compute shader, increasing concurrency / abusing parallelism).

    For a naive solution, you can figure out your sample count with some simple math (this does overly simplify it all, but it gives a good comparative idea):
    sampleCount = ScreenWidth*ScreenHeight*samplesPerPixel
    That usually ends up as "sampleCount = A LOT" if you try to simply apply a kernel to a full resolution buffer in a single pass. But if, for example, you first downsample to half size in each dimension (a quarter of the pixels) with a single bilinear read per output pixel, your math becomes:
    sampleCount = (0.25*ScreenWidth*ScreenHeight) + (0.25*ScreenWidth*ScreenHeight*samplesPerPixel)

    That's, worst case, slightly more than a quarter of the cost. That's exactly what the "Blur Optimized" image effect does (hopefully in addition to a few other things).
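    For example (numbers picked purely for illustration): at 1920x1080 with 9 samples per pixel, the naive single pass is 1920*1080*9 ≈ 18.7 million reads, while downsample-then-blur is 0.25*1920*1080 + 0.25*1920*1080*9 ≈ 5.2 million reads -- roughly 28% of the original, i.e. slightly more than a quarter.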

    So there's LOTS of further tricks to reduce this:
    - emulate more expensive kernels with sums of cheaper ones -- a common example is that several box blurs in a row look quite similar to a Gaussian.
    - downsample the screen size before applying your blur scheme (so you're operating on a smaller version)
    - use a downsample chain to compute large kernels (the Kawase method is my personal fave)
    - do a separable blur (eg, blur in 1D for X, then use the result to blur in 1D for Y)
    - abuse bilinear filtering to spread weights and reduce the samples required (see the sketch just below, which combines this with a separable blur)
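
    To make the last two concrete, here's a rough sketch of a single separable pass that folds a 9-tap Gaussian into 5 bilinear reads (the offsets/weights are the well-known "linear sampling" Gaussian values; the _Direction parameter is my own -- you'd blit once with (1,0) and once with (0,1)):

    Shader "Hidden/SeparableGaussianBlur"
    {
        Properties { _MainTex ("Texture", 2D) = "white" {} }
        SubShader
        {
            Cull Off ZWrite Off ZTest Always
            Pass
            {
                CGPROGRAM
                #pragma vertex vert_img
                #pragma fragment frag
                #include "UnityCG.cginc"

                sampler2D _MainTex;
                float4 _MainTex_TexelSize;   // (1/w, 1/h, w, h), set by Unity
                float2 _Direction;           // (1,0) for the X pass, (0,1) for the Y pass

                // 9-tap Gaussian folded into 5 reads: each off-centre offset
                // sits between two texels, so one bilinear read returns their
                // weighted average (the "abuse bilinear filtering" trick)
                static const float offsets[3] = { 0.0, 1.3846153846, 3.2307692308 };
                static const float weights[3] = { 0.2270270270, 0.3162162162, 0.0702702703 };

                fixed4 frag (v2f_img i) : SV_Target
                {
                    float2 texelStep = _Direction * _MainTex_TexelSize.xy;
                    fixed4 col = tex2D(_MainTex, i.uv) * weights[0];
                    for (int t = 1; t < 3; t++)
                    {
                        col += tex2D(_MainTex, i.uv + texelStep * offsets[t]) * weights[t];
                        col += tex2D(_MainTex, i.uv - texelStep * offsets[t]) * weights[t];
                    }
                    return col;
                }
                ENDCG
            }
        }
    }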

    For larger kernels, or dynamically changing kernels (eg, a huge blur radius, or blur dimensions that need to change per pixel), there are schemes such as summed area tables which can help here.
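
    As a taste of that last idea: the read side of a summed-area-table box blur is only ever four texture reads, no matter how big (or per-pixel variable) the radius is. Building the SAT itself is the expensive part and isn't shown here -- this is just a sketch assuming you already have it sitting in a float texture:

    Shader "Hidden/SATBoxBlur"
    {
        Properties { _MainTex ("Summed area table", 2D) = "white" {} }
        SubShader
        {
            Cull Off ZWrite Off ZTest Always
            Pass
            {
                CGPROGRAM
                #pragma vertex vert_img
                #pragma fragment frag
                #include "UnityCG.cginc"

                sampler2D _MainTex;          // float texture holding the SAT
                float4 _MainTex_TexelSize;
                float _Radius;               // blur radius in texels (> 0)

                fixed4 frag (v2f_img i) : SV_Target
                {
                    // four corner reads of the SAT give the sum over the whole
                    // box -- the cost no longer scales with the kernel size
                    float2 r = _Radius * _MainTex_TexelSize.xy;
                    float4 sum = tex2D(_MainTex, i.uv + r)
                               - tex2D(_MainTex, i.uv + float2( r.x, -r.y))
                               - tex2D(_MainTex, i.uv + float2(-r.x,  r.y))
                               + tex2D(_MainTex, i.uv - r);
                    return sum / ((2.0 * _Radius) * (2.0 * _Radius));
                }
                ENDCG
            }
        }
    }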

    Here's some more info:
    https://software.intel.com/en-us/bl...ast-real-time-gpu-based-image-blur-algorithms
    https://fgiesen.wordpress.com/2012/07/30/fast-blurs-1/
    http://stackoverflow.com/questions/4690756/separable-2d-blur-kernel
     
  6. MaT227

    MaT227

    Joined:
    Jul 3, 2012
    Posts:
    628
    @jistyles Thank you very much for your complete answer! It really helps me understand the whole process. :)
     
  7. MaT227

    MaT227

    Joined:
    Jul 3, 2012
    Posts:
    628
  8. jistyles

    jistyles

    Joined:
    Nov 6, 2013
    Posts:
    34
    When doing it manually, avoid writing into mips -- just use a smaller temporary target (or, if you're managing your own memory, allocate a dedicated target and re-use it).
    For this type of operation and to keep it rather compact, I'd do something like the following:
    1) Create two 1/4th size targets; let's call them SmallA and SmallB.
    2) Downsample to 1/8th size from your full screen buffer into SmallA in a single pass (reading 2x2 bilinear-filtered samples at half-texel offsets so you effectively get 4x4 averaged texel samples).
    3) Read from SmallA and write into SmallB, doing a blur in the X direction with Gaussian weights -- let's say an 8-sample blur radius (I'd be using the bilinear filter weight trick here to convert Gaussian weights into pixel offsets).
    4) Read from SmallB and write into SmallA, do the same but in the Y direction.

    That'll give you a cheap, fairly large blur, with minimal steps.
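
    If it helps, step 2 in shader form is roughly the sketch below (the one-texel offsets fully cover a 4x reduction per axis; for a bigger reduction you'd space the taps wider or accept some undersampling). From C# you'd grab SmallA/SmallB with RenderTexture.GetTemporary and drive the steps with Graphics.Blit plus a material per pass -- the separable blur sketch from earlier in the thread would cover steps 3 and 4.

    Shader "Hidden/Downsample4x4"
    {
        Properties { _MainTex ("Texture", 2D) = "white" {} }
        SubShader
        {
            Cull Off ZWrite Off ZTest Always
            Pass
            {
                CGPROGRAM
                #pragma vertex vert_img
                #pragma fragment frag
                #include "UnityCG.cginc"

                sampler2D _MainTex;          // full-res source, bilinear filtering on
                float4 _MainTex_TexelSize;   // 1/w, 1/h of the source

                fixed4 frag (v2f_img i) : SV_Target
                {
                    // four bilinear reads, each landing between 4 source texels,
                    // so one destination pixel averages a 4x4 texel footprint
                    float2 t = _MainTex_TexelSize.xy;
                    fixed4 col = tex2D(_MainTex, i.uv + float2(-t.x, -t.y));
                    col       += tex2D(_MainTex, i.uv + float2( t.x, -t.y));
                    col       += tex2D(_MainTex, i.uv + float2(-t.x,  t.y));
                    col       += tex2D(_MainTex, i.uv + float2( t.x,  t.y));
                    return col * 0.25;
                }
                ENDCG
            }
        }
    }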
     
  9. MaT227

    MaT227

    Joined:
    Jul 3, 2012
    Posts:
    628
    @jistyles Thank you for all your advice!