Performance issue on texture bandwidth

Discussion in 'General Graphics' started by yu-wan, Jan 10, 2017.

  1. yu-wan

    yu-wan

    Joined:
    Jan 4, 2016
    Posts:
    18
    I want to test the performance impact of using different texture formats (uncompressed vs. compressed).
    As the official Unity documentation says:
    -----------------------------------------------------------------------------------------
    Use Compressed textures to decrease the size of your textures. This can result in faster load times, a smaller memory footprint, and dramatically increased rendering performance. Compressed textures only use a fraction of the memory bandwidth needed for uncompressed 32-bit RGBA textures.
    -----------------------------------------------------------------------------------------
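    To put a number on "a fraction of the memory bandwidth", here is a minimal sketch of the storage arithmetic. The bits-per-texel figures are the standard ones for these formats; the 1024 x 1024 size and the helper name `texture_bytes` are assumptions for illustration, and mipmaps are ignored.

```python
# Bits per texel for a few common formats.
BITS_PER_TEXEL = {
    "RGBA32": 32,  # 8 bits x 4 channels, uncompressed
    "RGBA16": 16,  # 4 bits x 4 channels (4444), uncompressed
    "DXT5": 8,     # 16 bytes per 4x4 texel block
    "DXT1": 4,     # 8 bytes per 4x4 texel block
}

def texture_bytes(width, height, fmt):
    """Size in bytes of the top mip level only (hypothetical helper)."""
    return width * height * BITS_PER_TEXEL[fmt] // 8

baseline = texture_bytes(1024, 1024, "RGBA32")
for fmt in BITS_PER_TEXEL:
    size = texture_bytes(1024, 1024, fmt)
    print(f"{fmt}: {size / 2**20:.2f} MiB ({size / baseline:.3f} of RGBA32)")
```

    A smaller footprint only turns into faster rendering when the shader is actually limited by memory traffic, which is what the experiment below tries to find out.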
    I did a quick experiment. I wrote a full-screen blur effect that loops over the texture samples many times, as shown below:
    Code (CSharp):
    // Fragment shader: box blur that averages every tap in a square kernel.
    fixed4 frag_blur(v2f_blur i) : SV_Target
    {
        fixed4 color = fixed4(0, 0, 0, 0);
        int _step = iterationTime; // kernel size control; 50 gives 51 x 51 taps
        int weight = 0;            // number of taps accumulated
        float stride = 1;          // distance between taps, in texels
        for (int j = -(_step / 2); j <= (_step / 2); j++)
        {
            for (int k = -(_step / 2); k <= (_step / 2); k++)
            {
                color += tex2D(_SencondaryTex, i.uv + float2(j * stride, k * stride) * _MainTex_TexelSize.xy);
                weight++;
            }
        }

        return color / weight;
    }
    When I set iterationTime to 50, the profiler reports around 30 ms of GPU time for the 2D image effect. Here are my observations:
    1. The time is always about 30 ms, no matter whether I set the texture format to truecolor (uncompressed) or compressed.
    2. The larger the stride, the more time it takes. For example, with stride set to 10, the GPU time increases to 150 ms per frame.
    3. In which cases does the texture format (bandwidth) have a significant impact on performance?

    The reason for the test is that I'm writing a ray marching shader that takes many samples per pixel and steps between them at a pixel stride. Can anyone explain? Thank you.
     
  2. jbooth

    jbooth

    Joined:
    Jan 6, 2014
    Posts:
    5,461
    GPUs are optimized to load and decompress texture data quickly through the cache, so that multiple samples near the same location are faster. Additionally, most compression formats break the image into blocks, so that pixels near each other in the image are also near each other in memory, since GPUs often access several of those pixels at once. If they didn't do this, then sampling in one direction along a texture would be faster than the other, because the locality of the data would be less on one axis than on the other.

    Given that you are reading the same texture over and over and moving along it, you're actually doing what the GPU is best designed to do and getting good cache coherency. If you were to randomly access multiple textures in random positions, you would quickly become bandwidth limited as the caching strategy breaks down.
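    The locality argument can be made concrete with a toy model that counts how many distinct cache lines a blur kernel touches. This is only a sketch: the 64-byte line, the 2048-texel width, the 4x4 tile, and the function name `lines_touched` are all assumptions, and real GPU texture caches and swizzle patterns are vendor-specific.

```python
# Toy model: count the distinct 64-byte cache lines a blur kernel touches.
# All constants are illustrative assumptions, not measurements of real hardware.
WIDTH = 2048          # texture width in texels (assumed)
TEXELS_PER_LINE = 16  # 64-byte line / 4 bytes per RGBA32 texel
TILE = 4              # tiled layout: one 4x4 texel tile fills one line

def lines_touched(rx, ry, stride, tiled, center=(1024, 1024)):
    """Number of distinct cache lines hit by a (2*rx+1) x (2*ry+1) kernel."""
    cx, cy = center
    lines = set()
    for j in range(-rx, rx + 1):
        for k in range(-ry, ry + 1):
            x, y = cx + j * stride, cy + k * stride
            if tiled:
                # Texels in the same 4x4 tile share one line.
                lines.add((x // TILE, y // TILE))
            else:
                # Row-major: 16 consecutive texels of one row share a line.
                lines.add((y * WIDTH + x) // TEXELS_PER_LINE)
    return len(lines)

for tiled in (False, True):
    name = "tiled" if tiled else "linear"
    print(name,
          "row:", lines_touched(25, 0, 1, tiled),
          "column:", lines_touched(0, 25, 1, tiled),
          "51x51 stride 1:", lines_touched(25, 25, 1, tiled),
          "51x51 stride 10:", lines_touched(25, 25, 10, tiled))
```

    In this model a horizontal run of 51 taps touches 4 lines in the linear layout while a vertical run touches 51; the tiled layout touches 14 either way. The full 51 x 51 kernel touches roughly 200 lines at stride 1 and well over a thousand at stride 10 in both layouts, which matches the direction (not the magnitude) of the slowdown reported for larger strides.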
     
  3. yu-wan

    yu-wan

    Joined:
    Jan 4, 2016
    Posts:
    18
    Thank you, jbooth. I followed your advice and ran a random-read test at a specific stride on a PC (GTX 670). The results show a significant difference between compressed and uncompressed textures. However, RGB16 and RGB32 don't show any difference in rendering time. I'm quite curious about that.
     
  4. jbooth

    jbooth

    Joined:
    Jan 6, 2014
    Posts:
    5,461
    Are you sure the graphics card is actually using the correct format? Put a gradient in the image and make sure it bands, as the card might be allocating a 32-bit image anyway. 4444 is a pretty old format, so I wouldn't expect this, but maybe...
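    Why the gradient check works can be sketched in a few lines: a channel stored in 4 bits can hold only 16 distinct levels, so a smooth 256-step ramp collapses into visible bands. The helper name `quantize` and the ramp size are assumptions for illustration; this models the quantization only, not how any particular driver converts the texture.

```python
def quantize(value, bits):
    """Map a value in [0, 1] to the nearest integer level storable in `bits` bits."""
    max_level = (1 << bits) - 1
    return round(value * max_level)

# A smooth 256-step ramp, like one channel of an 8-bit gradient image.
gradient = [x / 255 for x in range(256)]

levels_4bit = len({quantize(v, 4) for v in gradient})
levels_8bit = len({quantize(v, 8) for v in gradient})
print("distinct levels at 4 bits per channel:", levels_4bit)
print("distinct levels at 8 bits per channel:", levels_8bit)
```

    If the on-screen gradient shows no banding, the texture is probably being stored at 8 bits per channel regardless of the import setting, which would explain identical timings for the two formats.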