Search Unity

  1. Megacity Metro Demo now available. Download now.
    Dismiss Notice
  2. Unity support for visionOS is now available. Learn more in our blog post.
    Dismiss Notice

Faster integer square root than this?

Discussion in 'Scripting' started by yoonitee, Jan 28, 2015.

  1. yoonitee

    yoonitee

    Joined:
    Jun 27, 2013
    Posts:
    2,363
    I need to find the nearest integer square root of a number between 0 and 0x10000 (65536 or 2^16). Capped at 0 and 0x100 (256). I need to do this thousands of times per frame! Using (int)Mathf.Sqrt(x) isn't going to cut it. It needs to be inline. Not a function call.

    Obviously the easiest way is to create a byte array

    Code (CSharp):
    1. byte[] squareRoot = new byte[65536];
    and initialise it with all the square roots between 0..65536. e.g. with squareRoot[16]=4 up to squareRoot[0xFFFF] = 0x100 (roughly). It doesn't really matter if it's one out.

    So:

    Code (CSharp):
    1. byte sqrt_of_x = x<0 ? 0 : ( x>= 0x10000 ? 0x100 : squareRoot[x] );
    Only problem is this a size-able chunk of memory (64kb). And a fair bit of time to loop through to initialize the array. (well a few milliseconds anyway!)

    Can you think of way where I can use a smaller array?
     
    Last edited: Jan 28, 2015
  2. toreau

    toreau

    Joined:
    Feb 8, 2014
    Posts:
    204
    I'm intrigued. Why?
     
  3. Kiwasi

    Kiwasi

    Joined:
    Dec 5, 2013
    Posts:
    16,860
    I'd do some heavy benchmarking on this before committing. Modern CPU speeds are much faster then modern memory speeds. So calculating the square root may be faster then looking it up in memory.

    How accurate do you need it? There are a bunch of approximations to the square root function that will hold over a limited range. You could even build a regular polynomial that's somewhat accurate across the range.
     
    Bunny83 likes this.
  4. yoonitee

    yoonitee

    Joined:
    Jun 27, 2013
    Posts:
    2,363
    It's for rendering a radial gradient for something I'm making. For certain reasons it must be procedural not a texture. Thus it needs the distance from each pixel to the origin. Hence needs the square root of (x^2+y^2). But anyway, it's an interesting programming problem! It doesn't need a use! ;) I set it as a challenge then. To find a solution to this problem. For numbers 0...0x10000, to find to within one integer the square root and always increasing. I think look-ups should be allowed for an array up to 256 in size. Using (int)Mathf.Sqrt doesn't give amazing performance! (Using Unity.Mathf or System.Math) What is the shortest one-line algorithm?

    I've seen this but not sure if this can be adapted for this? As I need the sqrt not the inverse sqrt.

    e.g. a good approximation for integers 0..64 which is within one integer is (int)((x+8)/8). Perhaps it's good enough to approximate the square root function with lines?
     
    Last edited: Jan 28, 2015
  5. toreau

    toreau

    Joined:
    Feb 8, 2014
    Posts:
    204
    You are definitely on the track if you precalculate the numbers in question and then vector them against any incoming number using your own specific algorithm or something like a Rainbow table (or similar).

    Bottom line is that this:

    ...won't ever help us help you. :)
     
  6. yoonitee

    yoonitee

    Joined:
    Jun 27, 2013
    Posts:
    2,363
    It's for a graphics library. I'm just optimising it at the moment. Shouldn't think it will stand or fall on the speed of rendering radial gradients but just in case I'm trying to oil every last bolt of it! Also, there are other situations such as physics simulations where a fast square root would be useful. Since there are 256 levels of colour this is the accuracy I need. But within one integer probably won't be noticeable(??)
     
  7. meatpudding

    meatpudding

    Joined:
    Jan 28, 2015
    Posts:
    39
    How about something like
    Code (csharp):
    1.  
    2. int squareRoot = 1;
    3. int n = 1;
    4. for (n = 1; n < 0x10000; ++n) {
    5.   while (squareRoot *squareRoot < n) ++sqrt;
    6.   cache[n] = squareRoot ;
    7. }
    8.  

    Edit: Oh, you're bothered about the size of the array, not the speed of precaching. Maybe you could do something with Run Length Encoding.
     
  8. yoonitee

    yoonitee

    Joined:
    Jun 27, 2013
    Posts:
    2,363
    Yep that's the one I thought of but I wanted to avoid using all 64kb of memory.

    For example I for a square root of an integer 0..256. I found this one:

    Code (CSharp):
    1. int y; //(a number from 0..256)
    2. int sqrt_y =  ( ( (7+(y / 7) ) / 2 ) + (y / ( (7+(y / 7) ) / 2) ) ) / 2 ;
    Based on Newton's method. Gives the correct square root for 0..256. But uses 9 binary operations so not sure how quick it is! I initialised it with 7 as this is an average value for the result which is in the domain 0..16 which gave the best result.

    Considering that there are only 17 numbers, you could actually get the answer in 4 compares!

    You could get the answer for the 0..0x10000 case in 16 compares using ?: notation but it would be a very long line! And I'm not sure if ? : is quicker than division.

    I suppose I could optimise the above with:

    Code (CSharp):
    1. int y; //(a number from 0..256)
    2. int t = (7+ y / 7 ) / 2;
    3. int sqrt_y =  ( t  + y / t  ) / 2 ;
    Which reduces it to 6 binary operations.

    Although this:
    Code (CSharp):
    1. sqrt_y = y<64 ? (
    2.    y<16 ? (
    3.        y<4 ? ( y==0?0:1 ) : (y<9?2:3)
    4.    ) :  (
    5.      y<36 ? ( y<25?4:5 ) : (y<49?6:7 )
    6.   )
    7. ) : (
    8.    y<16 ? (
    9.        y<100 ? ( y<81?8:9 ) : (y<9?10:11)
    10.    ) :  (
    11.      y<196 ? ( y<169?12:13) : (y<225?14:15 )
    12.   )
    13. );
    has only 4 comparrisons but might be slowed down by the branching.

    For the case of y=0..0x10000 this works:
    Code (CSharp):
    1. int y;// y=0..0x10000
    2. int t = (25+y/25)/2;
    3. t = (t+y/t)/2;
    4. t = (t+y/t)/2;
    5. t = (t+y/t)/2;
    6. int sqrt_y = y<0?0:(y>=0x10000:0x100: (t+y/t)/2) ;
    7.  
    (The 25 just seems to be the best number to start off the iterations.)
    I will test it to see if it is quicker than calling (int)Mathf.Sqrt(y). Any better ideas? This one uses 15 binary operations not including storing the values.

    TBH. I don't know whether optimizing the square root is going to make much of a difference to my project. But now I'm just interested to find the answer for the sake of it!
     
    Last edited: Jan 28, 2015
  9. yoonitee

    yoonitee

    Joined:
    Jun 27, 2013
    Posts:
    2,363
    I have found a simpler way to get the square root of an integer, y, from y=1..350:
    Code (csharp):
    1.  
    2. int sqrt_y   =  1 + (  y/14  + y/(2+y/14) )/2;
    3.  
    using just 7 binary operations. Or this one works for y=0..350:

    Code (csharp):
    1.  
    2. int sqrt_y   =  ( (3+y/7)  + (4*y)/(3+y/7) )/4;
    3.  
     
    Last edited: Jan 29, 2015
  10. wideeyenow_unity

    wideeyenow_unity

    Joined:
    Oct 7, 2020
    Posts:
    728
    I had a similar issue, and thought I had an equation that worked within my notes, but can't seem to find it. So I took this problem to heart to figure a better way:

    Code (CSharp):
    1. float SquareRoot(float num)
    2.     {
    3.         for (int i = 0; i < num; i++)
    4.         {
    5.             if (i * i > num)
    6.             {
    7.                 float squareTotal = (i - 1) * (i - 1);
    8.                 float remainder = (num - squareTotal) / (i - 1) / 2;
    9.                 return (i - 1) + remainder;
    10.             }
    11.         }
    12.         return 0; // exempt
    13.     }
    [Edited]
    However, after arguing with AI, it gave me a better method, saying that my method was a childlike version of the Babylonian method. So the more performant version is:

    Code (CSharp):
    1. float SquareRootAI(float num)
    2.     {
    3.         float guess = num / 2;
    4.         float prevGuess = guess + 1;
    5.         while(guess != prevGuess)
    6.         {
    7.             prevGuess = guess;
    8.             guess = (num / guess + guess) / 2;
    9.         }
    10.         return guess;
    11.     }
    So after dealing with hurt feelings, I then benchmarked the previous methods versus unity's built in Mathf.Sqrt() function. Knowing that benchmarking always gives a false first result, I switched the order of each the functions, and tested (21) and (800) roots.

    Unity's built in function was always the fastest, ai's code was second, and my code came in dead last. Depending on the number to square, ai code was twice as fast as mine, and unity's code was twice as fast as the ai code.

    So future advice to new coders, as of 2023, Unity's built in Sqrt is the best way as far as tick's in performance.
     
    Last edited: Jul 17, 2023