Search Unity

Fastest way to find nearest enemy in front of player

Discussion in 'Scripting' started by _Petroz, Jan 27, 2012.

  1. _Petroz

    _Petroz

    Joined:
    May 13, 2010
    Posts:
    730
    I have a function which I call regularly which loops over all objects and does some math operations that are expensive. I have been trying to optimize it, and I was wondering if anyone has any ideas how I could optimize this further.

    Here is some code: (there may be minor typos because I removed some project specific class names without compiling)

    Code (csharp):
    1.     protected GameObject FindNearestTarget(float maxAngle, float maxDistance)
    2.     {
    3.         GameObject targetGO = null;
    4.        
    5.         float minDotProduct = Mathf.Cos(maxAngle * Mathf.Deg2Rad);
    6.         float targetDistanceSqrd = maxDistance * maxDistance;
    7.        
    8.         foreach(GameObject go in enemies)
    9.         {
    10.             Vector3 relativePosition = go.transform.position - transform.position;
    11.            
    12.             float distanceSqrd = relativePosition.sqrMagnitude;
    13.             if ( (distanceSqrd < targetDistanceSqrd)  
    14.                  (Vector3.Dot(relativePosition.normalized, transform.forward) > minDotProduct) )
    15.             {
    16.                 targetDistanceSqrd = distanceSqrd;
    17.                 targetGO = go;
    18.             }
    19.         }
    20.         return targetGO;
    21.     }
    I am already using time slicing to call it every X frames rather than every frame. I do not have pro so I cannot use the profiler for this. Any help or advice would be greatly appreciated. I think the most concerning thing at the moment is the 'normailzed' call because that will be doing a square root. I was thinking of caching the enemies list per character and sorting it by distance reducing the number of angle calculations. Beyond that I'm kind of out of ideas.
     
  2. Legacy

    Legacy

    Joined:
    Oct 11, 2011
    Posts:
    651
    I know in my flash game i built for the contest i used a line cast which worked very well as it would always target the closest enemy depending on your situation though im not sure how that would work for you as mine was a 3d side scrolling hack and slash game.
     
  3. jgb143

    jgb143

    Joined:
    Jun 8, 2010
    Posts:
    132
    Could you add a collider in front of your player and only compare enemies in it? Or maybe use the Manhattan distance if it doesn't need to be exact.(no square root)

    Also, is there a reason you're not using the built in distance functions?
     
  4. mgear

    mgear

    Joined:
    Aug 3, 2010
    Posts:
    9,448
  5. hippocoder

    hippocoder

    Digital Ape

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    The best way would be to have a simple box collider + rigidbody set to isKinematic (need it to have a rigidbody to run faster) and attached to player. Then you use OnTriggerEnter and OnTriggerExit to build a list of objects that are in front of the player. This is a really optimised way to do it providing of course that your objects actually have rigidbodies themselves or your trigger will not work correctly.

    Of course you'll want to check the list so you do not add duplicate entries. You could use a dictionary as well.

    Ultimately the trigger technique means your list will be rather small - as Physx will have done the grunt work of checking what is even remotely in your view and range to begin with.

    Once you have the list, it is trivial to sort it. You could also just sort say 10 items a frame and so on, plenty of optimization ideas.

    Could be slower, not sure.
     
    Last edited: Jan 27, 2012
  6. _Petroz

    _Petroz

    Joined:
    May 13, 2010
    Posts:
    730
    Thanks for all the awesome feedback guys. Now that I think about it leveraging the inbuilt physics system makes a lot more sense than trying to roll my own system in managed code.

    Unfortunately my objects don't have rigid bodies at the moment, I will look into it and work out why not. I think I had a problem with nested colliders when using rigid bodies. I'll dig into it and see if I can make it work.
     
  7. mweldon

    mweldon

    Joined:
    Apr 19, 2010
    Posts:
    109
    There is a potential early-out case for your existing code. If Dot( relativePosition, transform.forward ) is < 0 then the enemy is behind the player and there is no need to do any other checks. This would avoid having to calculate the sqrMagnitude and might avoid doing a normalize. I'm assuming that the shoot angle would never be wider than a half-circle.
     
  8. _Petroz

    _Petroz

    Joined:
    May 13, 2010
    Posts:
    730
    Thanks for the suggestion mweldon. At the moment there is no restriction on the angle size but if it is < 90 then it is a worthwhile optimization. Here is the modified code:

    Code (csharp):
    1.         protected GameObject FindNearestTarget(float maxAngle, float maxDistance)
    2.         {
    3.             GameObject targetGO = null;
    4.            
    5.             float minDotProduct = Mathf.Cos(maxAngle * Mathf.Deg2Rad);
    6.             float targetDistanceSqrd = maxDistance * maxDistance;
    7.            
    8.             foreach(GameObject go in enemies)
    9.             {
    10.                 if (maxAngle < 90f  Vector3.Dot(relativePosition, transform.forward) <= 0f)
    11.                    continue; // Early out
    12.                    
    13.                 Vector3 relativePosition = go.transform.position - transform.position;
    14.                    
    15.                 float distanceSqrd = relativePosition.sqrMagnitude;
    16.                 if ( (distanceSqrd < targetDistanceSqrd)
    17.                      (Vector3.Dot(relativePosition.normalized, transform.forward) > minDotProduct) )
    18.                 {
    19.                     targetDistanceSqrd = distanceSqrd;
    20.                     targetGO = go;
    21.                 }
    22.             }
    23.             return targetGO;
    24.         }
     
    Last edited: Jan 28, 2012
  9. fholm

    fholm

    Joined:
    Aug 20, 2011
    Posts:
    2,052
    Fastest way to find nearest enemy in front of player? do a box collider that encapsulates the entire view port (or rather as much as you need) and intersect it in front of the player. Or a frustrum collider.

    If it's a straight line use a ray cast.
     
    Last edited: Jan 28, 2012
  10. npsf3000

    npsf3000

    Joined:
    Sep 19, 2010
    Posts:
    3,830
    Use System.Diagnostic.Stopwatch and tell us how many ms you are currently spending on this.

    because I'm not going to volunteer my time to help fix something that could, for all I know, take less than a ms to compute!
     
  11. Ziron999

    Ziron999

    Joined:
    Jan 22, 2014
    Posts:
    282
    I have tested these methods and rigidbodies is by far the fastest...however...it still can't handle more then 160 units in my RTS :(
     
  12. Todd-Wasson

    Todd-Wasson

    Joined:
    Aug 7, 2014
    Posts:
    1,079
    The first thing I'd do if you're not trying to redo the algorithm is to expand out all the operations and get rid of the Dot products, vector computations, etc.. There is potentially a huge amount of speed to be gained there, possibly more than coming up with some more clever algorithm (although that wouldn't hurt either). I wrote an article on this recently showing a few ways to increase speed significantly by skipping most of the method calls completely:

    http://www.performancesimulations.c...ses-in-unitys-physics-or-any-math-heavy-code/

    Just taking a quick look at your code, I'd change this:
    Code (csharp):
    1.  
    2.  Vector3 relativePosition = go.transform.position - transform.position;
    3.  
    To this:

    Code (csharp):
    1.  
    2. //Before the foreach loop, cache your transform.position so you aren't requesting the same data over and over for every game object.
    3.  
    4. Vector3 transformPosition = transform.position;
    5.  
    6. //I'd also declare the relativePosition variable right there:
    7.  Vector3 relativePosition;
    8.  
    9. //Then in your foreach loop, unwind it and compute it like this:
    10. relativePosition.x = go.transform.position.x - transformPosition.x;
    11. relativePosition.y = go.transform.position.y - transformPosition.y;
    12. relativePosition.z= go.transform.position.z - transformPosition.z;
    13.  
    I'd also compute the sqrMagnitude manually instead of using the .sqrMagnitude method. This will be way, way faster. So I'd change this:
    Code (csharp):
    1.  
    2. float distanceSqrd = relativePosition.sqrMagnitude;
    3.  
    To this:
    Code (csharp):
    1.  
    2. float distanceSqrd = relativePosition.x * relativePosition.x + relativePosition.y * relativePosition.y + relativePosition.z * relativePosition.z;
    3.  
    I'd also get rid of the .Dot call here and compute the dot manually, store it in a float and then do the comparison on that float. Instead of this:

    Code (csharp):
    1.  
    2. Vector3.Dot(relativePosition.normalized, transform.forward)
    3.  
    I'd do this:

    Code (csharp):
    1.  
    2. //Before the foreach loop, cache the transform.forward into a Vector3 like this:
    3. Vector3 transformForward = transform.forward;
    4.  
    5. //Then in the foreach loop in place of the Dot call, I'd do something like this:
    6.  
    7. Vector3 rPNormalized = relativePosition.normalized;
    8. float dot = rPNormalized.x * transformForward.x + rPNormalized.y * transformForward.y + rPNormalized.z * transformForward.z
    9.  
    Then you can do your comparison with minDotProduct using the float "dot" instead of the member function call. I'd have to profile it to be sure, but I wouldn't be surprised to see your code run 5 or 10 times faster or more by paying attention to this kind of thing. Skip method calls and overloaded math here whenever you can. You might be faster normalizing the rPNormalized variable yourself rather than using the relativePosition.normalized method call like I showed.

    I've heard that "foreach" is something to avoid because it's allocating new memory or something along those lines, but have not speed tested it so am not really sure if that's true or not. From what I read, it's supposedly slower than a regular "for" loop. Maybe somebody here can verify whether that's correct or not.

    I haven't read the whole thread, maybe others have suggested better algorithms. If so, great, but I'd still be sure to expand everything similarly to what's shown here if this is speed critical stuff.
     
    gamer_boy_81 likes this.
  13. image28

    image28

    Joined:
    Jul 17, 2013
    Posts:
    457
  14. Todd-Wasson

    Todd-Wasson

    Joined:
    Aug 7, 2014
    Posts:
    1,079
    If you're able to get away with only computing the distances every few frames like that, it might be worth the trouble to spread out the computations so it does groups of objects every frame instead. This way you avoid spiking every few frames which might keep things smoother. I don't know if that's a potential problem in your case, it may not matter for you.

    In other words, if you have 100 objects, you compute the distances to 10 of them the first frame, the next 10 on the second frame, etc.. Something like that. My particle system in VRC Pro worked like that, I think it was splitting everything up into 8 or so groups and only updating one of them every physics cycle. Granted, my physics frequency was 250 or 500 depending on the car so it was doing a lot more updates than most games do. The particles weren't updated once per graphics frame, it was once per physics frame instead.

    If that's something you wanted to try, maybe you could change your function to be something like this:

    Code (csharp):
    1. FindTargetDistances (int startObjectIndex, int endObjectIndex, float maxDistance)
    2. {
    3. //Do stuff
    4. }
    5.  
    startObjectIndex and endObjectIndex could be indices into an array of object positions so you could compute distances to object 0-9 on one frame, 10-19 on the next, etc., in a regular "for" loop (instead of "foreach") using the startObjectIndex and endObjectIndex values as the indices into the array. Then every few frames, after you've computed all of the distances, you could then quickly scan through the array and find the closest one. Or probably even better would be to just keep track of the closest object and distance while you're doing the groups themselves, remembering to reset it all as soon as you've cycled over all of them. It depends how accurate you want it to be and how important it is that you really got the closest one 100% of the time. Maybe in your case something like this would be enough.

    Square root: I just tested this and Mathf.Sqrt() was indeed about 20 times slower than an addition, so you're right to be concerned with speed there. This surprised me, my understanding was that square roots are pretty fast on today's hardware, not that much different than doing a regular math op. To me it looks like most of that speed difference is due to the overhead of the method call itself.

    To get some idea of that I also tried an overloaded addition on a Vector3 which includes a method call. The square root was only twice as slow as that. Granted, this is comparing adding three numbers together rather than just one, so I'm not too sure. My gut tells me that most of the time is due to the overhead of the method call.

    Anyway, here's the test code I profiled along with results at the end for what it's worth:

    Code (csharp):
    1. void TestAddition()
    2.     {
    3.         float a = 1;
    4.         float b = 2;
    5.         float c;
    6.        
    7.         for (int i = 0; i < numberOfLoops; i++)
    8.         {
    9.             c = a + b;
    10.         }
    11.     }
    12.      void TestAdditionOverloadVector3()
    13.     {
    14.         Vector3 v0;
    15.         Vector3 v1;
    16.         Vector3 v2;
    17.  
    18.         v0.x = 1;
    19.         v0.y = 2;
    20.         v0.z = 3;
    21.  
    22.         v1.x = 4;
    23.         v1.y = 5;
    24.         v1.z = 6;
    25.  
    26.         for (int i = 0; i < numberOfLoops; i++)
    27.         {
    28.             v2 = v0 + v1;
    29.         }
    30.     }
    31.     void TestSquareRoot()
    32.     {
    33.         float a = 123;
    34.         float b;
    35.  
    36.         for (int i = 0; i < numberOfLoops; i++)
    37.         {
    38.             b = Mathf.Sqrt(a);            
    39.         }
    40.     }
    Results for numberOfLoops = 40000:
    TestAdditionOverloadVector3() -> 6.06 ms
    TestSquareRoot() -> 3.17 ms
    TestAddition() -> 0.16 ms

    The square root is actually twice as fast as doing an addition on two Vector3s through the overload operation that involves a method call. 3 additions in the overloaded case takes twices as long as 1 square root then, so if it's reasonable to take the addition test time down to 1/3rd (2.02 ms instead of 6.06ms) maybe you could say the square root is only 56% slower than a single addition? I.e., the square root is like doing 1.5 additions then if you're comparing an overloaded math operation with a square root done through the Mathf.Sqrt() method. So it's not really computing the square root itself that's so slow, it's the method call overhead. You're doing that anyway every time you add or subtract Vector3's using an overload.

    If you're doing most of your math with the overloads anyway which it appears you are, then the square root isn't really that big of a deal in comparison and I wouldn't get too worked up about it. The big gains come from expanding the code to get rid of the method calls. If you're unrolling everything for speed like I normally do (see the TestAddition() function), then the square root is something like 20 times slower than an addition. Then the square root is kind of a big deal.

    Basically what I'm saying is that I would advize somebody that's unrolling their math to skip square roots, but if they're just using Vector2 or Vector3 overloads and Dot() and so forth for their math anyway, a couple of square roots here and there isn't that big of a deal because it's going to be a lot slower anyway. Then it's like putting lipstick on a pig, there's not much point.

    By the way, in another test not shown here, the Dot() function (3.02 ms) and the Sqrt() (3.06 ms) function were almost the same.

    Normalizing:

    Here are two tests, one uses the normalize() method and the other is expanded code with no method calls aside from the necessary Mathf.Sqrt() call. I had to reduce the numberOfLoops here to 10000 (25% of the original value), so don't equate these numbers directly to the previous tests:

    Code (csharp):
    1.     void TestNormalizeMethodCall()
    2.     {
    3.         Vector3 v0;
    4.         Vector3 vNormalized;
    5.  
    6.  
    7.         v0.x = 1;
    8.         v0.y = 2;
    9.         v0.z = 3;
    10.  
    11.  
    12.         for (int i = 0; i < numberOfLoops; i++)
    13.         {
    14.             vNormalized = v0.normalized;
    15.         }
    16.  
    17.     }
    18.  
    19.     void TestNormalizeNoMethodCall()
    20.     {
    21.         Vector3 v0;
    22.         Vector3 vNormalized;
    23.  
    24.         v0.x = 1;
    25.         v0.y = 2;
    26.         v0.z = 3;
    27.  
    28.         for (int i = 0; i < numberOfLoops; i++)
    29.         {
    30.             float vLengthSquared = v0.x * v0.x + v0.y * v0.y + v0.z * v0.z;
    31.  
    32.             float vLength = Mathf.Sqrt(vLengthSquared);
    33.  
    34.             if (vLengthSquared != 0)
    35.             {
    36.                 vNormalized.x = v0.x / vLengthSquared;
    37.                 vNormalized.y = v0.y / vLengthSquared;
    38.                 vNormalized.z = v0.z / vLengthSquared;
    39.             }
    40.         }
    41.     }
    42.  
    Results at numberOfLoops = 10000:
    TestNormalizeMethodCall() -> 4.45 ms
    TestNormalizeNoMethodCall() -> 0.88 ms

    So even with the square root in the expanded, NoMethodCall version, it ran 5 times faster than using the .normalized property which is no doubt doing a square root method call internally plus the overhead of the .normalized method call itself. So it's not so much the square root in the operation that's slowing it down, it's probably the overhead of the .normalized call itself that's to be avoided when speed really matters.

    In cases where you can be sure the vector lengths are never 0, the "if" statement checking for the 0 length vector can be removed which speeds it up a little bit more. It's not much though (a few percent) so I wouldn't do that unless you really needed to wring out every last little bit. That's a bug waiting to happen though, so probably isn't worth it.

    If you're comparing enough objects, you might get even more speed by plopping all the positions into an array like described at the first part of this post just because it should be more CPU cache friendly. When you're grabbing positions out of objects scattered around in memory it can slow things down a lot. It might be worth looking at copying the positions into an array, then passing that array into the function and doing all the comparisons there in a small, simple, CPU cache friendly function. With the numbers all packed together in memory in sequential order you might get a significant speedup on top of the rest of the stuff.

    Having said all that, I don't know how many objects you're dealing with here. If this part of your code is only taking 0.01 ms right now, it's not worth any of this trouble because the user won't ever notice the difference anyway.
     
    Last edited: Feb 4, 2015
    tomicz likes this.