8

I am needing to do a lookup on a large bool set using x,y,z coordinates(integers only). I am trying to determine if a block is traversable at x,y,z location.

Currently I am using an array bool[,,]. However, this is not very flexible, and the size quickly gets massive for a decent sized map.

I thought a dictionary that only holds the true values would be much more flexible and less memory hungry. I created a struct Vector3Int to hold the x,y,z values and use as a dictionary key. Dictionary looks like Dictionary<Vector3Int, bool>.

However, doing Dictionary lookups with this key appears to be 20-100 times slower than the array lookups with x,y,z integers.

Is there a faster/better way to do this? I am using the lookups for pathfinding, so the lookup needs to be quite fast as there may be hundreds/thousands of lookups for a single path.

Vector3Int code:

public struct Vector3Int
{
public int x,y,z;

public Vector3Int(int x, int y, int z)
{
    this.x =x;
    this.y=y;
    this.z=z;
}

//checks for equality
public static bool operator ==(Vector3Int a, Vector3Int b)
{
    return a.x==b.x && a.y ==b.y && a.z ==b.z;
}

public override bool Equals(object obj)
{
    return obj is Vector3Int && this == (Vector3Int)obj;
}

public override int GetHashCode ()
{
    return x.GetHashCode() ^ y.GetHashCode() ^ z.GetHashCode();
}

//checks the two vectors for non-equality
public static bool operator !=(Vector3Int a, Vector3Int b)
{
    return !(a==b);
}
}

Array lookup:

bool[,,] worldWalkable= new bool[1000,1000,1000];

public bool CheckWalkable(int x, int y, int z)
{
    return worldWalkable[x,y,z];
}

Vs Dictionary Lookup:

Dictionary<Vector3Int, bool> worldTraversable = new Dictionary<Vector3Int, bool>();

public bool CheckTraversable(Vector3Int block)
{
    bool tempBool;
    worldTraversable.TryGetValue(block, tempBool);
    return tempBool;
} 

Test code:

using UnityEngine;
using System.Collections;
using System.Collections.Generic;

public class SpeedTest : MonoBehaviour {

Dictionary<Vector3Int, bool> testDict;
bool[,,] testArray;

// Use this for initialization
void Start () 
{
    InitializeContainers();
    RunTest();

}

void InitializeContainers()
{
    testDict= new Dictionary<Vector3Int, bool>(1000);
    for(int i=0; i<10; i++)
    {
        for(int j=0; j<10; j++)
        {
            for(int k=0; k<10; k++)
            {
                testDict[new Vector3Int(i,j,k)]=true;
            }
        }
    }

    testArray=new bool[10,10,10];
}

void RunTest()
{
    bool tempBool;
    float timer1, timer2;

    timer1=Time.realtimeSinceStartup;
    for(int i=0; i<10; i++)
    {
        for(int j=0; j<10; j++)
        {
            for(int k=0; k<10; k++)
            {
                tempBool= testDict[new Vector3Int(i,j,k)];
            }
        }
    }
    timer2=Time.realtimeSinceStartup;
    Debug.Log("Dictionary completion time: "+(timer2-timer1)*1000+"ms");

    timer1=Time.realtimeSinceStartup;
    for(int i=0; i<10; i++)
    {
        for(int j=0; j<10; j++)
        {
            for(int k=0; k<10; k++)
            {
                tempBool=testArray[i,j,k];
            }
        }
    }
    timer2=Time.realtimeSinceStartup;

    Debug.Log("Array completion time: "+(timer2-timer1)*1000+"ms");
}
}

Edit: Revised struct that fixes the problem:

using UnityEngine;
using System.Collections;
using System;
using System.Collections.Generic;

//makes a Vector3 of integers
public struct Vector3Int : IEquatable<Vector3Int>
{
public int x,y,z;

public Vector3Int(int x, int y, int z)
{
    this.x =x;
    this.y=y;
    this.z=z;
}

public bool Equals(Vector3Int other)
{
    return other.x==this.x && other.y==this.y && other.z==this.z;
}

public static bool operator ==(Vector3Int a, Vector3Int b)
{
    return a.Equals(b);
}

public override bool Equals(object obj)
{
    if(obj==null || !(obj is Vector3Int))
        return false;

    return Equals((Vector3Int)obj);
}

public override int GetHashCode ()
{
    return x ^ (y<<10) ^ (z<<10);
}

public override string ToString ()
{
    return string.Format ("("+x+", "+y+", "+z+")");
}

public static bool operator !=(Vector3Int a, Vector3Int b)
{
    return !(a==b);
}
}

The issue was boxing/unboxing like Alexei suggested. Used the directions from this link to fix the issue.

Community
  • 1
  • 1
Jeremy Diaz
  • 103
  • 1
  • 1
  • 6
  • the lookup time of a normal array is always the fastest: `O(1)` – Jibbow Apr 20 '15 at 16:46
  • And just a guess (but this really depends on the complexity of the map): You could store the coordinates of the `false` values in an `array` (ideally sorted) an just check whether the coordinate is in it or not. This would consume less space but is a little slower – Jibbow Apr 20 '15 at 16:52
  • You had a sparse-array and you traded memory for speed. Sounds about right. Your implementation of GetHashCode() looks pretty good too. – Moby Disk Apr 20 '15 at 16:53
  • 1
    @Jibbow That is a good idea: Maybe HashSet or SortedSet would be a better fit than Dictionary, if all the OP needs is "yes, in the set" or "no, not in the set" – Moby Disk Apr 20 '15 at 16:54
  • Are you accessing the points randomly, or in some order? An array that size doesn't seem like it would fit in the cache. But if you are traversing a lot of adjacent points then the array will perform really really well. Maybe another structure would be better suited if you know something about the access pattern. – Moby Disk Apr 20 '15 at 16:56
  • Side note: deleting your previous version of this question is not the right way to update question on SO... – Alexei Levenkov Apr 20 '15 at 17:01
  • Please show code that demonstrates 20-100x time difference. Code you've posted makes sense (also it does not compile as is) and should not cause such slowness. – Alexei Levenkov Apr 20 '15 at 17:08
  • 1
    Dictionary or Hashset will not compare to `return worldWalkable[x,y,z];` Few other ideas which come to mind is, use a bitarray, or a sparse bitarray to store the information. – Vikas Gupta Apr 20 '15 at 17:14
  • @Jibbow What difference would it make if the array is sorted? I am accessing the array/dictionary in a fairly random fashion. – Jeremy Diaz Apr 20 '15 at 18:19
  • @Alexei How do you remove the duplicate tag? I changed the parts of my post that needed to be changed, but there was no way to remove the duplicate flag that I could see. Also, the code compiles fine for me- using Monodevelop with Unity. – Jeremy Diaz Apr 20 '15 at 18:21
  • @JeremyDiaz http://stackoverflow.com/help/reopen-questions - Edit should have put it in re-open queue and if edit addressed the concerns you'd quickly get it reopened (http://meta.stackoverflow.com/questions/256567/which-edits-push-closed-questions-to-the-reopen-review-queue/256572#256572). In that particular case you could have just added comment to me and I'd reopen the question (probably)... Note that you still have not shown your benchmarking code... – Alexei Levenkov Apr 20 '15 at 18:29
  • @Alexei Added the benchmark code I used. The size of both the array and the dictionary are smaller than they should be, (takes forever to run on larger sizes). – Jeremy Diaz Apr 20 '15 at 18:54
  • @JeremyDiaz note that your benchmark is somewhat dangerous as you have dictionary entry for each element which is not where it would be useful. With memory overhead of dictionary you probably will see any value of dictionary at 10% or less of used cells. Also consider `HashSet` if you strictly need bool values. – Alexei Levenkov Apr 21 '15 at 14:56

4 Answers4

6

20-100 difference sounds about right for Dictionary vs. array. It depends on how big the data set is and whether it fits in the CPU cache. Dictionary does a lot more than to execute a single load instruction plus some address calculation. It is O(1), too, but with a higher constant factor.

You can go a few times (like 3x) faster than the built in Dictionary by implementing your own hash table and removing stuff you don't need. Especially the % operation that it uses is surprisingly slow. When you replace it with a bitmask this immediately show up on profiles as an improvement.

Also note that your hash function is likely to generate a lot of collisions. For example all permutations of the same coordinates hash to the same slot. I'd try rotate(x, 23) ^ rotate(y, 11) ^ z.

usr
  • 168,620
  • 35
  • 240
  • 369
4

According to How is GetHashCode() implemented for Int32? the int hashcode is the int itself. So your Vector3Int hashcode generates a large number of collisions. You can write a better distributed hashcode for your vector to resolve that problem. See What is the best algorithm for an overridden System.Object.GetHashCode? for more information on that.

Community
  • 1
  • 1
Sign
  • 1,919
  • 18
  • 33
3

Frequently you can significantly speed up operations on structs by using proper generic methods that remove need for boxing/unboxing.

In case of Dictionary implementing IEquatable<T> is good starting point:

public struct Vector3Int : IEquatable<Vector3Int>
{
    public int x,y,z;

    public bool Equals(Vector3Int other)
    {
       return  x == other.x && y == other.y && z == other.z;
    }

    public override int GetHashCode ()
    {
        return x^ (y <<5)^ (z << 5);
    }
    ....

Note that you still need decent GetHashCode to avoid collisions. If your ranges are limited (i.e. 0-1024) you may get away with very basic <<10 to get perfect hashing function. Otherwise check out other links provided like What is the best algorithm for an overridden System.Object.GetHashCode?

Community
  • 1
  • 1
Alexei Levenkov
  • 98,904
  • 14
  • 127
  • 179
  • Do you mean x<1024, y<1024, z<1024? or x*y*z<1024? – Jeremy Diaz Apr 21 '15 at 14:44
  • @JeremyDiaz yes - 32/3 is about 10, so to fit 3 integers values into single `int` each of them need to have at most 10 bits (hence 0-1024 range). It would only matter if you really need to optimize for particular numbers, using good general hash function (possibly inlined) is safer than hardcoding special knowledge... – Alexei Levenkov Apr 21 '15 at 14:51
  • Implemented IEquatable into my struct, caused a dramatic increase in dictionary speed. Thank you Alexei, that worked great! I did add another override according to link in my last edit for object.equals. – Jeremy Diaz Apr 22 '15 at 16:42
1

The lookup of false values in bool[,,] can't be done any faster as it has a compexity of O(1). But as a counter argument it consumes a lot of space.

If your map doesn't has too many false "places", another approach would be storing just the coordinates of the false values and always look up whether this list contains this coordinate or not. For increasing the lookup time of this list you could use a HashSet<T> which also has O(n) but consumes a little more space than a sorted list or array and is a little slower (some hashes have to be computed multiple times). The other way you could store the false values is in a sorted kind of list, which gives you the best memory consumption but the lookup time is worse compared to the other solutions (O(log n)). It should be sorted because you are searching values. Searching in an unsorted array would take O(n)

Jibbow
  • 757
  • 1
  • 8
  • 17