1

I have a double[] array, i want to use it as key (not literally, but in the way that the key is matched when all the doubles in the double array need to be matched)

What is the fastest way to use the double[] array as key to dictionary?

Is it using Dictionary<string, string> (convert double[] to a string) or anything else like converting it

william007
  • 17,375
  • 25
  • 118
  • 194

4 Answers4

3

Given that all key arrays will have the same length, either consider using a Tuple<,,, ... ,>, or use a structural equality comparer on the arrays.

With tuple:

var yourDidt = new Dictionary<Tuple<double, double, double>, string>();
yourDict.Add(Tuple.Create(3.14, 2.718, double.NaN), "da value");

string read = yourDict[Tuple.Create(3.14, 2.718, double.NaN)];

With (strongly typed version of) StructuralEqualityComparer:

class DoubleArrayStructuralEqualityComparer : EqualityComparer<double[]>
{
    public override bool Equals(double[] x, double[] y)
    {
        return System.Collections.StructuralComparisons.StructuralEqualityComparer
            .Equals(x, y);
    }

    public override int GetHashCode(double[] obj)
    {
        return System.Collections.StructuralComparisons.StructuralEqualityComparer
            .GetHashCode(obj);
    }
}

...

var yourDict = new Dictionary<double[], string>(
    new DoubleArrayStructuralEqualityComparer());

yourDict.Add(new[] { 3.14, 2.718, double.NaN, }, "da value");

string read = yourDict[new[] { 3.14, 2.718, double.NaN, }];

Also consider the suggestion by Sergey Berezovskiy to create a custom class or (immutable!) struct to hold your set of doubles. In that way you can name your type and its members in a natural way that makes it more clear what you do. And your class/struct can easily be extended later on, if needed.

Jeppe Stig Nielsen
  • 60,409
  • 11
  • 110
  • 181
  • It seems the tuple approach is faster. But the problem is the size of the array is not fixed beforehand, it is a parameter input by the user. Can var yourDidt = new Dictionary, string>(); be created at runtime, based on the size of double array I have? say if double array is of size four, then I need var yourDidt = new Dictionary, string>(); – william007 Feb 25 '14 at 09:39
  • @william007 No, unfortunately you cannot use `Tuple` if the length is not known at compile-time. In that case, use the array solution. – Jeppe Stig Nielsen Feb 25 '14 at 10:00
  • No problem, I have found another way to make it even faster, which I have posted in the comment. – william007 Feb 25 '14 at 10:07
2

Thus all arrays have same length and each item in array have specific meaning, then create class which holds all items as properties with descriptive names. E.g. instead of double array with two items you can have class Point with properties X and Y. Then override Equals and GetHashCode of this class and use it as key (see What is the best algorithm for an overriding GetHashCode):

Dictionary<Point, string>

Benefits - instead of having array, you have data structure which makes its purpose clear. Instead of referencing items by indexes, you have nice named property names, which also make their purpose clear. And also speed - calculating hash code is fast. Compare:

double[] a = new [] { 12.5, 42 };
// getting first coordinate a[0];
Point a = new Point { X = 12.5, Y = 42 };
// getting first coordinate a.X
Community
  • 1
  • 1
Sergey Berezovskiy
  • 232,247
  • 41
  • 429
  • 459
1

[Do not consider this a separate answer; this is an extension of @JeppeStigNielsen's answer]

I'd just like to point out that you make Jeppe's approach generic as follows:

public class StructuralEqualityComparer<T>: IEqualityComparer<T>
{
    public bool Equals(T x, T y)
    {
        return StructuralComparisons.StructuralEqualityComparer.Equals(x, y);
    }

    public int GetHashCode(T obj)
    {
        return StructuralComparisons.StructuralEqualityComparer.GetHashCode(obj);
    }

    public static StructuralEqualityComparer<T> Default
    {
        get
        {
            StructuralEqualityComparer<T> comparer = _defaultComparer;

            if (comparer == null)
            {
                comparer = new StructuralEqualityComparer<T>();
                _defaultComparer = comparer;
            }

            return comparer;
        }
    }

    private static StructuralEqualityComparer<T> _defaultComparer;
}

(From an original answer here: https://stackoverflow.com/a/5601068/106159)

Then you would declare the dictionary like this:

var yourDict  = new Dictionary<double[], string>(new StructuralEqualityComparer<double[]>());

Note: It might be better to initialise _defaultComparer using Lazy<T>.


[EDIT]

It's possible that this might be faster; worth a try:

class DoubleArrayComparer: IEqualityComparer<double[]>
{
    public bool Equals(double[] x, double[] y)
    {
        if (x == y)
            return true;

        if (x == null || y == null)
            return false;

        if (x.Length != y.Length)
            return false;

        for (int i = 0; i < x.Length; ++i)
            if (x[i] != y[i])
                return false;

        return true;
    }

    public int GetHashCode(double[] data)
    {
        if (data == null)
            return 0;

        int result = 17;

        foreach (var value in data)
            result += result*23 + value.GetHashCode();

        return result;
    }
}

...

var yourDict = new Dictionary<double[], string>(new DoubleArrayComparer());
Community
  • 1
  • 1
Matthew Watson
  • 104,400
  • 10
  • 158
  • 276
  • I have tested, this approach seems to be slower then using Dictionary and Dictionary string> as what have mentioned here. Dictionary string> works fastest. But I have a question, have listed in the comments in Jeppe post. – william007 Feb 25 '14 at 09:40
  • @william007 Is the alternative code I posted any faster? – Matthew Watson Feb 25 '14 at 09:49
  • Yes, your DoubleArrayComparer is the fastest now! I have posted a comment on the speed. – william007 Feb 25 '14 at 10:16
0

Ok this is what I found so far:

I input an entry (length 4 arrray) to the dictionary, and access it for 999999 times on my machine:

Dictionary<double[], string>( new DoubleArrayStructuralEqualityComparer()); takes 1.75 seconds

Dictionary<Tuple<double...>,string> takes 0.85 seconds

The code below takes 0.1755285 seconds, which is the fastest now! (in line with the comment with Sergey.)

The fastest - The code of DoubleArrayComparer by Matthew Watson takes 0.15 seconds!

    public class DoubleArray
    {
        private double[] d = null;
        public DoubleArray(double[] d)
        {
            this.d = d;
        }

        public override bool Equals(object obj)
        {
            if (!(obj is DoubleArray)) return false;

            DoubleArray dobj = (DoubleArray)obj;
                if (dobj.d.Length != d.Length) return false;

                for (int i = 0; i < d.Length; i++)
                    {
                        if (dobj.d[i] != d[i]) return false;
                    }
                return true;

        }
        public override int GetHashCode()
        {
            unchecked // Overflow is fine, just wrap
            {
                int hash = 17;
                for (int i = 0; i < d.Length;i++ )
                {
                    hash = hash*23 + d[i].GetHashCode();
                }

                return hash;
            }
        }
    }
william007
  • 17,375
  • 25
  • 118
  • 194
  • It seems to me that you focus too much on speed, but even then you do not measure the speed in a particularly relevant matter. Why query a `Dictionary<,>` with only one entry that many times? I think you should focus more on which code looks and feels correct for the problem you want to solve. Regarding the code above: In `Equals` you type-check for `DoubleArray` twice. Your `GetHashCode()` can get really slow for long arrays. One could renounce on using each and every entry if the array was huge. – Jeppe Stig Nielsen Feb 25 '14 at 10:32
  • @JeppeStigNielsen Thanks for the comment, I am querying it multiple time, since the concern of the performance here is the dictionary accessing time, and I believe this is a good approximation to judge how good a method is :) Regarding the GetHashCode(), do you know of any good approach given large array? – william007 Feb 25 '14 at 10:53
  • Have you measured on a dictionary with really many entries, maybe such that the key arrays are long as well? But it all depends on how you are going to use it. If the arrays have length 1000, say, will it be very common that the first `999` entries are the same, and the difference appears only late? If not, maybe stop your `for` loop early. What is fast, depends on how your data looks. – Jeppe Stig Nielsen Feb 25 '14 at 10:58
  • @JeppeStigNielsen what you saying are really making sense - we should design it according to the data. – william007 Feb 25 '14 at 12:12