8

I am working on a problem and I have problem with execution times becoming too large and now Im looking for possible optimizations.

The question: Is there any (considerable) difference in performance between using String or Integer as haskey?

The problem is I have a graph with nodes stored in a hashtable with String as key. For example the keys are as follows - "0011" or "1011" etc. Now I could convert these to integers aswell if this would mean a improvement in execution time.

Euklides
  • 564
  • 1
  • 10
  • 35
  • 1
    String's hashcode is cached so it should not make much difference. The problem is probably somewhere else... You should profile your code to find the bottleneck. Note: in a single threaded situation, HashMap will perform slightly better than Hashtable. – assylias Mar 26 '13 at 08:20
  • 1
    I suggest you profile your application and see when it is spending most of it's time. It is highly unlikely that changing the key time this way will make a difference. – Peter Lawrey Mar 26 '13 at 08:32

3 Answers3

4

Integer will perform better than String. Following is code for the hashcode computation for both.

Integer hash code implementation

/**
     * Returns a hash code for this <code>Integer</code>.
     *
     * @return  a hash code value for this object, equal to the 
     *          primitive <code>int</code> value represented by this 
     *          <code>Integer</code> object. 
     */
    public int hashCode() {
    return value;
    }

String hash code implementation

 /**
     * Returns a hash code for this string. The hash code for a
     * <code>String</code> object is computed as
     * <blockquote><pre>
     * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
     * </pre></blockquote>
     * using <code>int</code> arithmetic, where <code>s[i]</code> is the
     * <i>i</i>th character of the string, <code>n</code> is the length of
     * the string, and <code>^</code> indicates exponentiation.
     * (The hash value of the empty string is zero.)
     *
     * @return  a hash code value for this object.
     */
    public int hashCode() {
    int h = hash;
    if (h == 0) {
        int off = offset;
        char val[] = value;
        int len = count;

            for (int i = 0; i < len; i++) {
                h = 31*h + val[off++];
            }
            hash = h;
        }
        return h;
    }
Jabir
  • 2,776
  • 1
  • 22
  • 31
3

If you have performance problem, it's quite unlikely that the issue is due to the HashMap/HashTable. While hashing string is slightly more expensive than hashing integers, it's rather small difference, and hashCode is cached so it's not recalculated if you use the same string object, you are unlikely to get any significant performance benefit from converting it first to integer.

It's probably more fruitful to look somewhere else for the source of your performance issue. Have you tried profiling your code yet?

Lie Ryan
  • 62,238
  • 13
  • 100
  • 144
1

There is a difference in speed. HashMaps will use hashCode to calculate the bucket based on that code, and the implementation of Integer is much simpler than that of String.

Having said that, if you are having problems with execution times, you need to do some proper measurements and profiling yourself. That's the only way to find out what the problem is with the execution time, and using Integers instead of Strings will usually only have a minimal effect on performance, meaning that your performance problem might be elsewhere.

For example, look at this post if you want to do some proper micro benchmarks. There are many other resources available for profiling etc..

Community
  • 1
  • 1
Erik Pragt
  • 13,513
  • 11
  • 58
  • 64