I'm working on a mini project comparing two hashing algorithms, Robin Hood hashing and Double Hashing. My goal is to analyze and compare their performance in terms of lookup time, collision count, longest probe length, and average probe length. To do this, I plan to run tests at load factors ranging from 0 to 1 and observe how each algorithm behaves under different conditions. I've tried to implement counters in the contains() method using the fields totalTimeForLookup, totalCollisions, and totalProbeLength.

Here's the structure of my code using the Robin Hood hashing algorithm as an example:

package com.example;

public class RobinHood<T> {

    Object[] hashTable;
    int maxDisplacement;
    int itemCount;

    // Instrumentation fields for the lookup metrics:
    private long totalTimeForLookup;
    private long totalCollisions;
    private long totalProbeLength;

    public RobinHood() {
        hashTable = new Object[1024];
        maxDisplacement = 0;
        itemCount = 0;
        totalTimeForLookup = 0;
        totalCollisions = 0;
        totalProbeLength = 0;
    }

    public boolean insert(T x) {
        if (contains(x)) {
            return false;
        }

        int loc = hash(x);
        int displacement = 0;

        while (true) {
            if (hashTable[loc] == null) {
                hashTable[loc] = x;
                itemCount++;
                return true;
            } else if (calculateDisplacement((T) hashTable[loc], loc) >= displacement) {
                displacement++;
                loc = (loc + 1) % hashTable.length;
                maxDisplacement = Math.max(displacement, maxDisplacement);
            } else {
                T temp = x;
                x = (T) hashTable[loc];
                hashTable[loc] = temp;
                loc = (loc + 1) % hashTable.length;
                displacement = calculateDisplacement(x, loc);
                maxDisplacement = Math.max(displacement, maxDisplacement);
            }
        }
    }

    public boolean contains(T x) {
        long startTime = System.nanoTime();
        int loc = hash(x);

        for (int displacement = 0; displacement <= maxDisplacement; displacement++) {
            int index = (loc + displacement) % hashTable.length;
            if (x.equals(hashTable[index])) {
                long endTime = System.nanoTime();
                totalTimeForLookup += (endTime - startTime);
                totalProbeLength += (displacement + 1);
                if (displacement > 0) {
                    totalCollisions++;
                }
                return true;
            }
        }

        long endTime = System.nanoTime();
        totalTimeForLookup += (endTime - startTime);
        return false;
    }

    public boolean delete(T x) {
        int loc = hash(x);

        for (int displacement = 0; displacement <= maxDisplacement; displacement++) {
            int index = (loc + displacement) % hashTable.length;
            if (x.equals(hashTable[index])) {
                hashTable[index] = null;
                itemCount--;
                return true;
            }
        }
        return false;
    }

    public int getSize() {
        return itemCount;
    }

    private int calculateDisplacement(T x, int loc) {
        int h = hash(x);
        return (loc >= h) ? (loc - h) : (hashTable.length + loc - h);
    }

    private int hash(T x) {
        // Map the hash code to a table index; floorMod keeps the result
        // non-negative even when hashCode() is negative.
        return Math.floorMod(x.hashCode(), hashTable.length);
    }

    public static <T> int countDistinctElements(T[] arr) {
        RobinHood<T> distinctCounter = new RobinHood<>();

        for (T e : arr) {
            distinctCounter.insert(e);
        }

        return distinctCounter.getSize();
    }

    public double getAverageProbeLength() {
        if (itemCount == 0) {
            return 0;
        }
        return (double) totalProbeLength / itemCount;
    }

    public double getAverageLookupTime() {
        if (itemCount == 0) {
            return 0;
        }
        return (double) totalTimeForLookup / itemCount;
    }

    public long getCollisionCount() {
        return totalCollisions;
    }
}
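
The Double Hashing table follows the same structure and is instrumented the same way inside contains(). Below is a simplified sketch of what that probe loop looks like (illustrative class and field names, not my exact code; I'm only showing the instrumented lookup):

package com.example;

// Illustrative sketch only; the real Double Hashing table mirrors the metric
// fields and fixed 1024-slot array of the RobinHood class above.
public class DoubleHashing<T> {

    private final Object[] hashTable = new Object[1024];
    private long totalTimeForLookup;
    private long totalCollisions;
    private long totalProbeLength;

    public boolean contains(T x) {
        long startTime = System.nanoTime();
        int loc = Math.floorMod(x.hashCode(), hashTable.length);   // primary hash
        // Secondary hash; forcing it odd keeps it coprime with the power-of-two table size.
        int step = ((x.hashCode() >>> 16) % hashTable.length) | 1;

        for (int probe = 0; probe < hashTable.length; probe++) {
            int index = Math.floorMod(loc + probe * step, hashTable.length);
            if (hashTable[index] == null) {
                break;                                              // empty slot: x is not present
            }
            if (x.equals(hashTable[index])) {
                totalTimeForLookup += System.nanoTime() - startTime;
                totalProbeLength += probe + 1;
                if (probe > 0) {
                    totalCollisions++;                              // found outside its home slot
                }
                return true;
            }
        }

        totalTimeForLookup += System.nanoTime() - startTime;
        return false;
    }
}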

My challenge is that the average lookup time results seem inconsistent with the overall time taken by the procedure. For example, two runs give a similar total procedure time of about 42 ms, yet the calculated average lookup times differ (966.12 ms vs. 768.54 ms). I'm seeking guidance on how to get more reliable and consistent results for my performance metrics, particularly the average lookup time, collision count, and probe length.
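
For reference, each run is driven by roughly the following harness (a sketch with illustrative names, not my exact driver): it fills a fresh table to a target load factor out of the fixed 1024-slot capacity, performs a batch of lookups, and prints the accumulated metrics.

package com.example;

import java.util.Random;

// Rough sketch of the test driver (names are illustrative, not the exact code).
public class LoadFactorBenchmark {

    private static final int CAPACITY = 1024;   // matches the fixed table size above

    public static void main(String[] args) {
        double[] loadFactors = {0.1, 0.25, 0.5, 0.75, 0.9};
        Random random = new Random(42);          // fixed seed for repeatable runs

        for (double lf : loadFactors) {
            RobinHood<Integer> table = new RobinHood<>();
            int items = (int) (lf * CAPACITY);

            // Insert distinct random keys until the target load factor is reached.
            while (table.getSize() < items) {
                table.insert(random.nextInt());
            }

            // A batch of lookups, a mix of hits and misses.
            long start = System.nanoTime();
            for (int i = 0; i < 10_000; i++) {
                table.contains(random.nextInt());
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;

            System.out.printf("load=%.2f batchTime=%dms avgLookup=%.2f avgProbe=%.2f collisions=%d%n",
                    lf, elapsedMs,
                    table.getAverageLookupTime(),
                    table.getAverageProbeLength(),
                    table.getCollisionCount());
        }
    }
}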
