11

Is there any HashSet implementation in Python? I know HashTable can be represented using dictionaries, but how do we represent HashSet implementation.

I am NOT looking for a data structure with the same methods as HashSets but rather someone with a CONSTANT lookup time, or the order of O(1);

Also, I want to know if the lookup time in a Python Dictionary is constant aka O(1)

Ayush Gupta
  • 8,716
  • 8
  • 59
  • 92
  • Are you looking for `set()`? And for complexity, read this: https://wiki.python.org/moin/TimeComplexity – Daniel Roseman Jan 03 '18 at 08:02
  • `set` is HashSets and for your "also", yes. – Sraw Jan 03 '18 at 08:02
  • does `set()` have a `has` method to check whether a given value exists in the set? I saw `x in s`in the docs, but I'm not sure if it can be used to in `if` or only in a `for`. I'm new to python so haven't really used `set()` before – Ayush Gupta Jan 03 '18 at 08:06
  • Both are constants, and yeah, `set` ~ `HashSet` and `dict` is `HashTable` – Savir Jan 03 '18 at 08:06
  • *I'm not sure if it can be used to in if*. Yes, and it's actually `if x in set()` is the Pythonic (the "recommended") way of testing belonging (`if x in dict()` checks if `x` is among the dictionary `keys`). Take a look to https://docs.python.org/3/reference/datamodel.html#object.__hash__ (there's a lot of interesting information and links to other juicy parts of the documentation in that page) – Savir Jan 03 '18 at 08:21

3 Answers3

5

I think the HashSet implementation you are looking is set(). This answer may help you: What's the difference between HashSet and Set?

And yes, the average time complexity for python dictionary O(1). You may read on why we use the term: "Average time complexity": Time complexity of accessing a Python dict

Volod
  • 1,283
  • 2
  • 15
  • 34
Dimitri_Fu
  • 66
  • 1
  • 7
3

I guess this is what you want. You may define a hash function yourself, like what I did in HashSet or just use the built-in hash() function in Python.

class HashSet:
    CONST = 2 ** 61 - 1

    def __init__(self, size = 10_000):
        self.size = size * 2
        self.contents = [None] * self.size

    def hash(self, x):
        return x % CONST

    def put(self, key):
        idx = self.hash(key) % self.size
        arr = self.contents[idx]
        if arr is None:
            self.contents[idx] = [key]
        elif key not in arr:
            arr.append(key)
        return None

    def get(self, key):
        idx = self.hash(key) % self.size
        arr = self.contents[idx]
        if arr is None or key not in arr:
            return False
        return True


myset = HashSet()
myset.put(123)
myset.put(145)
myset.put(138)
res = myset.get(145)
print(res)
res = myset.get(10)
print(res)


class HashMap:
    def __init__(self, size = 10_000):
        self.size = size * 2
        self.contents = [None] * self.size

    class __Pair:
        def __init__(self, key, value):
            self.key = key
            self.value = value

    def find(self, arr, key):
        for pair in arr:
            if pair.key == key:
                return pair
        return None

    def put(self, key, value):
        idx = hash(key) % self.size
        pair = self.__Pair(key, value)
        arr = self.contents[idx]
        if arr is None:
            self.contents[idx] = [pair,]
            return None

        t = self.find(arr, key)
        if t != None:
            t.value = value
        else:
            arr.append(pair)

    def get(self, key):
        idx = hash(key) % self.size
        arr = self.contents[idx]
        if arr == None:
            raise KeyError(f'{key} is not a valid key')
        t = self.find(arr, key)
        if t == None:
            raise KeyError(f'{key} is not a valid key')
        return t.value


mymap = HashMap()
mymap.put('abc', [123,456])
mymap.put('def', [456,789])
res = mymap.get('abc')
print(res)
res = mymap.get('def')
print(res)
res = mymap.get('defx')
print(res)
Nick Jones
  • 41
  • 1
-3

@Ayush Gupta, I have implemented the HashSet. Please do have a look at it. Comment for any feedback.

class MyHashSet:
def __init__(self):
    self.l = []

def add(self, key: int) -> None:
    if key not in self.l:
        self.l.append(key)

def remove(self, key: int) -> None:
    if key in self.l:
        self.l.remove(key)

def contains(self, key: int) -> bool:
    return key in self.l
    
# Your MyHashSet object will be instantiated and called as such:
obj = MyHashSet()
obj.add(key)
obj.remove(key)
param_3 = obj.contains(key)