Lists are not hashable. However, I am implementing LSH and I am looking for a hash function that maps a list of positive integers (each in [1, 29,000]) to one of k buckets. The number of lists is D = 40,000, with D > k (I think); k is not yet known (open to suggestions).


Example (D = 4, k = 2):

118 | 27 | 1002 | 225
128 | 85 | 2000 | 8700
512 | 88 | 2500 | 10000
600 | 97 | 6500 | 24000
800 | 99 | 7024 | 25874

The first column should be given as input to the hash function, which should return the number of a bucket.


What confuses me is that we are not looking for a function that hashes a single number, but one that hashes a column, i.e. a list of positive integers.

Any ideas please?

I am using if that matters

gsamaras
  • How about just converting it to hashable types, such as tuple? ( ex. hash(tuple([1, 2, 3])) ) – hunminpark May 09 '16 at 21:18
  • @hunminpark you mean something like `print hash(tuple([1,2,3,4,5]))`? That is what @lejlot suggested, but he deleted his answer.. – gsamaras May 09 '16 at 21:19
  • Just to clarify, do you mean you want to take a list and produce a single bucket index, or do you want to take a list of length `n` and produce `n` bucket indices? – mobiusklein May 09 '16 at 21:21
  • @mobiusklein the first. Input is a list of positive integers -> h -> index of bucket. – gsamaras May 09 '16 at 21:22

1 Answer

You can simply convert it to a hashable type first:

In [4]: hash(l)
TypeError: unhashable type: 'list'

In [5]: hash(tuple(l)) % k  # 29000
Out[5]: 70846
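
For reference, here is a minimal sketch of the same idea applied to the example columns from the question. The helper name `bucket_index` and the `rows` layout are assumptions made for illustration only; they are not from the original post, and `k = 2` is just the value from the small example.

```python
# Rows of the example table from the question; each COLUMN is one list to hash.
rows = [
    [118, 27, 1002, 225],
    [128, 85, 2000, 8700],
    [512, 88, 2500, 10000],
    [600, 97, 6500, 24000],
    [800, 99, 7024, 25874],
]


def bucket_index(column, k):
    """Map a list of integers to a bucket index in range(k) (hypothetical helper)."""
    # A list is unhashable, but a tuple with the same items is hashable.
    # With a positive k, Python's % always yields a value in [0, k).
    return hash(tuple(column)) % k


k = 2  # number of buckets, taken from the question's small example

# Transpose the rows so that each column becomes its own list.
columns = [list(col) for col in zip(*rows)]
buckets = [bucket_index(col, k) for col in columns]
print(buckets)
```

Equal lists always end up in the same bucket, and the result stays non-negative even when `hash` returns a negative number, because `%` with a positive right operand never yields a negative value in Python.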
B. M.