0

How can I use a hash-table (is it possible ?) to query for multiple parameters belonging to the same object?

Let me explain.

If I have the following array of objects

persons = [{name: "AA"}, {name: "BB"}, {name: "CA"}]

And I want to store them on a hash table, using the name as the value the

hashtable.put(persons[0]); // will compute hash("AA")
hashtable.put(persons[1]); // will compute hash("BB")
hashtable.put(persons[2]); // will compute hash("CA")

This would allow me to query my hashtable by name very fast.

My question is, Is there any implementation of a hash-table that would allow me to query for multiple parameters for more complex objects like this

persons = [{name: "AA", city: "XX"}, {name: "BB", city: "YY"}, {name: "CA", city: "ZZ"}]

For example. Look for names = "AA" and cities = "ZZ"

If hash-tables are not for this type of operations which algorithms or Data structures are the best for this type of operations?

eli.rodriguez
  • 480
  • 2
  • 9
  • 22

2 Answers2

2

In python you are able to use tuples as keys in your hashmap:

persons = [{name: "AA", city: "XX"}, {name: "BB", city: "YY"}, {name: "CA", city: "ZZ"}]

hashmap = {}
for d in persons:
    hashmap[(d["name"], d["city"])] = ...

Then you can query your hashmap like so:

hashmap[(name, city)]

In other languages you should be able to implement something similar (using groups of elements as keys). This may require implementing a custom hash, but hash map is still the correct data structure for this.

Primusa
  • 13,136
  • 3
  • 33
  • 53
  • Do you know what kind of algorithm are they using to mix both values? name and city. I guess if the hash is too complex that can undermine the efficiency of the data structure. – eli.rodriguez Jan 31 '19 at 22:28
  • Hash maps are implemented as an array. Each key is transformed into an index using a hash function, letting you map keys to values which are stored in the array. All you need to do is implement a hash function for tuples. One example would be to just add the hashes for each element and take the modulo (by length of the array) and use that as the hash of the tuple. It would help to know what language you're using. – Primusa Jan 31 '19 at 23:40
  • @eli.rodriguez without knowing your language or hash table implementation, it's impossible to say how the hashes would be combined if you left it up to your system to do. As an example of how it can be done reasonably well, see https://stackoverflow.com/q/35985960/410767 – Tony Delroy Feb 01 '19 at 01:35
  • @TonyDelroy I dig up a little more and find out there is something called "bloom filters" but I don't quite sure if that would allow me to know (based on the hash) if that hash "belong" to a "city", "city and name", "name". What I'm trying to find out if the given a "huge" set of objects and being sure that I will filter them by 3 properties (with any combination) being able to filter the set without going through the whole set. ex. the filter function in JS – eli.rodriguez Feb 01 '19 at 02:32
  • 1
    @eli.rodriguez bloom filters will not help you at here. They are good for telling you if something definitely doesn't belong in a set of elements (may have false positives due to collisions) and that is all. If every time you are going to filter by all three properties (i.e. prop1 AND prop2 AND prop3) hash tables are probably still your best bet since they're comparatively easy. If you want more complex queries such as something like (prop1 is a OR b) AND (prop2 is c) OR (prop3 is d), and to select all elements that have those properties, then databases become your best bet. – Primusa Feb 01 '19 at 02:49
  • @Primusa exactly what I wanted to know. Thanks. Just as a bonus, do you know what kind of algorithms does databases uses under de hood to achieve indexation. Again thanks for you explanation – eli.rodriguez Feb 01 '19 at 02:55
  • 1
    It depends on the database, i.e. relational vs non-relational. These algorithms are really complicated, but to point you towards that direction relational databases use B+ trees https://en.wikipedia.org/wiki/B%2B_tree. – Primusa Feb 01 '19 at 03:02
1

Hash tables require complete keys to be known. If a key consists of several fields, and even one is unknown it doesn't work.

Maybe you are looking for something like Partitioned Hash Functions:

http://mlwiki.org/index.php/Partitioned_Hash_Function_Index

Which are used in relational databases that support multidimensional indices.

  • I see the key is on the hash. not the data structure itself. you point me in the right direction. Do you know any other algorithms to "mix" several values in one hash? Are there a list of the different algorithms to mix hashes? – eli.rodriguez Jan 31 '19 at 22:30