37

I would like to store a group of objects in a hashmap , where the key shall be a composite of two string values. is there a way to achieve this?

i can simply concatenate the two strings , but im sure there is a better way to do this.

user1203861
  • 1,177
  • 3
  • 15
  • 19

9 Answers9

57

You could have a custom object containing the two strings:

class StringKey {
    private String str1;
    private String str2;
}

Problem is, you need to determine the equality test and the hash code for two such objects.

Equality could be the match on both strings and the hashcode could be the hashcode of the concatenated members (this is debatable):

class StringKey {
    private String str1;
    private String str2;

    @Override
    public boolean equals(Object obj) {
        if(obj != null && obj instanceof StringKey) {
            StringKey s = (StringKey)obj;
            return str1.equals(s.str1) && str2.equals(s.str2);
        }
        return false;
    }

    @Override
    public int hashCode() {
        return (str1 + str2).hashCode();
    }
}
Tudor
  • 61,523
  • 12
  • 102
  • 142
  • 2
    Are there any problems due to the hashcodes for ABC,D and AB,CD being the same? Or does the equals being different resolve that? – Scott McIntyre Feb 21 '13 at 16:55
  • 1
    @smackfu: That depends. It would only be a problem if you have many such pairs of strings, because they will hash to the same slot in the table and make lookups less efficient. – Tudor Feb 21 '13 at 18:52
  • @Tudor can you think of any advantage that this solution has over the solution presented by EdgeCase which basically just concatenates the two strings separated by a tilde character? – Zak May 22 '13 at 10:09
  • 1
    @Zak: Well the only difference is that I'm using the two strings concatenated and he's using a tilde between them. His version should do a better job at spreading around pairs of strings that would otherwise give the same result when concatenated. Anyway, I was just giving an example of how to produce the hashcode, I did not aim to make it efficient. – Tudor May 22 '13 at 13:07
  • @Tudor, you don't need obj != null with instanceOf operator, Cheers! – bgs Oct 21 '13 at 17:36
  • The requirement on hashCode is that it is stable (multiple calls obtain the same value in the lifetime of the receiver), and that equal objects have the same hashCode value. Having the same hash values for unequal objects (for example, hash( {"a", "aa"} ) == hash( {"aa", "a"} ) ) impacts the efficiency of the hash tables. Operations on unequal objects that have the same hash value lose the performance benefits of hash tables. What will matter is the proportion of such objects. – Thomas Bitonti Sep 10 '18 at 15:21
  • 1
    i would also prefer defining the variables as `final`. Ensures immutability in case of strings. – unknownerror Sep 18 '19 at 12:49
18

You don't need to reinvent the wheel. Simply use the Guava's HashBasedTable<R,C,V> implementation of Table<R,C,V> interface, for your need. Here is an example

Table<String, String, Integer> table = HashBasedTable.create();

table.put("key-1", "lock-1", 50);
table.put("lock-1", "key-1", 100);

System.out.println(table.get("key-1", "lock-1")); //prints 50
System.out.println(table.get("lock-1", "key-1")); //prints 100

table.put("key-1", "lock-1", 150); //replaces 50 with 150
Zoe
  • 27,060
  • 21
  • 118
  • 148
TMtech
  • 1,076
  • 10
  • 14
11
public int hashCode() {
    return (str1 + str2).hashCode();
}

This seems to be a terrible way to generate the hashCode: Creating a new string instance every time the hash code is computed is terrible! (Even generating the string instance once and caching the result is poor practice.)

There are a lot of suggestions here:

How do I calculate a good hash code for a list of strings?

public int hashCode() {
    final int prime = 31;
    int result = 1;
    for ( String s : strings ) {
        result = result * prime + s.hashCode();
    }
    return result;
}

For a pair of strings, that becomes:

return string1.hashCode() * 31 + string2.hashCode();

That is a very basic implementation. Lots of advice through the link to suggest better tuned strategies.

Community
  • 1
  • 1
Thomas Bitonti
  • 1,179
  • 7
  • 14
  • 1
    "a new string instance every time the hash code is computed" - hahaha, well spotted! – mike rodent Dec 10 '16 at 14:22
  • does it make any difference if the hashcode is calculated by XOR instead of `*` and `+`, since the String hashCode uses most bits in the hashcode anyway? – SOFe Sep 07 '18 at 10:09
  • XOR and other bitwise operators are faster than multiplication and addition. Whether the difference is significant in the example depends on the implementation of String.hashCode(). Also, one should be extra careful to make sure the new implementation has good properties as a hash function. – Thomas Bitonti Sep 10 '18 at 15:14
8

Why not create a (say) Pair object, which contains the two strings as members, and then use this as the key ?

e.g.

public class Pair {
   private final String str1;
   private final String str2;

   // this object should be immutable to reliably perform subsequent lookups
}

Don't forget about equals() and hashCode(). See this blog entry for more on HashMaps and keys, including a background on the immutability requirements. If your key isn't immutable, then you can change its components and a subsequent lookup will fail to locate it (this is why immutable objects such as String are good candidates for a key)

You're right that concatenation isn't ideal. For some circumstances it'll work, but it's often an unreliable and fragile solution (e.g. is AB/C a different key from A/BC ?).

Brian Agnew
  • 268,207
  • 37
  • 334
  • 440
  • if we have to many entries (~77,500) it can we can found ourselves with hash collision? – lolo Aug 29 '17 at 11:38
  • Collisions will occur in proportion to the range of the hash function, the number of entries being hashed, the distribution of the values of the entries, how well behaved is the hash function, and the available space for the hash map. Without know more of these details, whether 77,500 entries will have many or few (or no) collisions can't be determined. Note that even with very good hash functions, collisions will likely occur. For a typical hash map implementation, what will matter is not whether any collisions occur, but how many in proportion to the total entries. – Thomas Bitonti Sep 10 '18 at 15:27
5

I have a similar case. All I do is concatenate the two strings separated by a tilde ( ~ ).

So when the client calls the service function to get the object from the map, it looks like this:

MyObject getMyObject(String key1, String key2) {
    String cacheKey = key1 + "~" + key2;
    return map.get(cachekey);
}

It is simple, but it works.

EdgeCase
  • 4,719
  • 16
  • 45
  • 73
  • 1
    Yes. The OP stipulated "two strings" explicitly. Composite keys of another kind may require something far more sophisticated. But this is the right answer for the use case, IMHO. However, the "join" character needs more work: the OP did not say that these strings are "only alphanumeric" for example, so they could potentially contain the tilde character. Something very exotic from the wilder planes of Unicode might do the trick otherwise: or ♄ maybe. – mike rodent Jun 01 '19 at 19:27
5

I see that many people use nested maps. That is, to map Key1 -> Key2 -> Value (I use the computer science/ aka haskell curring notation for (Key1 x Key2) -> Value mapping which has two arguments and produces a value), you first supply the first key -- this returns you a (partial) map Key2 -> Value, which you unfold in the next step.

For instance,

Map<File, Map<Integer, String>> table = new HashMap(); // maps (File, Int) -> Distance

add(k1, k2, value) {
  table2 = table1.get(k1);
  if (table2 == null) table2 = table1.add(k1, new HashMap())
  table2.add(k2, value)
}

get(k1, k2) {
  table2 = table1.get(k1);
  return table2.get(k2)
}

I am not sure that it is better or not than the plain composite key construction. You may comment on that.

Val
  • 1
  • 8
  • 40
  • 64
2

Reading about the spaguetti/cactus stack I came up with a variant which may serve for this purpose, including the possibility of mapping your keys in any order so that map.lookup("a","b") and map.lookup("b","a") returns the same element. It also works with any number of keys not just two.

I use it as a stack for experimenting with dataflow programming but here is a quick and dirty version which works as a multi key map (it should be improved: Sets instead of arrays should be used to avoid looking up duplicated ocurrences of a key)

public class MultiKeyMap <K,E> {
    class Mapping {
        E element;
        int numKeys;
        public Mapping(E element,int numKeys){
            this.element = element;
            this.numKeys = numKeys;
        }
    }
    class KeySlot{
        Mapping parent;
        public KeySlot(Mapping mapping) {
            parent = mapping;
        }
    }
    class KeySlotList extends LinkedList<KeySlot>{}
    class MultiMap extends HashMap<K,KeySlotList>{}
    class MappingTrackMap extends HashMap<Mapping,Integer>{}

    MultiMap map = new MultiMap();

    public void put(E element, K ...keys){
        Mapping mapping = new Mapping(element,keys.length);
        for(int i=0;i<keys.length;i++){
            KeySlot k = new KeySlot(mapping);
            KeySlotList l = map.get(keys[i]);
            if(l==null){
                l = new KeySlotList();
                map.put(keys[i], l);
            }
            l.add(k);
        }
    }
    public E lookup(K ...keys){
        MappingTrackMap tmp  = new MappingTrackMap();
        for(K key:keys){
            KeySlotList l = map.get(key);
            if(l==null)return null;
            for(KeySlot keySlot:l){
                Mapping parent = keySlot.parent;
                Integer count = tmp.get(parent);
                if(parent.numKeys!=keys.length)continue;
                if(count == null){
                    count = parent.numKeys-1;
                }else{
                    count--;
                }
                if(count == 0){
                    return parent.element;
                }else{
                    tmp.put(parent, count);
                }               
            }
        }
        return null;
    }
    public static void main(String[] args) {
        MultiKeyMap<String,String> m = new MultiKeyMap<String,String>();
        m.put("brazil", "yellow", "green");
        m.put("canada", "red", "white");
        m.put("USA", "red" ,"white" ,"blue");
        m.put("argentina", "white","blue");

        System.out.println(m.lookup("red","white"));  // canada
        System.out.println(m.lookup("white","red"));  // canada
        System.out.println(m.lookup("white","red","blue")); // USA
    }
}
1
public static String fakeMapKey(final String... arrayKey) {
    String[] keys = arrayKey;

    if (keys == null || keys.length == 0)
        return null;

    if (keys.length == 1)
        return keys[0];

    String key = "";
    for (int i = 0; i < keys.length; i++)
        key += "{" + i + "}" + (i == keys.length - 1 ? "" : "{" + keys.length + "}");

    keys = Arrays.copyOf(keys, keys.length + 1);

    keys[keys.length - 1] = FAKE_KEY_SEPARATOR;

    return  MessageFormat.format(key, (Object[]) keys);}
public static string FAKE_KEY_SEPARATOR = "~";

INPUT: fakeMapKey("keyPart1","keyPart2","keyPart3");
OUTPUT: keyPart1~keyPart2~keyPart3
Nelson Azevedo
  • 311
  • 3
  • 4
0

I’d like to mention two options that I don’t think were covered in the other answers. Whether they are good for your purpose you will have to decide yourself.

Map<String, Map<String, YourObject>>

You may use a map of maps, using string 1 as key in the outer map and string 2 as key in each inner map.

I do not think it’s a very nice solution syntax-wise, but it’s simple and I have seen it used in some places. It’s also supposed to be efficient in time and memory, while this shouldn’t be the main reason in 99 % of cases. What I don’t like about it is that we’ve lost the explicit information about the type of the key: it’s only inferred from the code that the effective key is two strings, it’s not clear to read.

Map<YourObject, YourObject>

This is for a special case. I have had this situation more than once, so it’s not more special than that. If your objects contain the two strings used as key and it makes sense to define object equality based on the two, then define equals and hashCode in accordance and use the object as both key and value.

One would have wished to use a Set rather than a Map in this case, but a Java HashSet doesn’t provide any method to retrieve an object form a set based on an equal object. So we do need the map.

One liability is that you need to create a new object in order to do lookup. This goes for the solutions in many of the other answers too.

Link

Jerónimo López: Composite key in HashMaps on the efficiency of the map of maps.

Ole V.V.
  • 81,772
  • 15
  • 137
  • 161