9

I'm relatively new to Java and I have a question about what type of data structure would be best for my case. I have a set of data which are essentially key-value pairs, however each value may correspond to multiple keys and each key may correspond to multiple values. A simplified example would be:

  • Red-Apple
  • Green-Apple
  • Red-Strawberry
  • Green-Grapes
  • Purple-Grapes

Considering the above example, I need to be able to return what color apples I have and/or what red fruits I have. The actual data will generated dynamically based upon an input file where each set will be anywhere from 100-100,000 values and each value may correspond to hundreds of values in the other set.

What would be the most efficient way to store and parse this data? I would prefer a solution as native to java as possible rather than something such as an external database.

This question is related, but I'm not sure how to apply the solution in my case given that I would need to assign multiple values to each key in both directions.

Community
  • 1
  • 1
user4588937
  • 101
  • 4
  • How about a map? http://docs.oracle.com/javase/7/docs/api/java/util/Map.html – Koogle Feb 20 '15 at 17:55
  • There is also this question: http://stackoverflow.com/questions/2571652/java-many-to-many-association-map – Josh Feb 20 '15 at 17:56
  • @Josh - Thanks, I did not find that question in my search. I will look through the solutions to see if I can successfully implement them for my data. – user4588937 Feb 20 '15 at 18:34
  • 2
    If you could use an external library (a jar), I recommend you use Guava's Table http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/Table.html, specifically a HashBasedTable. Guava is a set of utility classes from Google (not a database). – fps Feb 20 '15 at 18:38
  • I would create a class with a List of value a, value b objects and methods to get the values for value a and the values for value b. Probably not the most efficient, but the most straightforward. – Gilbert Le Blanc Feb 20 '15 at 19:56
  • @Magnamag - It looks like a Guava table would work. From my understanding of it, I wouldn't need to use the value element of the table. Is there a performance cost to using the table vs two MultiMaps? Obviously there are syncing issues with using two MultiMaps that the table would resolve. – user4588937 Feb 20 '15 at 20:02
  • @user I think the table accepts null values. However, now I remember that it's optimized to be accessed by row key, while access by column key requires an iteration. It still can be of use to you if you access by one key much more than by another. Otherwise, two Multimaps will work very well. – fps Feb 20 '15 at 20:10
  • I apologize. Guava Table doesn't accept null values. Just use a Boolean – fps Feb 20 '15 at 21:45
  • http://stackoverflow.com/questions/1010879/best-way-to-create-a-hashmap-of-arraylist/1011072#1011072 – Steve Kuo Feb 20 '15 at 23:44

3 Answers3

3

As you can't have duplicate keys in a Map, you can rather create a Map<Key, List<Value>>, or if you can, use Guava's Multimap.

Multimap<String, String> multimap = ArrayListMultimap.create();
multimap.put("Red", "Apple");
multimap.put("Red", "Strawberry");

System.out.println(multimap.get("Red"));  // Prints - [Apple, Strawberry]

But the problem is you can't ask for the keys of a given object, I'll keep looking and make and edit if I find something else, hope it helps.

Still, you can make the reverse yourself by iterating the map and finding the keys for the object.

Luke SpringWalker
  • 1,600
  • 3
  • 19
  • 33
  • 1
    *"But the problem is you can't ask for the keys of a given object."* Sounds like it would be solved by a BiMultiMap. Not sure if this exists, but that's what it'd be called. Or a Directed Graph indexed on node names. – David Ehrmann Feb 20 '15 at 19:30
  • Could you not wrap two of these within a custom object to achieve this – Richard Tingle Feb 20 '15 at 19:59
1

I suggest you use Guava's Table structure. Use color as your row keys and fruit as your column key or the other way round. Specifically, HashBasedTable is well suited for your case.

As per your use case, you wouldn't need to store anything for the values. However, these Tables don't allow null values. You could use a dummy Boolean or any other statistical useful value, i.e. date and time of insertion, user, number of color/fruit pairs, etc.

Table has the methods you need, such as column() and row(). Bear in mind that the docs say that these structures are optimized for row access. This might be OK for you if you plan to access by one key more than by the other.

fps
  • 33,623
  • 8
  • 55
  • 110
0

You can create your own custom data structure

public class MultiValueHashMap<K, V> {
     private HashMap<K, ArrayList<V>> multivalueHashMap = new HashMap<K, ArrayList<V>>();

    public static void main(String[] args) {
        MultiValueHashMap<String, String> multivaluemap = new MultiValueHashMap<String, String>();
        multivaluemap.put("Red", "Apple");
        multivaluemap.put("Green", "Apple");
        multivaluemap.put("Red", "Strawberry");
        multivaluemap.put("Green", "Grapes");
        multivaluemap.put("Purple", "Grapes");

        for(String k : multivaluemap.keySet()){
            System.out.println(k + " : " + multivaluemap.get(k).toString());
        }
    }

    public void put(K key, V value){
        if (multivalueHashMap.containsKey(key)){
            ArrayList<V> values = multivalueHashMap.get(key);
            values.add(value);
        }else{
            ArrayList<V> values  = new ArrayList<V>();
            values.add(value);
            multivalueHashMap.put(key, values);
        }
    }

    public Set<K> keySet(){
        return multivalueHashMap.keySet();
    }

    public ArrayList<V> get(K key){
        return multivalueHashMap.get(key);
    }
}

The output should be

Red : [Apple, Strawberry]

Purple : [Grapes]

Green : [Apple, Grapes]

Anwuna
  • 1,077
  • 11
  • 18