4

I have a HashMap that needs to be copied ~100 000 times and the copies will be expanded individually. Since 100 000 copies are a lot (and this is not the only time this happens in my code) this is currently a major bottleneck in my implementation (in fact, it happens so often that it takes up 45% of the runtime, and there's unfortunately no way to limit that number), so I'm looking for the most efficient way to do this.

I found the following options to create a shallow copy of the HashMap original:

//1
 HashMap<T> map = (HashMap<T>) original.clone()

and

//2
HashMap<T> map = new HashMap<T>();
map.putAll(original);

and

//3
HashMap<T> map = new HashMap<T>(original);

In your experience, what is the most efficient way to copy a HashMap? Are there options that I missed (other than iteration through the original, but I guess that isn't really an option)?

Saftkeks
  • 181
  • 2
  • 15
  • try the most to avoid to use a clone method... – ΦXocę 웃 Пepeúpa ツ Apr 18 '17 at 18:15
  • 4
    Can I just ask why you would need 100,000 copies of a HashMap? I can't dream up a scenario where that was a requirement – ControlAltDel Apr 18 '17 at 18:18
  • @ControlAltDel It's used in a simulated evolution of pathogens, where I need the pathogens to replicate and track the most current mutations at each position in the genome for each replicate. The actual question is why would I use Java? ;D – Saftkeks Apr 18 '17 at 18:22
  • @ΦXocę웃Пepeúpaツ In fact, as far as I can see it, clone() is more efficient than option 3, for whatever reason. – Saftkeks Apr 18 '17 at 18:22
  • @Saftkeks you wouldn't want a shallow copy then right? Why aren't u deep copying? – brso05 Apr 18 '17 at 18:22
  • If you are able to measure the performance of your algorithm, can you not do it for the three different implementations? – Koekje Apr 18 '17 at 18:23
  • Possible duplicate of http://stackoverflow.com/questions/10079266/copying-a-java-hashmap – Devendra Lattu Apr 18 '17 at 18:24
  • 2
    [Flyweight pattern](https://en.wikipedia.org/wiki/Flyweight_pattern) might be helpful in your case. – Pavlo Viazovskyy Apr 18 '17 at 18:26
  • @brso05 The objects within the map may be the same, replicating all the objects in the map would cause a huge memory waste. I just need maps with the same objects that I can add other objects to individually without affecting the other maps. – Saftkeks Apr 18 '17 at 18:27
  • @Saftkeks ah ok... – brso05 Apr 18 '17 at 18:28
  • 2
    Is it certain that your copies will be modified? If a substantial proportion in fact go unmodified, then you might benefit from deferring copying until you actually need it. There are various ways you could achieve that, but the ones that immediately occur to me all involve creating a holder or wrapper class around your maps. – John Bollinger Apr 18 '17 at 18:35
  • @JohnBollinger That was actually a quite helpful thought - and easier to implement than you might initially think, so thanks for that! – Saftkeks Apr 18 '17 at 18:45
  • 1
    I'd tend to find a library for persistent data structures which support this sort of operation efficiently. – Louis Wasserman Apr 18 '17 at 19:30

4 Answers4

3

Consider whether you really need copies.

You say that "I just need maps with the same objects that I can add other objects to individually without affecting the other maps". With this in mind, you could create a composite implementation of Map:

class MyCompositeMap<K, V> implements Map<K, V> {
  final Map<K, V> mapThatYouAddThingsTo;
  final Map<K, V> mapThatIsShared;
}

Now, you can implement your methods. For example:

  • Your containsKey method can first check mapThatYouAddThingsTo to see if the key is present there; if so, it returns the value from mapThatYouAddThingsTo. Otherwise, it checks mapThatIsShared.
  • The put method only ever puts things into mapThatYouAddThingsTo, never into mapThatIsShared.

There are some tricky aspects to the implementation (like deduplicating the keys and values in keySet() and entrySet()), but provided that mapThatYouAddThingsTo is much smaller than mapThatIsShared, you will get away with using a lot less memory.

Andy Turner
  • 137,514
  • 11
  • 162
  • 243
1

1 - it is worst. 2 and 3 are almost the same. You are using Map and it is also considered a collection. And why is the clone bad practice you can read here: Why people are so afraid of using clone() (on collection and JDK classes)?

I would choose this:

HashMap<T> map = new HashMap<T>(original);

, because when an API is giving you the ability to write it more elegant - usually the api is taking care of the other things behind the scene in the most appropriate way.

Community
  • 1
  • 1
strash
  • 1,291
  • 2
  • 15
  • 29
0

This is an old question but I think there is something else to mention.

If you just want to create a shallow copy of the map then the option number 3 is the most recommended.

However, if you need to make a copy of a map defined as HashMap<Integer, List<Item>> but you want the original map to stay the same while you change something in the copy. i.e if you remove something from the List in the copy the List in the original should maintain the value.

I have two solutions for this a deep copy function. Right now Java 8 does not provide a native implementation. We can use Guava or Apache Commons Lang. But we can find a work around creating a method to create new instances using foreach method or Stream.collect() method. The former is simple we use a foreach to create a new Instance of the object we want to copy in this case a List<T> Check the generic function here:

public static <T> HashMap<Integer, List<T>> deepCopy(HashMap<Integer, List<T>> original)
{
    HashMap<Integer, List<T>> copy = new HashMap<>();
    for (Map.Entry<Integer, List<T>> entry : original.entrySet()) {
        copy.put(entry.getKey(), new ArrayList<>(entry.getValue()));
    }

    return copy;
}

If you don't want to deal with generics then we will use Stream.collect(). In this case we use the stream to extract the data and we wrapped as a map and create a new instance

public static <T> Map<Integer, List<T>> deepCopyStream(Map<Integer, List<T>> original)
{
    return original
            .entrySet()
            .stream()
            .collect(Collectors.toMap(Map.Entry::getKey, valueMapper -> new ArrayList<>(valueMapper.getValue())));
}

Note

Please, notice that I didn't use <K,V> for the generics because this is not a proper deep copy method, that will work with nested clones of each level. This approach is based on the idea that we have a HashMap<Integer, List<Item>> where the Item class does not contains attributes that requires cloning.

Community
  • 1
  • 1
Teocci
  • 7,189
  • 1
  • 50
  • 48
-1

You need to loop over the items. Easiest way is a Stream. I made the key for the map a String and made a "Pojo" class for your "T"...

public void testMapCopy() {

    // build the orig map
    Map<String, Pojo> orig = new HashMap();
    for (int i = 0; i < 10; i++) {
        orig.put("k" + i, new Pojo("v"+i));
    }

    // make a copy
    Map<String, Pojo> mapCopy = orig.entrySet().stream()
            .collect(Collectors.toMap(e -> e.getKey(), new Pojo(e.getValue().getValue())));

    // change orig
    Pojo pojo = orig.get("k0");
    pojo.setValue("v0-updated!"); 

    // check values
    System.out.println("orig k0: " + orig.get("k0").getValue());
    System.out.println("copy k0: " + mapCopy.get("k0").getValue());
}

Simple class to represent your "T"

private class Pojo {

    private String value;

    public Pojo(String value) {
        this.value = value;
    }

    public void setValue(String value) {
        this.value = value;
    }

    public String getValue() {
        return value;
    }

}
Matt
  • 363
  • 3
  • 11
  • 3
    Why would you do this with streams, rather than just using `Map mapCopy = new HashSet<>(orig);`? And why would you define a wrapper for a `String`? – Andy Turner Apr 18 '17 at 18:45
  • ah, thanks. In responding to you I noticed I did not make the copy of the value. I think he wanted the value objects to not be the same. updated my response to have new value for map value and also update the pojo in orig. – Matt Apr 18 '17 at 18:52