2

Using Java, I have a Map that contains some items. I want to clear all the data in it so that I can reuse it. Which approach is more efficient?

params.clear() 

or

params = new HashMap();
Eric Leschinski
Vikas Gupta
  • http://stackoverflow.com/questions/6757868/map-clear-vs-new-map-which-one-will-be-better – Maroun Jan 30 '14 at 10:57
  • `clear()` will reuse more objects (at least 1 depending on the implementation). How the runtimes of `clear()` and running the constructor of `HashMap` compare is implementation dependent. I would guess `clear()` runs faster. – Philipp Matthias Schäfer Jan 30 '14 at 10:58
  • Maybe you should write a test and evaluate the performance? – Saket Jan 30 '14 at 10:59

2 Answers

4

I would prefer clear() because it lets you keep the Map as a final member.

import java.util.HashMap;
import java.util.Map;

class Foo {
    private final Map<String, String> map = new HashMap<String, String>();

    void add(String string) {
        map.put(string, "a value");
    }

    void clear() {
        map.clear();
    }
}

If you assign a new Map every time, you can run into multithreading issues.


Below is an almost thread-safe example that uses a Map wrapped in Collections.synchronizedMap, but it assigns a new map every time you clear it.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class MapPrinter {

    private static Map<String, String> createNewMap() {
        return Collections.synchronizedMap(new HashMap<String, String>());
    }

    private Map<String, String> map = createNewMap();

    void add(String key, String value) {
        // put is atomic due to synchronizedMap
        map.put(key, value);
    }

    void printKeys() {
        // to iterate, we need to synchronize on the map
        synchronized (map) {
            // iterate over keySet(), not values(), since we print keys
            for (String key : map.keySet()) {
                System.out.println("Key:" + key);
            }
        }
    }

    void clear() {
        // hmmm.. this does not look right
        synchronized(map) {
            map = createNewMap();
        }
    }
}

The clear method causes a serious problem: synchronized(map) no longer works as intended, because the object that map refers to can change. Two threads can then be inside those synchronized blocks simultaneously, since they are not locking the same object. To make this actually thread-safe, we would either have to synchronize completely externally (which would make .synchronizedMap useless), or we could simply make the field final and use Map.clear().

void clear() {
    // atomic via synchronizedMap
    map.clear();
}
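
Putting the pieces together, a minimal sketch of the final-field variant might look like the following. The class name SafeMapPrinter and the snapshotKeys helper are hypothetical and exist only for illustration; the rest follows the example above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical name: a sketch of the final-field variant, not the original class.
class SafeMapPrinter {

    // final: the reference never changes, so every thread locks the same object
    private final Map<String, String> map =
            Collections.synchronizedMap(new HashMap<String, String>());

    void add(String key, String value) {
        // single call, synchronized by the wrapper
        map.put(key, value);
    }

    // Hypothetical helper: copy the keys while holding the map's lock
    List<String> snapshotKeys() {
        synchronized (map) {
            return new ArrayList<String>(map.keySet());
        }
    }

    void printKeys() {
        // iterate over the private copy, so no lock is held while printing
        for (String key : snapshotKeys()) {
            System.out.println("Key:" + key);
        }
    }

    void clear() {
        // single call, synchronized by the wrapper
        map.clear();
    }
}
```

Because the field is final, synchronized (map) always refers to the same lock that the synchronizedMap wrapper uses internally, so add, clear and the key snapshot cannot interleave.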

Other advantages of a final Map (or anything final)

  • No extra logic to check for null or to create a new map. The code you would otherwise have to write to replace the map can add up to quite a lot.
  • No risk of accidentally forgetting to assign a Map.
  • "Effective Java #13: Favor Immutability" - while the map is mutable, our reference is not.
zapl
  • What kind of issues are you talking about? I can't see how this is better than creating a new `Map` in that respect. – Keppil Jan 30 '14 at 10:59
  • Any thread is guaranteed to see a `final` variable. Everything non-final may need `volatile` or other synchronization methods. Also, `synchronized(map)` would fail if you create a new map because threads could synchronize on different objects. – zapl Jan 30 '14 at 11:00
  • @zapl: You have defined your map as final, so you can only clear it. And what are those multithreading issues? – Abimaran Kugathasan Jan 30 '14 at 11:00
  • I suppose this is a better argument than performance (which I suspect is not a big issue here). And I think that `.clear()` makes the programmer's intention to discard some previously known values clearer than creating a new map does. – Gyro Gearless Jan 30 '14 at 11:01
  • @zapl While that is a good argument, the example you propose is not thread safe either (for example, `map.clear()` is not atomic and may actually throw an exception if `add` is called concurrently)... And that does not really answer the question of "which is more efficient". – assylias Jan 30 '14 at 11:01
  • @assylias True, the example was not intended to show that. I'll add one. – zapl Jan 30 '14 at 11:06
  • @zapl What I'm saying is that HashMap is not thread safe anyway, so whether you clear it or use `new HashMap` doesn't make a difference from a thread safety perspective: you will have to add synchronization in both cases. – assylias Jan 30 '14 at 11:07
2

In general: if you don't know how clear() is implemented, you can't guess which one would perform better. I can come up with synthetic use cases where one or the other would definitely win. Unless your map holds millions and millions of records, you can go either way; the difference in performance will be negligible.

Specifically: HashMap clears by wiping the content of its inner array, which makes the old map content available for GC immediately. Creating a new HashMap also makes the old map content available for GC, plus the old HashMap object itself. You are trading a few CPU cycles for slightly less memory to GC.
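
If you want to measure this for your own workload, a rough timing sketch could look like the one below. The class name ClearVsNewTiming, its methods, and the round and entry counts are all hypothetical, and a naive System.nanoTime loop like this is easily distorted by JIT compilation and GC, so treat the numbers as a hint at best; a proper harness such as JMH would give more trustworthy results.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical harness: a rough sketch, not a rigorous benchmark.
class ClearVsNewTiming {

    // Fill the map with n entries.
    static void fill(Map<Integer, Integer> map, int n) {
        for (int i = 0; i < n; i++) {
            map.put(i, i);
        }
    }

    // Reuse one map and empty it with clear() after each round.
    static long timeClear(int rounds, int n) {
        Map<Integer, Integer> map = new HashMap<Integer, Integer>();
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            fill(map, n);
            map.clear();
        }
        return System.nanoTime() - start;
    }

    // Replace the map with a fresh HashMap after each round.
    static long timeNew(int rounds, int n) {
        Map<Integer, Integer> map = new HashMap<Integer, Integer>();
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            fill(map, n);
            map = new HashMap<Integer, Integer>();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Warm-up runs so both paths have been exercised before measuring
        timeClear(50, 10_000);
        timeNew(50, 10_000);
        System.out.println("clear(): " + timeClear(200, 10_000) + " ns");
        System.out.println("new:     " + timeNew(200, 10_000) + " ns");
    }
}
```

Note that the timed loops include refilling the map. A cleared HashMap keeps its already-grown inner array, while a fresh one has to grow again as it is refilled, so the measured difference depends heavily on how the map is used after it is emptied.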

You also need to consider other issues:

  • Do you pass this reference to some other code or component? You might want to use clear() so that the other code sees your changes. The reverse is also true: if the other code should not see them, create a new map.
  • Do you want no-hassle, no side-effect new map? I'd go with creating a new one.
  • etc
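
The first point can be illustrated with a small sketch. The class SharedReferenceDemo and its method names are hypothetical; the variable other stands in for whatever code or component received the reference.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class illustrating the shared-reference point above.
class SharedReferenceDemo {

    // Size seen through a second reference after clear().
    static int sizeSeenAfterClear() {
        Map<String, String> params = new HashMap<String, String>();
        Map<String, String> other = params; // reference handed to other code
        params.put("k", "v");
        params.clear(); // empties the one shared object
        return other.size();
    }

    // Size seen through a second reference after assigning a new map.
    static int sizeSeenAfterNew() {
        Map<String, String> params = new HashMap<String, String>();
        Map<String, String> other = params; // reference handed to other code
        params.put("k", "v");
        params = new HashMap<String, String>(); // only rebinds the local variable
        return other.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeSeenAfterClear()); // prints 0
        System.out.println(sizeSeenAfterNew());   // prints 1
    }
}
```

With clear(), the other holder sees an empty map; with a new HashMap, it keeps the old entries and never sees the replacement.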
Igor Katkov