4

We're facing significant performance issues with TimeZone::getTimeZone(String) being a complete bottleneck. It is taking a lock on the class itself (since the method is static), and currently almost all execution threads are waiting to acquire this lock.

I came up with the following solution. It is proven to be a great boost to performance.

private static final Object TIME_ZONE_CACHE_LOCK = new Object();
private static volatile Map<String, TimeZone> ourCachedTimeZones = new HashMap<>();

public static TimeZone getTimeZoneFor(String timeZoneId)
{
    TimeZone timeZone = ourCachedTimeZones.get(timeZoneId);

    if (timeZone == null)
    {
        TimeZone newTimeZone = TimeZone.getTimeZone(timeZoneId);

        synchronized (TIME_ZONE_CACHE_LOCK)
        {
            timeZone = ourCachedTimeZones.get(timeZoneId);

            if (timeZone == null)
            {
                timeZone = newTimeZone;
                Map<String, TimeZone> cachedTimeZones = new HashMap<>(ourCachedTimeZones);
                cachedTimeZones.put(timeZoneId, timeZone);
                ourCachedTimeZones = cachedTimeZones;
            }
        }
    }

    // Clone is needed since TimeZone is not thread-safe
    return (TimeZone) timeZone.clone();
}

The question I have: Does anyone know whether or not it is safe to cache TimeZone instances outside the TimeZone class? Meaning that TimeZone/ZoneInfo/ZoneInfoFile does some magic to internally update its cache so that the application cache I have here is not coherent with the one inside TimeZone.

And before someone suggests - it is not an option to upgrade to JDK 8 Date/time API, nor Joda time.

And before someone complains :-) - I know the double-check idiom is usually not recommended.

Cowboy
  • 515
  • 1
  • 6
  • 13
  • Aside from anything else - you're using `HashMap` (unsafe for multiple threads) outside a lock. Don't do that. When I originally wrote the comment, I hadn't noticed that you create a new map each time, but even so this is still "clever" code which may well be very subtly wrong. Why not use a map designed for concurrent access? Even if it occasionally means duplicate work, it would make the code simpler *and* safer. – Jon Skeet Jun 02 '15 at 12:28
  • It is only concurrent access of reads - any changes are made by an atomic set operation of a volatile field. – Cowboy Jun 02 '15 at 12:30
  • 1
    @Cowboy the volatile thing is the reference to the map, not the internal variables of the map. – Andy Turner Jun 02 '15 at 12:31
  • (As an aside, Guava has plenty of support for caching. It sounds like you should look at that.) – Jon Skeet Jun 02 '15 at 12:31
  • I know what volatile does, but changing on the map referenced by the volatile is never done. A new map is created with old values and then atomically assigned to the volatile field. – Cowboy Jun 02 '15 at 12:33
  • Your code looks fine but would probably be more efficient with a ConcurrentHashMap. See for example: http://stackoverflow.com/questions/11126866/thread-safe-multitons-in-java - If you are on Java 8 the code can be reduced to a couple of lines (otherwise it's a bit more involved, but not much more than your current solution). – assylias Jun 02 '15 at 12:41
  • @assylias Thanks. Not sure why ConcurrentHashMap would be more efficient than HashMap, though :-) – Cowboy Jun 02 '15 at 12:46
  • @Cowboy It is maybe worth testing if performance is important. You will save the volatile read + one lock acquisition + the map copy when creating a new timezone but there will be an additional overhead when reading so not sure which wins. If you only ever access a handful of timezones your current option is possibly faster. – assylias Jun 02 '15 at 12:52

2 Answers2

2

The TimeZones are fixed once the JVM has loaded them from a file. Check the Timezone Updater Utility from Oracle: http://www.oracle.com/technetwork/java/javase/tzupdater-readme-136440.html

You need to bounce the JVM to have the update applied.

Note that a Cache would be a cache for your purposes. Some other libraries still might (re) load the TimeZone using the slow path.

But here comes the big caveat: You will not gain much. The HotSpot implementation will already cache:

http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/sun/util/calendar/ZoneInfoFile.java/#125

The call stack is:

  • ZoneInfoFile::getZoneInfo0(String)
  • ZoneInfoFile::getZoneInfo(String)
  • ZoneInfo::getTimeZone(String)
  • TimeZone::getTimeZone(String)

Because the returned Zone is mutable, it is a defensive copy.

Fabian Lange
  • 1,786
  • 1
  • 14
  • 18
  • Thanks I interpret you that it is perfectly fine to cache the instances like I do, i.e. there will not be a mismatch between TimeZone caching (as you correctly states) after the TimeZone have been loaded by the VM. And regarding the big caveat - I will gain plenty. I have already verified the performance increase. – Cowboy Jun 02 '15 at 13:52
  • @Cowboy i would be interested in the profiles, I really wonder what part of the code you skip now is responsible for your performance drop. Which JDK do you exactly use? – Fabian Lange Jun 03 '15 at 06:14
  • I'm using JDK7. The problem is the static synchronized when fetching a time zone (actually, it is GMT+/-XXXX timezone offsets in this case). The testing was done in a real scenario where throughput went from 4k /s to 18k /s. Yup, it sounds incredible but this is the outcome. – Cowboy Jun 03 '15 at 07:01
0

This should work. Few suggestions though:

  1. You can use a concurrent hashmap
  2. Or you could use Guava (https://code.google.com/p/guava-libraries/wiki/CachesExplained)
  3. I've tried this before and I had used a TTL of 1 day for the cache entries. Everyday the cache gets refreshed (lazy loading).
Farhad
  • 388
  • 2
  • 13