JodaTime Chronology caching

Question

I have successfully written a new Chronology that represents my company's fiscal calendar, based off of JodaTime. I referred to the JodaTime source code quite a bit, to figure out what I needed to do. One of the things I noticed in the BasicChronology class was the use of the inner class YearInfo to cache the 'firstDayOfYearMillis' - the number of milliseconds since 1970-01-01 (ISO). Figuring that, if it was enough of a performance bottleneck that JodaTime was caching it, I should probably add it to my chronology too.
When I did so, though, I made some modifications. Specifically, I moved the getYearInfo method into the YearInfo inner class, as well as making it static. I also moved the array used to store the cached values into the inner class as well. Full definition of the modified class is like this:

/**
 * Caching class for first-day-of-year millis.
 *
 */
private static final class YearInfo {

    /**
     * Cache setup for first-day-of-year milliseconds.
     */
    private static final int CACHE_SIZE = 1 << 10;
    private static final int CACHE_MASK = CACHE_SIZE - 1;
    private static transient final YearInfo[] YEAR_INFO_CACHE = new YearInfo[CACHE_SIZE];

    /**
     * Storage variables for cache.
     */
    private final int year;
    private final long firstDayMillis;
    private final boolean isLeapYear;


    /**
     * Create the stored year information.
     * 
     * @param inYear The year to store info about.
     */
    private YearInfo(final int inYear) {
        this.firstDayMillis = calculateFirstDayOfYearMillis(inYear);
        this.isLeapYear = calculateLeapYear(inYear);
        this.year = inYear;
    }

    /**
     * Get year information.
     * 
     * @param year The given year.
     * 
     * @return Year information.
     */
    private static YearInfo getYearInfo(final int year) {
        YearInfo info = YEAR_INFO_CACHE[year & CACHE_MASK];
        if (info == null || info.year != year) {
            info = new YearInfo(year);
            YEAR_INFO_CACHE[year & CACHE_MASK] = info;
        }
        return info;
    }
}

My question is... What are the performance or design implications of my changes? I've already decided that my changes should be thread-safe (given answers about final member variables). But why was the original implementation done the way it was, and not like this? I get why most of the methods that are being used effectively staticly aren't (given subclasses of BasicChronology), but I'll admit that some of my OO design stuff is a little rusty (having spent the last two years using RPG).
So... thoughts?

score 2 · Accepted Answer · edited May 23 '17 at 11:52

Regarding correctness, by switching YEAR_INFO_CACHE to static, you've introduced a minor memory leak. There are a few ways to tell if your static references matter in practice, e.g. do a back-of-the-envelope approximation of how large the cache will grow based on what you know about the data; profile the heap during/after a load test of your application; etc.

You're caching such small objects that you probably can cache a lot of them without a problem. Still, if you find that the cache needs to be bounded, then you have a few options, such as an LRU cache, a cache based on soft references instead of direct (strong) references, etc. But again, I emphasize that for your particular situation, implementing either of these might be a waste of time.

To explain the theoretical problem with static references, I'll refer to other posts, rather than reproducing them here:
1. Are static fields open for garbage collection?
2. Can using too many static variables cause a memory leak in Java?

Also regarding correctness, the code is thread safe not because references are final, but rather because the YearInfo values created by multiple threads for some cache position must be equal, so it doesn't matter which one ends up in the cache.

Regarding design, all of the YearInfo related stuff in the original Joda code is private, so the YearInfo details including caching are well encapsulated. This is a good thing.

Regarding performance, the best thing to do is profile your code and see what's using a significant amount of CPU. For profiling, you want to see whether the time spent in this code matters in the context of your entire application. Run your app under load, and check if this particular part of the code matters. If you don't see a performance problem in this code even without the YearInfo cache, then it's probably not a good use of time to work on / worry about that cache. Here is some information about how to do the check:
1. Performance profiler for a java application
2. How to find CPU-intensive class in Java?
That said, the converse is true -- if what you've got is working, then leave it as is!

score 2 · Answer 2 · answered Aug 01 '11 at 02:01

I wrote the original code that caches into YearInfo objects. Your solution to encapsulate more logic into the YearInfo class is perfectly fine and should perform just as well. I designed the YearInfo based on intent -- I wanted a crude data pair and nothing more. If Java supported structs I would have used one here.

As for the cache design itself, it was based on profiling results to see if it had any impact. In most places, Joda-Time lazily computes field values, and caching them for later did improve performance. Because this particular cache is fixed in size, it cannot leak memory. The maximum amount of memory it consumes is 1024 YearInfo objects, which is about 20k bytes.

Joda-Time is full of specialized caches like this, and all of them showed measurable performance improvement. I cannot say how effective these techniques are anymore, since they were written and tested against JDK 1.3.

JodaTime Chronology caching

2 Answers2