16

We know that the DateFormat classes are not thread-safe. I have a multi-threaded scenario where date formats need to be used. I can't really create a new instance in each thread, as SimpleDateFormat creation seems to be expensive (the constructor ends up calling "compile", which is costly). After some tests, the only two options left for me are:

  1. External synchronization - I really don't want to do this
  2. Cloning in each thread - I don't know whether there are any catches

Any suggestions?

If you have faced this before, what direction did you take?

Note: a similar question was asked before, but it was closed with a pointer to an Apache package. I can't use new libraries for this. I have also read this similar question on SO.

Community
Suraj Chandran
  • How is this different from the other SO question? – Tom Hawtin - tackline Feb 18 '11 at 06:24
  • @tom I am focusing more on the alternatives here, especially clone(). Unfortunately, none of the answers so far discuss clone(). – Suraj Chandran Feb 18 '11 at 06:54
  • Have you actually evaluated the speed of each solution? You said yourself that only a small % of requests will hit this code. Do you need to worry about this so much? Did you put some load on it? You need to do some legwork here! – sjr Feb 18 '11 at 07:03
  • @sjr I believe that the current load should not be an excuse for an inefficient program. Optimism would imply that the load might increase later. I also understand "Never over-optimize prematurely", but this problem does not fall under that rule. – Suraj Chandran Feb 18 '11 at 07:33
  • I really don't understand why one cannot add new libraries. What are your reasons? – Bozho Feb 18 '11 at 07:40
  • @sjr I actually did some testing on the two solutions. Surprisingly, clone() is slower than creating a new instance. For 10,000 iterations, the clone solution takes around 850 ms, but simply calling new takes just 400 ms. – Suraj Chandran Feb 18 '11 at 08:00
  • Note that I already knew that clone() is slower in general, but I thought the compile() method would come into play – Suraj Chandran Feb 18 '11 at 08:14
  • Is it solved? If so, please post the working example – Deepak Feb 18 '11 at 14:20
  • How is this not a case of optimizing prematurely? You said yourself that this is going to be hit rarely. If performance becomes a problem, you profile *then* you fix, not the other way around. – sjr Feb 18 '11 at 16:13
  • @Deepak I ended up creating a new instance of SimpleDateFormat for every call. Reasons: 1) After testing, I found that creating a new instance is faster than cloning. 2) I can't use ThreadLocal because I don't have control over the thread-pooling behavior; my code is not supposed to know about it. – Suraj Chandran Feb 19 '11 at 07:47
  • @Suraj: See my answer. It is almost impossible that constructing a new object will be less work than cloning, since there will always be at least as much work in creating, and often more. – Lawrence Dol Feb 20 '11 at 00:00
  • @sjr: This is not a case of premature optimization - this is a case of a frequently recurring question for which it is worth knowing the general answer and the generally best approach. I agree with Suraj - current loading is not an excuse for an inefficient program. See http://www.softwaremonkey.org/Article/CodeBloat – Lawrence Dol Feb 20 '11 at 00:01
  • @SoftwareMonkey Read this for why cloning is slower than new: http://forums.java.net/node/644247 – Suraj Chandran Feb 21 '11 at 06:34
  • @Suraj: See my answer - cloning is significantly faster, at least on modern hardware running modern JVMs. – Lawrence Dol Feb 21 '11 at 07:30
  • @Suraj: It would seem your information is 6 years old. – Lawrence Dol Feb 21 '11 at 07:48
  • For newcomers: Java 1.8 now provides `java.time.format.DateTimeFormatter`. "This class is immutable and thread-safe". Have a look at: https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html – Linuslabo Jun 01 '16 at 13:09

7 Answers

14

What if you created a class which would format dates using a fixed size pool of precreated SimpleDateFormat objects in round-robin fashion? Given that uncontested synchronization is cheap, this could synchronize on the SimpleDateFormat object, amortizing collisions across the total set.

So there might be 50 formatters, each used in turn - collision, and therefore lock contention, would occur only if 51 dates were actually formatted simultaneously.
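A minimal sketch of that round-robin idea, assuming Java 8 or later. The class name `RoundRobinFormatterPool` and its members are hypothetical names chosen for illustration; this is not the implementation published on the answer author's website.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of a fixed-size, round-robin formatter pool.
final class RoundRobinFormatterPool {
    private final SimpleDateFormat[] formatters;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinFormatterPool(String pattern, int size) {
        // Pay the construction cost once, up front.
        formatters = new SimpleDateFormat[size];
        for (int i = 0; i < size; i++) {
            formatters[i] = new SimpleDateFormat(pattern, Locale.ROOT);
        }
    }

    String format(Date date) {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), formatters.length);
        SimpleDateFormat formatter = formatters[index];
        // Lock only the chosen formatter, so callers contend only when
        // they land on the same slot at the same time.
        synchronized (formatter) {
            return formatter.format(date);
        }
    }

    public static void main(String[] args) {
        RoundRobinFormatterPool pool = new RoundRobinFormatterPool("yyyy-MM-dd HH:mm:ss", 50);
        System.out.println(pool.format(new Date()));
    }
}
```

The pool size (50 here, matching the figure above) is a tuning knob: a larger pool spreads callers over more locks at the cost of more idle formatter objects.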

EDIT 2011-02-19 (PST)

I implemented a fixed pool as suggested above; the code for it (including the test) is available on my website.

Following are the results on a Quad Core AMD Phenom II 965 BE, running in the Java 6 SE client JVM:

2011-02-19 15:28:13.039 : Threads=10, Iterations=1,000,000
2011-02-19 15:28:13.039 : Test 1:
2011-02-19 15:28:25.450 :   Sync      : 12,411 ms
2011-02-19 15:28:37.380 :   Create    : 10,862 ms
2011-02-19 15:28:42.673 :   Clone     : 4,221 ms
2011-02-19 15:28:47.842 :   Pool      : 4,097 ms
2011-02-19 15:28:48.915 : Test 2:
2011-02-19 15:29:00.099 :   Sync      : 11,184 ms
2011-02-19 15:29:11.685 :   Create    : 10,536 ms
2011-02-19 15:29:16.930 :   Clone     : 4,184 ms
2011-02-19 15:29:21.970 :   Pool      : 3,969 ms
2011-02-19 15:29:23.038 : Test 3:
2011-02-19 15:29:33.915 :   Sync      : 10,877 ms
2011-02-19 15:29:45.180 :   Create    : 10,195 ms
2011-02-19 15:29:50.320 :   Clone     : 4,067 ms
2011-02-19 15:29:55.403 :   Pool      : 4,013 ms

Notably, cloning and pooling were very close together. In repeated runs, cloning was faster than pooling about as often as it was slower. The test, of course, was deliberately designed for extreme contention.

In the specific case of the SimpleDateFormat, I think I might be tempted to just create a template and clone it on demand. In the more general case, I might be tempted to use this pool for such things.
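The template-and-clone approach could look roughly like the sketch below. `ClonedFormatter`, the pattern, and the UTC time zone are assumptions made for the example, not details taken from the test code mentioned above.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical illustration of "create a template and clone it on demand".
final class ClonedFormatter {
    // Built once. The template itself is never used to format or parse,
    // so its internal state is never mutated after construction.
    private static final SimpleDateFormat TEMPLATE = createTemplate();

    private static SimpleDateFormat createTemplate() {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f;
    }

    static String format(Date date) {
        // Each call works on its own private copy, so no locking is needed.
        SimpleDateFormat local = (SimpleDateFormat) TEMPLATE.clone();
        return local.format(date);
    }

    public static void main(String[] args) {
        System.out.println(format(new Date()));
    }
}
```

The trade-off is one short-lived object per call in exchange for no shared mutable state between threads.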

Before making a final decision one way or the other, I would want to thoroughly test on a variety of JVMs, versions and for a variety of these kinds of objects. Older JVMs, and those on small devices like handhelds and phones might have much more overhead in object creation and garbage collection. Conversely, they might have more overhead in uncontested synchronization.

FWIW, from my review of the code, it seemed that SimpleDateFormat would most likely have the most work to do in being cloned.

EDIT 2011-02-19 (PST)

Also interesting are the uncontended single-threaded results. In this case the pool performs on par with a single synchronized object. This would imply that the pool is the best alternative overall, since it delivers excellent performance both when contended and when uncontended. A little surprising is that cloning does less well when single-threaded.

2011-02-20 13:26:58.169 : Threads=1, Iterations=10,000,000
2011-02-20 13:26:58.169 : Test 1:
2011-02-20 13:27:07.193 :   Sync      : 9,024 ms
2011-02-20 13:27:40.320 :   Create    : 32,060 ms
2011-02-20 13:27:53.777 :   Clone     : 12,388 ms
2011-02-20 13:28:02.286 :   Pool      : 7,440 ms
2011-02-20 13:28:03.354 : Test 2:
2011-02-20 13:28:10.777 :   Sync      : 7,423 ms
2011-02-20 13:28:43.774 :   Create    : 31,931 ms
2011-02-20 13:28:57.244 :   Clone     : 12,400 ms
2011-02-20 13:29:05.734 :   Pool      : 7,417 ms
2011-02-20 13:29:06.802 : Test 3:
2011-02-20 13:29:14.233 :   Sync      : 7,431 ms
2011-02-20 13:29:47.117 :   Create    : 31,816 ms
2011-02-20 13:30:00.567 :   Clone     : 12,382 ms
2011-02-20 13:30:09.079 :   Pool      : 7,444 ms
Lawrence Dol
  • PS: Looking at what `clone()` has to do, including the clone of DateFormatSymbols, it looks like a not-insignificant amount of work. – Lawrence Dol Feb 18 '11 at 07:15
  • I'm confused. I don't understand why you can't just use a ThreadLocal and an ExecutorService with a pool size of 50; the ThreadLocal will then ensure that each thread the ExecutorService uses has its own SimpleDateFormat. This has the added benefit of lazy initialization of the DateFormat object, and abstracts all the thread details away from you. BTW: I have no idea what your actual implementation is, since the website is blocked by my company. You may very well have implemented it the way I suggested, but the wording of your answer leaves me unsure how you did it – searchengine27 Nov 30 '15 at 22:07
3

Since ThreadLocal is not possible in your case, you should use a pool. Get or create a new instance, use it, and put it in the pool afterwards.
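One way to sketch this "get or create, use, put back" pool, assuming Java 7 or later. The class `BorrowingFormatterPool` and its methods are hypothetical names for illustration only.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical illustration of a borrow-and-return formatter pool.
final class BorrowingFormatterPool {
    private final ConcurrentLinkedQueue<SimpleDateFormat> idle = new ConcurrentLinkedQueue<>();
    private final String pattern;

    BorrowingFormatterPool(String pattern) {
        this.pattern = pattern;
    }

    String format(Date date) {
        // Take an idle formatter if one exists, otherwise create a new one.
        SimpleDateFormat formatter = idle.poll();
        if (formatter == null) {
            formatter = new SimpleDateFormat(pattern, Locale.ROOT);
        }
        try {
            // While borrowed, the formatter is visible to this thread only.
            return formatter.format(date);
        } finally {
            // Always hand it back so later callers can reuse it.
            idle.offer(formatter);
        }
    }

    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        BorrowingFormatterPool pool = new BorrowingFormatterPool("yyyy-MM-dd");
        System.out.println(pool.format(new Date()));
    }
}
```

Note that this sketch never shrinks: the pool grows to the peak number of concurrent callers and stays there. A bounded queue would be a reasonable refinement if that matters.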

Tobias Schulte
2

Must point out that option 2: "Cloning in each thread - Don't know whether there are some catches?" is only a viable option because SimpleDateFormat and DateFormat implement a deep clone(). A shallow clone would not be any more thread-safe than just using the same instance.

The reason DateFormat and its subclasses are not thread-safe is that DateFormat keeps an internal Calendar instance as a member variable. Calls to format() or parse() rely on calling clear(), set() and get() on that Calendar instance, so any concurrent calls are likely to corrupt the internal Calendar state.

The fact that these classes implement a deep clone() is also what makes cloning relatively slow.
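A small check of the deep-clone behavior described above: if clone() is deep, the copy must hold its own Calendar rather than share the original's. The class name `DeepCloneCheck` is hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Locale;

// Hypothetical illustration: verify that a cloned formatter owns its own Calendar.
final class DeepCloneCheck {
    static boolean clonesHaveSeparateCalendars() {
        SimpleDateFormat original = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        SimpleDateFormat copy = (SimpleDateFormat) original.clone();
        // getCalendar() exposes the internal Calendar that format() and parse() mutate.
        // Distinct references mean the two formatters cannot corrupt each other's state.
        return original.getCalendar() != copy.getCalendar();
    }

    public static void main(String[] args) {
        System.out.println(clonesHaveSeparateCalendars());
    }
}
```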

gb96
2

I have used a ThreadLocal to store an instance of each date format I use.
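A minimal sketch of the ThreadLocal approach, assuming Java 8 or later for `ThreadLocal.withInitial`. The class name, pattern, and time zone are illustrative assumptions.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical illustration of one SimpleDateFormat per thread.
final class ThreadLocalFormatter {
    // The supplier runs lazily, once per thread, on that thread's first get().
    private static final ThreadLocal<SimpleDateFormat> ISO_DATE = ThreadLocal.withInitial(() -> {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f;
    });

    static String format(Date date) {
        // No locking needed: each thread only ever sees its own instance.
        return ISO_DATE.get().format(date);
    }

    public static void main(String[] args) {
        System.out.println(format(new Date()));
    }
}
```

This pays off when threads are reused, for example in a thread pool. If every request runs on a fresh thread, each call still constructs a new formatter.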

pstanton
  • @pstaton... I can't do that. I am using a thread pool, and only a small percentage of requests will actually hit this particular code. And for all I know, each of my requests may be served by a new thread, so it's again inefficient – Suraj Chandran Feb 18 '11 at 06:28
  • If only a small amount hit the request, whatever method you choose won't have much bearing on performance. – pstanton Jan 24 '13 at 08:27
  • (small percentage) != (small amount) – Suraj Chandran Jan 24 '13 at 10:26
2

As the similar question that you linked suggested, it really depends on your use case. If you are reusing your threads, then using a ThreadLocal value would be your best bet. If you are not reusing your threads, you have to weigh synchronization vs. object creation. If your synchronized tasks are long running (in the case where you need to parse many dates at once), your threads may end up waiting longer than it would take to create an object.

Personally, I have generally gone the route of using a ThreadLocal value, but that is because all of my use cases have been with thread reuse.

salexander
2

We've switched to using the thread-safe FastDateFormat.

Uriah Carpenter
0

I would have a small cache of SimpleDateFormats you can lock on. Each lock will cost you 1-2 microseconds, but you might want to compare that with how long creating a new object each time costs you.

Peter Lawrey