
I was working on code that has to call an existing API built on the Calendar API, while my own code uses only the new java.time API. I got some strange behavior in the conversion; see this example:

SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssX");

String date1 = "0000-01-01T00:00:00Z";
Calendar calendar = Calendar.getInstance();
calendar.setTime(df.parse(date1));
Instant instant = calendar.toInstant();
ZonedDateTime zonedDateTime = instant.atZone(calendar.getTimeZone().toZoneId());

System.out.println(calendar.getTime() + " " +calendar.getTimeZone().getDisplayName());
System.out.println(instant.toString());
System.out.println(zonedDateTime.toString());

String date2 = "0000-01-01T00:00:00+01:00";
calendar.setTime(df.parse(date2));
instant = calendar.toInstant();
zonedDateTime = instant.atZone(calendar.getTimeZone().toZoneId());

System.out.println(calendar.getTime() + " " +calendar.getTimeZone().getDisplayName());
System.out.println(instant.toString());
System.out.println(zonedDateTime.toString());

The output I am getting is as follows:

Thu Jan 01 01:00:00 CET 1 Central European Standard Time
-0001-12-30T00:00:00Z
-0001-12-30T00:09:21+00:09:21[Europe/Paris]
Thu Jan 01 00:00:00 CET 1 Central European Standard Time
-0001-12-29T23:00:00Z
-0001-12-29T23:09:21+00:09:21[Europe/Paris]

So the first line, from the Gregorian calendar, is correct in both cases:

  • in case 1 we get 1 Jan, 01:00 at offset +01:00
  • in case 2 we get 1 Jan, 00:00 at offset +01:00

After converting from Calendar to Instant we already see a problem with the date, because we are now suddenly:

  • on 30 Dec (48 hours earlier) in case 1
  • on 29 Dec (72 hours earlier) in case 2

I also found a small, seemingly random inaccuracy of a couple of hundred milliseconds introduced during the conversion, which you can't see here.

When we next convert from Instant to ZonedDateTime, we are now:

  • 9 minutes 21 seconds later, because the Europe/Paris time zone passed to instant.atZone() resulted in a strange offset of +00:09:21

I tested further, and in general conversion between Calendar and Instant becomes heavily unreliable for dates before year 1583, while Instant to Local/ZonedDateTime becomes unreliable for dates before 1911 because of the time zone issue.

Now, I know that hardly anybody stores or converts times for dates before 1911 (though I can still imagine such a use case), but hey! Let's see when Christopher Columbus departed to discover America:

1492-08-03
1492-08-11T23:00:00Z
1492-08-11

As a result, I've also found that getting epoch millis from the two APIs yields different results for the same ISO date in early years before 1911 (the problem seems to be in the Calendar implementation):

System.out.println(Instant.parse("1911-01-01T00:00:00Z").toEpochMilli());
calendar.setTime(df.parse("1911-01-01T00:00:00Z"));
System.out.println(calendar.getTimeInMillis());

What is the correct way to convert so that it would 'just work' (tm)?

Note: So far I think the safest way is to convert to an ISO date string first. Is there a better solution?

Jon Skeet
domaru
  • You say "just work" when you mean "work like I expect/want/hope". Calculations for historical dates aren't exactly well defined, so if you're hoping to write "correct" generic code that does *calculations* (or conversions) with years ranging from `0000` to the future, then you'll be wasting your time. You want to know when Christopher Columbus departed to discover America? It was `LocalDate.of(1492, 8, 3);`, or at least that's what they claim. [Dates are hard](https://benramsey.com/blog/2014/02/dates-are-hard/) for a very good reason, so it will *never* "just work". – Kayaman Jul 03 '20 at 10:29
  • by expect/want/hope I mean that since( for example) time in millis, is well described in one API: ...the current time as UTC milliseconds from the epoch.... and in other: ...the number of milliseconds since the epoch of 1970-01-01T00:00:00Z ... and given those descriptions, plus the definition of Gregorian calendar, I would think that value X of millis translates to exactly one date in Gregorian calendar. So natural question would be to understand which of these implementations is wrong and how to work around it. – domaru Jul 03 '20 at 11:27
  • Neither one is wrong, and both are wrong (but neither of them are correct). See the javadoc for [GregorianCalendar.toZonedDateTime](https://docs.oracle.com/javase/8/docs/api/java/util/GregorianCalendar.html#toZonedDateTime--). That's why dates (and especially calculations, conversions and so on) are hard. They're a human convention, not a universal fact. – Kayaman Jul 03 '20 at 12:04
  • There is no year 0, so `SimpleDateFormat` ought not parse a string like `0000-01-01T00:00:00Z` (I know that it does anyway). – Ole V.V. Jul 04 '20 at 07:59

1 Answer

All the calculations you have shown are fully correct in the sense that they preserve the instant/moment expressed in UTC and using proleptic Gregorian dates. However:

It is ahistorical nonsense to expect time zones to be applicable at any time before their historical introduction around the year 1900. Given this, we should not try to compare date AND time representations for historical dates, but only compare the instants.

In this context, we should go even further and limit ourselves to date precision instead of date-time precision. That is, we should talk about date conversion only. Nobody recorded historical dates with millisecond precision or better, because no such precise clocks existed.

Another technical aspect of time zones is that the old Java time zones (using java.util.TimeZone) and the new class ZoneId use different sets of rules for the years before 1911 in your examples, because the so-called LMT lines in the original tz data repository managed by IANA (the basis of Java zones) are handled differently. As far as I remember, Xueming Chen from Oracle once mentioned the year 1900 as the cutoff date for old Java zones, deciding when to apply LMT (local mean time for an arbitrarily chosen city) and when to apply normal zoned time. And some zones like Europe/Paris were introduced even later than 1900, so they still show LMT. This explains differences like 9 minutes 21 seconds. In detail, the actual rules for Paris are:

Zone    Europe/Paris    0:09:21 -   LMT 1891 Mar 16
            0:09:21 -   PMT 1911 Mar 11 # Paris Mean Time
# Shanks & Pottenger give 1940 Jun 14 0:00; go with Excoffier and Le Corre.
            0:00    France  WE%sT   1940 Jun 14 23:00
# Le Corre says Paris stuck with occupied-France time after the liberation;
# go with Shanks & Pottenger.
            1:00    C-Eur   CE%sT   1944 Aug 25
            0:00    France  WE%sT   1945 Sep 16  3:00
            1:00    France  CE%sT   1977
            1:00    EU  CE%sT

That means java.util.TimeZone applies the first non-LMT line (the one taking effect in 1911; the PMT line is ignored because its offset is unchanged) to all historical dates, because of the cutoff year 1900. ZoneId, however, uses the LMT line, as your examples clearly show.

We see that the different handling is an arbitrarily chosen technical implementation strategy for handling zoned date-times even for moments in historical times, before people had precise clocks or knew the concept of time zones at all.
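To make the difference in offsets visible, here is a minimal sketch that asks both APIs for the Europe/Paris offset at the same instant. The class name, the method names, and the sample instant in 1800 are chosen for illustration only. The java.time value follows from the LMT line quoted above; the legacy value depends on the JDK's handling of early dates, so it is only printed and not assumed.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.util.TimeZone;

public class OffsetComparison {

    /** Offset in seconds that java.time applies for the zone at the given instant. */
    static int modernOffsetSeconds(String zoneId, Instant instant) {
        return ZoneId.of(zoneId).getRules().getOffset(instant).getTotalSeconds();
    }

    /** Offset in seconds that the legacy java.util.TimeZone applies at the given instant. */
    static int legacyOffsetSeconds(String zoneId, Instant instant) {
        // TimeZone.getOffset(long) returns milliseconds
        return TimeZone.getTimeZone(zoneId).getOffset(instant.toEpochMilli()) / 1000;
    }

    public static void main(String[] args) {
        // An instant well before the LMT line ends in 1891
        Instant old = Instant.parse("1800-01-01T00:00:00Z");
        // LMT for Paris is 0:09:21, i.e. 9 * 60 + 21 = 561 seconds
        System.out.println("java.time offset: " + modernOffsetSeconds("Europe/Paris", old) + " s");
        System.out.println("legacy offset:    " + legacyOffsetSeconds("Europe/Paris", old) + " s");
    }
}
```

If the two printed values differ, the same instant is rendered with different local times by the two APIs, even though the instant itself is unchanged.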

That was about time zone stuff, now let's talk about handling of date conversion.

The new java.time package only uses proleptic Gregorian dates, even for dates when nobody knew what a Gregorian date is. Such dates were first introduced by Pope Gregory XIII in the year 1582. He cut off 10 days (so the dates from Oct 5 to Oct 14 in 1582 did not exist) and introduced a leap year rule that deviates from the rule of the old Julian calendar. The old class java.util.Calendar, in contrast, handles all dates before 1582-10-15 as Julian dates, which explains the other differences you observed. If you combine the cut of 10 days with the different leap year rules, you will see a difference of two days between the Julian and the Gregorian calendar in the year 0000.

It seems that you wish to reproduce the old behaviour for old dates with the new classes in the java.time API. Well, this is NOT possible. That is not a bug in the new API but a design decision to support proleptic Gregorian dates only and not Julian calendar dates.

Nevertheless, if you insist on seeing Julian dates before 1583 AND want to avoid the old classes like java.util.Calendar or SimpleDateFormat, then you could use my library Time4J, which has conversion methods to the java.time API. Code like this handles your Columbus example (day of departure in Spain):
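The Julian/Gregorian switch in the old API can be observed directly, as in the following sketch. It builds a GregorianCalendar in UTC twice: once with the default cutover, and once as a pure proleptic Gregorian calendar via setGregorianChange(new Date(Long.MIN_VALUE)), which is the approach described in the GregorianCalendar javadoc. The class and method names are made up for illustration, and UTC is used only to keep time zone effects out of the picture.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

public class CutoverDemo {

    /**
     * Interprets year/month/day with the old API and returns the proleptic
     * Gregorian date that java.time sees for the same instant.
     */
    static LocalDate viaCalendar(int year, int month, int day, boolean pureGregorian) {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        if (pureGregorian) {
            // Moves the cutover infinitely far into the past: no Julian dates at all
            cal.setGregorianChange(new Date(Long.MIN_VALUE));
        }
        cal.clear();                   // drop the current time-of-day fields
        cal.set(year, month - 1, day); // Calendar months are zero-based
        return cal.toInstant().atZone(ZoneOffset.UTC).toLocalDate();
    }

    public static void main(String[] args) {
        // Default cutover: 1492-08-03 is read as a Julian date
        System.out.println(viaCalendar(1492, 8, 3, false)); // 1492-08-12
        // Pure Gregorian calendar: the date label is preserved
        System.out.println(viaCalendar(1492, 8, 3, true));  // 1492-08-03
    }
}
```

With the default cutover the Columbus date shifts by nine days, which is the Julian/Gregorian difference in the 15th century. Which of the two readings is wanted depends on whether the input string is meant as a historical (Julian) date or as a proleptic Gregorian one.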

ChronoHistory history =
    ChronoHistory.ofFirstGregorianReform(); // the same behaviour like in old Java
ChronoFormatter f =
    ChronoFormatter
        .ofPattern("yyyy-MM-dd", PatternType.CLDR, Locale.ROOT, HistoricCalendar.family())
        .withDefault(HistoricCalendar.ERA, HistoricEra.AD)
        .with(history);
HistoricCalendar expected =
    HistoricCalendar.of(history, HistoricEra.AD, 1492, 8, 3);
assertThat(
    f.parse("1492-08-03"), // date of departure of Columbus to America
    is(expected));
assertThat(
    expected.transform(PlainDate.axis()), // transformation to gregorian
    is(PlainDate.of(1492, 8, 12)));
// conversion to java.time-API
LocalDate threetenAlwaysGregorian = PlainDate.of(1492, 8, 12).toTemporalAccessor();

The fact that your conversion resulted in 1492-08-11 (one day earlier) is just a time zone effect. To avoid such things, you should really limit yourself to date precision or, if that is not possible, at least convert dates at noon and not at midnight.
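The midnight-versus-noon point can be illustrated with plain java.time, as a sketch. The instant 1492-08-11T23:00:00Z is the one from the question's output (midnight at +01:00); the noon instant and the helper name dateIn are chosen here for illustration.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class NoonConversion {

    /** Calendar date of the instant as seen in the given zone or offset. */
    static LocalDate dateIn(Instant instant, ZoneId zone) {
        return instant.atZone(zone).toLocalDate();
    }

    public static void main(String[] args) {
        ZoneId paris = ZoneId.of("Europe/Paris");  // +00:09:21 (LMT) in 1492
        ZoneId plusOne = ZoneOffset.ofHours(1);    // the offset the old API assumed

        // Midnight at +01:00: the date depends on which offset is applied
        Instant midnight = Instant.parse("1492-08-11T23:00:00Z");
        System.out.println(dateIn(midnight, plusOne)); // 1492-08-12
        System.out.println(dateIn(midnight, paris));   // 1492-08-11

        // Noon at +01:00: an offset error of a few hours cannot change the date
        Instant noon = Instant.parse("1492-08-12T11:00:00Z");
        System.out.println(dateIn(noon, plusOne));     // 1492-08-12
        System.out.println(dateIn(noon, paris));       // 1492-08-12
    }
}
```

Converting at noon leaves roughly twelve hours of slack in either direction, so the resulting date stays stable even when the two APIs disagree about the historical offset.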

Meno Hochschild