0

I have a CSV that contains timestamps in the following formats:

yyyy-MM-dd HH:mm:ssX

yyyy-MM-dd HH:mm:ss.SX

yyyy-MM-dd HH:mm:ss.SSX

yyyy-MM-dd HH:mm:ss.SSSX

yyyy-MM-dd HH:mm:ss.SSSSX

yyyy-MM-dd HH:mm:ss.SSSSSX

yyyy-MM-dd HH:mm:ss.SSSSSSX

How can I parse a string that could contain any one of the above formats?

The following code can parse the timestamp when 3-6 fractional-second digits are present, but fails when there are no fractional digits or fewer than 3:

String time = "2018-11-02 11:39:03.0438-04";
DateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSSSSSX");            
Date date = sdf.parse(time);
System.out.println("Date and Time: " + date.getTime());

I currently have a method that iterates from 0 to 6 and builds a pattern containing that many "S" characters. It tries each pattern inside a try/catch until one parses the string successfully. For example, the string 2018-11-02 11:39:03.0438-04 takes five attempts before it parses.

The CSV is an export of a PostgreSQL table that has columns of type TIMESTAMP WITH TIME ZONE, and the export appears to drop trailing zeros from the fractional seconds.

I'm using Java 8 and am open to any external libraries (Joda?).

Brett
  • 1
    Efficiency is a measure that can rarely be given outside of use. If you are worried about space or time use, you should benchmark. Make your code maintainable first and then benchmark if this is a critical section. This is probably off-topic, in part because there is no code, in part because asking for API recommendations is off-topic, and partly because there isn't a question about the code here. –  Nov 07 '18 at 15:14
  • This is not off-topic. It's asking for code that parses the above set of possible strings. The focus of the question is not efficiency. I've removed "most efficiently" from the question, but this will probably encourage responses no better than my current solution. – Brett Nov 07 '18 at 15:25
  • Show a [mcve] and maybe it'll be closer to being on-topic. Alternatively, if you have a working example but just want it reviewed, there is a SE for that (if what you have meets the criteria for that SE). As you can see, open ended questions of this type will lead to low-quality answers. –  Nov 07 '18 at 15:45
  • An MCVE is for questions about code that has problems. I'm asking HOW to do something. The question can be answered with absolutely no knowledge of my code. – Brett Nov 07 '18 at 15:56
  • I added a snippet from Amongalen's answer that shows parsing the timestamp when 3-6 fractional digits are present. Again, this isn't particularly important because the question is how to parse the string without knowing the number of fractional digits present. – Brett Nov 07 '18 at 15:58
  • Nevertheless, all the recommendations for this site ask you to provide a [mcve] if you have one so we have some context. This site is about code, and so we should be staring at code. For example, given you've made me think about this in the abstract I have a recommendation, but without seeing your code I'm less likely to know what it is you are exactly doing. (I'd massage the data first before trying to parse it. Just lop off the nanos to some level, and add 0ns to args until they are all the same.) –  Nov 07 '18 at 16:09
  • 1
    I recommend you avoid the `SimpleDateFormat` class. It is not only long outdated, it is also notoriously troublesome. Today we have so much better in [`java.time`, the modern Java date and time API](https://docs.oracle.com/javase/tutorial/datetime/). – Ole V.V. Nov 07 '18 at 22:29
  • There are 1,000,000,000 nanoseconds in a second, and not only 1,000,000. So shouldn't you be expecting something with nine fractional digits? – MC Emperor Nov 08 '18 at 22:03

3 Answers

3

You'd be better off using the Java Time API [1], from the package java.time.

The Date, SimpleDateFormat and Calendar classes are flawed and obsolete.

The DateTimeFormatter class provides numerous options, so you can configure all you need. Note that by using the method appendFraction, the nanos are right-padded.

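import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

String[] dateStrs = {
    "2018-11-02 11:39:03.4-04",
    "2018-11-02 11:45:22.71-04",
    "2018-11-03 14:59:17.503-04"
};

// Minimum width 0 plus decimalPoint = true makes the whole fraction,
// including its leading '.', optional, so "11:39:03-04" parses too.
DateTimeFormatter f = new DateTimeFormatterBuilder()
    .appendPattern("yyyy-MM-dd HH:mm:ss")
    .appendFraction(ChronoField.NANO_OF_SECOND, 0, 9, true)
    .appendPattern("X")
    .toFormatter();

// Single item. Note that LocalDateTime drops the parsed offset;
// use OffsetDateTime.parse(...) instead if the offset matters.
LocalDateTime date = LocalDateTime.parse("2018-11-02 11:39:03.7356562-04", f);

// Multiple items:
List<LocalDateTime> dates = Arrays.stream(dateStrs)
    .map(t -> LocalDateTime.parse(t, f))
    .collect(Collectors.toList());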

[1] Java 8's new Date and Time API is heavily influenced by Joda Time. In fact, its main author is Stephen Colebourne, who is also the author of Joda Time.
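
For reference, here is a minimal runnable sketch of the same idea that covers all seven formats from the question. The class name `FlexibleTimestamps` and its `parse` helper are made up for illustration. It assumes the fraction may be missing entirely (hence a minimum width of 0 with an optional decimal point) and that the offset should be kept (hence `OffsetDateTime` rather than `LocalDateTime`).

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

final class FlexibleTimestamps {
    // The built DateTimeFormatter is immutable, so it can be shared.
    static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
        .appendPattern("yyyy-MM-dd HH:mm:ss")
        // 0 to 9 fraction digits; the '.' is only expected when digits follow
        .appendFraction(ChronoField.NANO_OF_SECOND, 0, 9, true)
        .appendPattern("X")
        .toFormatter();

    // Hypothetical helper: parses one CSV cell, keeping the UTC offset.
    static OffsetDateTime parse(String text) {
        return OffsetDateTime.parse(text, FORMATTER);
    }

    public static void main(String[] args) {
        String[] samples = {
            "2018-11-02 11:39:03-04",
            "2018-11-02 11:39:03.4-04",
            "2018-11-02 11:39:03.0438-04",
            "2018-11-02 11:39:03.043800-04"
        };
        for (String sample : samples) {
            OffsetDateTime parsed = parse(sample);
            // toEpochMilli() gives the same kind of value as Date.getTime()
            System.out.println(sample + " -> " + parsed
                + " (epoch millis " + parsed.toInstant().toEpochMilli() + ")");
        }
    }
}
```

If the rest of the code still needs a `java.util.Date`, `Date.from(parsed.toInstant())` converts the result, at the cost of truncating to millisecond precision.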

MC Emperor
  • A note from API on DateTimeFormatterBuilder: `This class is a mutable builder intended for use from a single thread.` https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatterBuilder.html – xmar Mar 10 '20 at 16:21
0

The first 19 characters are identical.

Also, you have different lengths in the different cases. You can use a switch to test the length of the String and handle the separate cases for the different possible values.
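
A rough sketch of this length-based idea follows. The class name `LengthBasedTimestamps` is hypothetical, and a lookup table indexed by the number of fraction digits stands in for the switch. It assumes the offset is always three characters long (such as `-04`), as in the question's sample; an offset like `Z` or `-0430` would break the length arithmetic.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

final class LengthBasedTimestamps {
    // Assumption: 19 characters of date and time plus a 3-character offset.
    private static final int BASE_LENGTH = "yyyy-MM-dd HH:mm:ss".length() + 3;

    // One fixed-width formatter per possible number of fraction digits (0 to 6).
    private static final DateTimeFormatter[] BY_FRACTION_DIGITS = new DateTimeFormatter[7];
    static {
        BY_FRACTION_DIGITS[0] = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ssX");
        StringBuilder fraction = new StringBuilder();
        for (int digits = 1; digits <= 6; digits++) {
            fraction.append('S'); // in java.time, S means fraction-of-second
            BY_FRACTION_DIGITS[digits] =
                DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss." + fraction + "X");
        }
    }

    static OffsetDateTime parse(String text) {
        int extra = text.length() - BASE_LENGTH; // '.' plus fraction digits, or 0
        if (extra < 0 || extra > 7) {
            throw new IllegalArgumentException("Unexpected length: " + text);
        }
        int digits = (extra == 0) ? 0 : extra - 1; // subtract the '.'
        return OffsetDateTime.parse(text, BY_FRACTION_DIGITS[digits]);
    }

    public static void main(String[] args) {
        System.out.println(parse("2018-11-02 11:39:03-04"));
        System.out.println(parse("2018-11-02 11:39:03.0438-04"));
    }
}
```

Each string is parsed exactly once, but the code depends on the offset width staying constant, which a single formatter with an optional fraction does not.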

Lajos Arpad
0

I'm not sure but something like this seems to work for me:

String time = "2018-11-02 11:39:03.0438-04";
DateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSSSSSX");            
Date date = sdf.parse(time);
System.out.println("Date and Time: " + date.getTime());

In general, you want to use the longest format possible, with 6x S in this case.

Amongalen
  • This fails when the number of fractional digits is 0, 1, or 2. – Brett Nov 07 '18 at 15:24
  • It fails any other number of decimals than three. To `SimpleDateFormat` `S` is for millisecond, so for example hundredths of seconds (two decimals) or ten-thousandths (four decimals) will give wrong results. – Ole V.V. Nov 07 '18 at 22:28
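
The point about `S` meaning milliseconds can be checked with a small comparison. This is only a sketch: the class `MillisecondPitfall` and its two helper methods are invented for the demonstration. It parses the question's sample string with both APIs and returns the millisecond part of the resulting instant. The fraction `.0438` is 43.8 ms, but `SimpleDateFormat` reads the digits `0438` as the integer 438 and treats that as milliseconds.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Locale;

final class MillisecondPitfall {
    // Legacy API: "S" is a millisecond count, not a decimal fraction.
    static long legacyMillisPart(String text) {
        SimpleDateFormat sdf =
            new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSSSSSX", Locale.ROOT);
        try {
            return sdf.parse(text).getTime() % 1000;
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // java.time: the digits after '.' are a true fraction of a second.
    static long modernMillisPart(String text) {
        DateTimeFormatter f = new DateTimeFormatterBuilder()
            .appendPattern("yyyy-MM-dd HH:mm:ss")
            .appendFraction(ChronoField.NANO_OF_SECOND, 0, 9, true)
            .appendPattern("X")
            .toFormatter();
        return OffsetDateTime.parse(text, f).toInstant().toEpochMilli() % 1000;
    }

    public static void main(String[] args) {
        String time = "2018-11-02 11:39:03.0438-04";
        System.out.println("SimpleDateFormat millis part: " + legacyMillisPart(time));
        System.out.println("java.time millis part:        " + modernMillisPart(time));
    }
}
```

The two values differ for this input, so the `SimpleDateFormat` result is off by several hundred milliseconds even though no exception is thrown.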