I was surprised to discover that certain characters embedded in a year (e.g. $ or &) will "successfully" parse when using DateFormat.parse() with DateFormat.SHORT. For example, "08/01/20&&" will parse into "Sat Aug 01 00:00:00 EDT 2020".

I was even more surprised that I could not find any hits googling the issue.

The exercise is to parse and validate a date. We could scan the string we are parsing for special characters, but this seems inelegant.

Does anyone have any suggestions?

public static void main(String[] args) {
    String s = "08/01/20&&";
    Date value = null;
    try {
        value = getDateFormat().parse(s);
    } catch (ParseException pe) {
        System.out.println("Input must be a valid date in the form 'mm/dd/yyyy'");
    }
    System.out.println("Value:" + value);
}

public static DateFormat getDateFormat() {
    DateFormat formatDate = DateFormat.getDateInstance(DateFormat.SHORT);
    //or at least in English locale
    //formatDate = DateFormat.getDateInstance(DateFormat.SHORT,Locale.ENGLISH);
    formatDate.setLenient(false);
    return formatDate;
}
Pshemo
  • For people who can't reproduce this behaviour try with `Locale.ENGLISH` (I added alternative formatter with this locale in code example, simply switch commented section with current `formatDate`). – Pshemo Sep 08 '15 at 18:42
  • Actually, any non-digit character can be used instead of `&` and the date will still parse. – Pshemo Sep 08 '15 at 18:48
  • The suggestion is: **don't ever use `Date` and `DateFormat` again**. Instead, use classes from the `java.time` package. – MC Emperor May 25 '21 at 19:49
  • Does this answer your question? [SimpleDateFormat parse(string str) doesn't throw an exception when str = 2011/12/12aaaaaaaaa?](https://stackoverflow.com/questions/8428313/simpledateformat-parsestring-str-doesnt-throw-an-exception-when-str-2011-12) – Ole V.V. May 26 '21 at 04:30

2 Answers

The DateFormat returned by DateFormat.getDateInstance is a SimpleDateFormat.

formatDate instanceof SimpleDateFormat => true

The pattern (in Locale.US) is M/d/yy according to the toPattern() method in SimpleDateFormat.
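
To check this on your own JDK, something like the following minimal sketch should do. The class and method names (`PatternCheck`, `shortPattern`) are made up for illustration, and the pattern that comes back depends on the locale data shipped with the JDK.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Locale;

class PatternCheck {
    // Returns the pattern behind the SHORT date format for the given locale,
    // or null if the factory did not hand back a SimpleDateFormat.
    static String shortPattern(Locale locale) {
        DateFormat df = DateFormat.getDateInstance(DateFormat.SHORT, locale);
        return (df instanceof SimpleDateFormat)
                ? ((SimpleDateFormat) df).toPattern()
                : null;
    }

    public static void main(String[] args) {
        System.out.println(shortPattern(Locale.US));
    }
}
```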

It appears that the parse method will not consider trailing text that extends beyond the date pattern. The following values for s will produce Sat Aug 01 00:00:00 PDT 2020 without an exception being thrown. The 20 is interpreted to be 2020 for the format characters yy, and the trailing text appears to be ignored.

"08/01/20"
"08/01/20&&"
"08/01/20**"
"08/01/20..."
"08/01/20ABCDEFGHIJKLMNOPQRSTUVWXYZ"

The Javadocs for DateFormat.parse state:

Parses text from the beginning of the given string to produce a date. The method may not use the entire text of the given string.

It certainly isn't parsing the entire string. Also, there is nothing special about the & characters you've used, apart from the fact that they're extraneous.

You could compare the length of the input string with the longest input the pattern is expected to match, to see whether there are extraneous characters. This would work for DateFormat.SHORT, because the expected number of characters would be a maximum of 8 (for example, "12/31/20").

rgettman
  • *You could get the length of the pattern,* But `08/01/20` without any extra chars is also longer than `M/d/yy`, so I can’t see how this could help (the working solutions are in the other answer). – Ole V.V. May 26 '21 at 04:42

java.time

With the release of Java SE 8 in March 2014, the outdated and error-prone legacy date-time API (the java.util date-time types and their formatting type, SimpleDateFormat) was supplanted by java.time, the modern date-time API*. It is strongly recommended to switch to this new API.

With the modern API, you would not have faced this problem. For example:

With valid date:

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

public class Main {
    public static void main(String[] args) {
        String s = "08/01/20";
        DateTimeFormatter dtf = DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT).localizedBy(Locale.ENGLISH);
        System.out.println(LocalDate.parse(s, dtf));
    }
}

Output:

2020-08-01

With invalid date:

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

public class Main {
    public static void main(String[] args) {
        String s = "08/01/20&&";
        DateTimeFormatter dtf = DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT).localizedBy(Locale.ENGLISH);
        System.out.println(LocalDate.parse(s, dtf));
    }
}

Output:

Exception in thread "main" java.time.format.DateTimeParseException:
                    Text '08/01/20&&' could not be parsed, unparsed text found at index 8
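
For validation, that exception can be turned into a simple "valid or not" answer. The following is a minimal sketch, not a definitive implementation: `StrictDateValidator` and `tryParse` are hypothetical names, and `localizedBy` assumes Java 10 or later, as in the examples above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.FormatStyle;
import java.util.Locale;
import java.util.Optional;

class StrictDateValidator {
    // Same formatter as in the examples above.
    private static final DateTimeFormatter SHORT_ENGLISH =
            DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT).localizedBy(Locale.ENGLISH);

    // Returns the parsed date, or an empty Optional if the text does not
    // parse or has unparsed trailing text.
    static Optional<LocalDate> tryParse(String text) {
        try {
            return Optional.of(LocalDate.parse(text, SHORT_ENGLISH));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("08/01/20").isPresent());
        System.out.println(tryParse("08/01/20&&").isPresent());
    }
}
```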

What if I want the modern API to behave the way SimpleDateFormat behaves by default with respect to the following rule?

Parses text from the beginning of the given string to produce a date. The method may not use the entire text of the given string.

If you need it, DateTimeFormatter#parse(CharSequence, ParsePosition) is at your disposal:

import java.text.ParsePosition;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

public class Main {
    public static void main(String[] args) {
        String s = "08/01/20&&";
        DateTimeFormatter dtf = DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT).localizedBy(Locale.ENGLISH);
        LocalDate date = LocalDate.from(dtf.parse(s, new ParsePosition(0)));
        System.out.println(date);
    }
}

Output:

2020-08-01
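
If you also want to know how much of the input was consumed, the ParsePosition can be inspected after the call. A minimal sketch, with `PrefixParseDemo` and `parsedLength` as hypothetical names:

```java
import java.text.ParsePosition;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

class PrefixParseDemo {
    // Parses a date from the start of the text and returns the index of the
    // first character that was NOT consumed by the formatter.
    static int parsedLength(String text) {
        DateTimeFormatter dtf =
                DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT).localizedBy(Locale.ENGLISH);
        ParsePosition pos = new ParsePosition(0);
        dtf.parse(text, pos); // still throws DateTimeParseException if no date can be parsed at all
        return pos.getIndex();
    }

    public static void main(String[] args) {
        String s = "08/01/20&&";
        int end = parsedLength(s);
        System.out.println("Parsed: " + s.substring(0, end) + ", left over: " + s.substring(end));
    }
}
```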

Learn more about java.time, the modern date-time API*, from Trail: Date Time.

Just for the sake of completeness:

Here is what you could have done using the legacy API.

import java.text.DateFormat;
import java.text.ParsePosition;
import java.util.Date;
import java.util.Locale;

public class Main {
    // parse(String, ParsePosition) returns null on failure instead of throwing
    public static void main(String[] args) {
        String s = "08/01/20&&";
        ParsePosition pp = new ParsePosition(0);
        Date value = DateFormat.getDateInstance(DateFormat.SHORT, Locale.ENGLISH).parse(s, pp);
        if (value == null || pp.getIndex() != s.length()) {
            System.out.println("The input must be a valid date in the form MM/dd/yyyy");
        } else {
            System.out.println("Value: " + value);
        }
    }
}

Output:

The input must be a valid date in the form MM/dd/yyyy

ParsePosition#getIndex returns the index of the character following the last character parsed, which is the index of the first & in the string, 08/01/20&&.
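
The same check can be wrapped in a small helper if you need it in more than one place. This is only a sketch under the assumptions above (English locale, SHORT style); `WholeStringDateParser` and `parseWholeString` are hypothetical names, and the formatter is made non-lenient as in the question.

```java
import java.text.DateFormat;
import java.text.ParsePosition;
import java.util.Date;
import java.util.Locale;

class WholeStringDateParser {
    // Returns the parsed Date only if the whole string was consumed;
    // returns null for unparseable input or input with trailing text.
    static Date parseWholeString(String text) {
        DateFormat format = DateFormat.getDateInstance(DateFormat.SHORT, Locale.ENGLISH);
        format.setLenient(false); // reject out-of-range fields such as month 13
        ParsePosition pos = new ParsePosition(0);
        Date value = format.parse(text, pos);
        boolean consumedEverything = pos.getIndex() == text.length();
        return (value != null && consumedEverything) ? value : null;
    }

    public static void main(String[] args) {
        System.out.println(parseWholeString("08/01/20") != null);
        System.out.println(parseWholeString("08/01/20&&") != null);
    }
}
```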


* If for any reason you have to stick to Java 6 or Java 7, you can use ThreeTen-Backport, which backports most of the java.time functionality to Java 6 and 7. If you are working on an Android project and your Android API level is still not compliant with Java 8, check "Java 8+ APIs available through desugaring" and "How to use ThreeTenABP in Android Project".

Arvind Kumar Avinash