I have gone through this thread, but it did not solve my problem. I am using the following line:

String[] result = s.split("\\",-1);

in my Date class, and calling it like this:

Date date1 = new Date("20\01\2012");

But it does not work. I get an exception:

Exception in thread "main" java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
\
 ^
    at java.util.regex.Pattern.error(Pattern.java:1924)
    at java.util.regex.Pattern.compile(Pattern.java:1671)
    at java.util.regex.Pattern.<init>(Pattern.java:1337)
    at java.util.regex.Pattern.compile(Pattern.java:1022)
    at java.lang.String.split(String.java:2313)
    at Date.<init>(Date.java:84)
    at Date.main(Date.java:279)
user3265048
  • **Always** use `SimpleDateFormat` to parse dates. Don't try to implement the logic yourself; it's far more difficult than you might think. – Boris the Spider Feb 06 '14 at 23:58
  • possible duplicate of [String.replaceAll with Backslashes error](http://stackoverflow.com/questions/1701839/string-replaceall-with-backslashes-error) – chrylis -cautiouslyoptimistic- Feb 07 '14 at 00:01
  • @BoristheSpider Always? Even when the homework problem is "implement your own date parser without using SimpleDateFormat or anything similar"? – Rainbolt Feb 07 '14 at 00:07
  • @John, in that case, find the person (if they can be called that) who gave you this task and beat them senseless with a printout of the over 2000 lines of code in `SimpleDateFormat`. – Boris the Spider Feb 07 '14 at 00:14
  • Implementing Date is easy... until you have to deal with Julian-Gregorian calendar jumps in different countries, different orderings of day/month/year, and the names of the dates in different countries. And this is just the date, not yet talking about the time (with all the quirks of time zone shifts and DST). – nhahtdh Feb 07 '14 at 04:49

2 Answers

I presume this is your own `Date` class, not `java.util.Date`. You will want to write `new Date("20\\01\\2012")` if you want the argument to contain backslashes. You will also have to write `"\\\\"` as the argument to `split`, in order to get a regular expression that matches a single backslash.

The reason is that `\` is a special character in a regular expression, so you must escape it with another backslash. So the regular expression you want is actually `\\`. But to enter this in your Java code, you must escape each backslash again; that is, you must write `"\\\\"`.
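As a minimal sketch of the point above (the class and method names here are hypothetical, not taken from the question), the following shows the four-backslash literal in use. It also shows `Pattern.quote`, a standard-library alternative not mentioned in the answer, which builds the regular expression from the literal text so that only the Java-level escaping remains.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; not the asker's Date class.
class BackslashSplitDemo {

    // The source text "\\\\" compiles to the two-character string \\,
    // which the regex engine reads as one literal backslash.
    static String[] splitWithEscapedRegex(String s) {
        return s.split("\\\\", -1);
    }

    // Pattern.quote turns the one-character string \ into a regex that
    // matches it literally, so no regex-level escaping is written by hand.
    static String[] splitWithQuotedRegex(String s) {
        return s.split(Pattern.quote("\\"), -1);
    }

    public static void main(String[] args) {
        String input = "20\\01\\2012"; // runtime value: 20\01\2012
        System.out.println(Arrays.toString(splitWithEscapedRegex(input))); // [20, 01, 2012]
        System.out.println(Arrays.toString(splitWithQuotedRegex(input)));  // [20, 01, 2012]
    }
}
```

Both methods keep trailing empty strings because of the `-1` limit, matching the call in the question.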

Dawood ibn Kareem
  • Yes, as mentioned in my question, this is my own Date class. As I have learnt, \ is an escape character in a String. What I understand from this (correct me if I am wrong) is that we need to write 01\\20\\2012 when creating a String object, and the String class then stores it internally as 01\20\2012, because we are helping the String class interpret the input \\ as \. Is my understanding correct? – user3265048 Feb 08 '14 at 09:03
  • Almost. It's not the String class, but the Java compiler itself, that converts two backslashes to one. Whenever you write something inside `""` characters, any backslashes have to be doubled up. – Dawood ibn Kareem Feb 08 '14 at 11:04
  • As you said, the compiler itself converts two backslashes to one, but does the String class store this object as 01\\20\\2012 or 01\20\2012? I ask because the regex parameter of String.split() expects "\\\\" rather than "\\". – user3265048 Feb 10 '14 at 06:38
  • It stores it as 01\20\2012. I already explained why you need to type four backslashes inside `String.split()`. – Dawood ibn Kareem Feb 10 '14 at 06:38
  • Yes, you said, "You will also have to write "\\\\" as the argument to split, in order to get a regular expression that matches a single backslash." But why do I need four backslashes as the argument to match a single backslash? Any intuition? – user3265048 Feb 10 '14 at 10:03
  • Because in regular expressions, backslashes are special. They're used to indicate special sequences, like `\d` for a digit, `\s` for whitespace characters and so on. Therefore, if you want a regular expression that matches a backslash, just \ won't do - it has to be \\. But within a Java String literal, backslashes are special too, in a slightly different way. In a String literal, `"\t"` is a tab character, and `"\n"` is a newline and so on, so `"\"` won't be a backslash - it has to be `"\\"`. That is to say, when you write any string with backslashes in a Java program, you actually ... – Dawood ibn Kareem Feb 10 '14 at 11:17
  • ... need to write \\ for each backslash. But because you want to have two backslashes in the regular expression, you need to write \\ for each one; and that's four backslashes in total. – Dawood ibn Kareem Feb 10 '14 at 11:18
  • After we give "01\\20\\2012" as input to the String class, you said, "String class stores it as 01\20\2012." If this is true, then this one backslash should require only two backslashes in the regex, right? – user3265048 Feb 12 '14 at 05:13
  • Every time you type \\ in the source of a Java program, it gets processed as \. No matter what class it's going into. That is just how the Java compiler works. So when you type `"01\\20\\2012"`, you get a `String` whose value is `01\20\2012`. And yes, you do want a regular expression with two backslashes. But because you're going to type that regular expression into a Java program, to get \\ as your regular expression, you need to type `"\\\\"`. – Dawood ibn Kareem Feb 12 '14 at 06:31
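
The two layers of escaping discussed in these comments can be checked directly. This is only an illustrative sketch with a hypothetical class name: it prints how many characters each literal really holds, and whether that text is accepted as a regular expression.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class illustrating compiler-level vs regex-level escaping.
class EscapeLayersDemo {

    // Returns true if the given text compiles as a regular expression.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String one = "\\";   // 2 backslashes in the source, 1 in the string
        String two = "\\\\"; // 4 backslashes in the source, 2 in the string

        System.out.println(one.length());      // 1
        System.out.println(two.length());      // 2
        System.out.println(isValidRegex(one)); // false: a lone \ is an incomplete escape
        System.out.println(isValidRegex(two)); // true: \\ matches one backslash
        System.out.println("01\\20\\2012");    // prints 01\20\2012
    }
}
```

The `false` result for the one-character string corresponds to the `PatternSyntaxException` shown in the question.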

In Java source you need four backslashes to match one: the regex itself only needs two (`\\`), but because it is written as a Java string literal, each of those has to be escaped as well:

String[] result = s.split("\\\\",-1);

Now if you want to parse a date you should use DateFormat#parse...
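
A minimal sketch of that suggestion, assuming the input is day\month\year as in `20\01\2012` (the class name, method name, and pattern string are illustrative, not from the question). `SimpleDateFormat` is the standard concrete subclass of `DateFormat`; characters in its pattern that are not ASCII letters, such as the backslash, are treated as literal text.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

// Hypothetical demo class; uses java.util.Date, not the asker's own Date class.
class DateParseDemo {

    // Parses text such as 20\01\2012, assuming day\month\year order.
    static Date parseBackslashDate(String text) throws ParseException {
        // Only Java-level escaping is needed here: this is not a regex.
        SimpleDateFormat format = new SimpleDateFormat("dd\\MM\\yyyy");
        format.setLenient(false); // reject impossible dates such as 32\01\2012
        return format.parse(text);
    }

    public static void main(String[] args) throws ParseException {
        Date date = parseBackslashDate("20\\01\\2012");
        Calendar calendar = Calendar.getInstance();
        calendar.setTime(date);
        System.out.println(calendar.get(Calendar.DAY_OF_MONTH));
        System.out.println(calendar.get(Calendar.MONTH) + 1); // Calendar months are zero-based
        System.out.println(calendar.get(Calendar.YEAR));
    }
}
```

With this approach no call to `split` is needed, so the regex-level escaping problem does not arise at all.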

assylias