141

I am trying to parse a file in which each line holds pipe-delimited values. It did not work correctly when I did not escape the pipe delimiter in the split method, but it worked after I escaped the pipe, as below.

import java.util.ArrayList;

private ArrayList<String> parseLine(String line) {
    ArrayList<String> list = new ArrayList<>();
    String[] list_str = line.split("\\|"); // note the escape "\\" here
    System.out.println(list_str.length);   // number of fields found
    System.out.println(line);              // the raw input line
    for (String s : list_str) {
        list.add(s);
        System.out.print(s + "|");         // echo each field followed by a pipe
    }
    return list;
}

Can someone please explain why the pipe character needs to be escaped for the split() method?

Laurel
starthis
  • 13
    The answers below answered the "why," but just FYI, if you're trying to match a literal String you might also look at [Pattern.quote](http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#quote(java.lang.String)). It takes a `String` and returns a regex `String` that will match the input (i.e., it takes care of all the escaping for you). – yshavit Mar 21 '12 at 16:43
  • +1 for `Pattern.quote` – redDevil Aug 26 '14 at 11:13
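
For anyone who wants to see the `Pattern.quote` suggestion in action, here is a minimal sketch. The class name `QuoteSplitDemo` and the helper `splitLiteral` are made up for illustration; only `Pattern.quote` and `String.split` come from the JDK.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class QuoteSplitDemo {
    // Hypothetical helper: treats the delimiter as literal text,
    // whatever regex metacharacters it happens to contain.
    static String[] splitLiteral(String line, String delimiter) {
        return line.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLiteral("a|b|c", "|"))); // [a, b, c]
        System.out.println(Arrays.toString(splitLiteral("1.2.3", "."))); // [1, 2, 3]
    }
}
```

This is probably most useful when the delimiter is not a compile-time constant, since you cannot hand-escape a value you do not know in advance.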

3 Answers

176

String.split expects a regular expression argument. An unescaped | is parsed as a regex meaning "empty string or empty string," which isn't what you mean.
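
To make the difference concrete, here is a small sketch comparing the two calls. The class and method names are invented for the example, and the output shown for the unescaped case assumes Java 8 or later.

```java
import java.util.Arrays;

final class UnescapedPipeDemo {
    // "|" as a regex is "empty OR empty", so it matches between every character.
    static String[] splitUnescaped(String line) {
        return line.split("|");
    }

    // "\\|" is the regex \| and matches only a literal pipe.
    static String[] splitEscaped(String line) {
        return line.split("\\|");
    }

    public static void main(String[] args) {
        // On Java 8 or later: [a, |, b, |, c]
        System.out.println(Arrays.toString(splitUnescaped("a|b|c")));
        // [a, b, c]
        System.out.println(Arrays.toString(splitEscaped("a|b|c")));
    }
}
```

The unescaped version breaks the line into individual characters, pipes included, which is most likely the "did not work correctly" behaviour from the question.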

Louis Wasserman
76

The parameter to split is a regular expression, in which '|' has the special meaning of OR and '\|' means a literal '|'. The Java string "\\|" is therefore the regular expression '\|', which matches exactly the character '|'.
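
A quick way to see the two layers of escaping is to print the string the regex engine actually receives. This is only a sketch; the class name and the `PIPE_REGEX` constant are invented for the example. It also shows a character class, `[|]`, as an alternative that needs no backslashes.

```java
import java.util.Arrays;

final class EscapeLayersDemo {
    // Java source "\\|" is a two-character string: a backslash and a pipe.
    static final String PIPE_REGEX = "\\|";

    public static void main(String[] args) {
        System.out.println(PIPE_REGEX);          // \|
        System.out.println(PIPE_REGEX.length()); // 2

        // Both regexes match a literal pipe.
        System.out.println(Arrays.toString("x|y".split(PIPE_REGEX))); // [x, y]
        System.out.println(Arrays.toString("x|y".split("[|]")));      // [x, y]
    }
}
```

The Java compiler consumes the first backslash, and the regex engine consumes the second one.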

ggorlen
dlamblin
  • 1
    Thanks for this explanation. I almost always forget to use the double escape. Now that I know why it's that way, it will surely help me remember from now on. – sufinawaz Nov 03 '14 at 21:10
  • What happens if the value of the String line has some Pipe characters? How would you be able to split without splitting escaped pipe \| ? – AlexandreJ Sep 28 '15 at 17:56
  • @AlexandreJ Are you asking how to split a line that looks like: `Some|Delimited|Text|With|An\|Embedded|Pipe|Char` into `("Some", "Delimited", "Text", "With", "An\|Embedded", "Pipe", "Char")`? The split function does not support escaping like this, but you might be able to craft a regular expression that'll work for this case, like with a zero-width negative assertion look behind group: `(?<!\\)\|` which would be `line.split("(?<!\\\\)\\|"); ` – dlamblin Oct 21 '15 at 23:10
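
Here is the look-behind idea from the last comment as a runnable sketch. The class and method names are hypothetical. It assumes a backslash directly before a pipe always means "escaped" and does not handle an escaped backslash followed by a real delimiter (`\\|`).

```java
import java.util.Arrays;

final class EscapedPipeSplitDemo {
    // Regex (?<!\\)\| : a pipe that is NOT immediately preceded by a backslash.
    static String[] splitOnUnescapedPipes(String line) {
        return line.split("(?<!\\\\)\\|");
    }

    public static void main(String[] args) {
        // In Java source, "An\\|Embedded" is the text An\|Embedded
        String line = "Some|Delimited|Text|With|An\\|Embedded|Pipe|Char";
        // [Some, Delimited, Text, With, An\|Embedded, Pipe, Char]
        System.out.println(Arrays.toString(splitOnUnescapedPipes(line)));
    }
}
```

Note that the backslash stays in the resulting field, so you would still need to strip it afterwards if you want the plain `An|Embedded`.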
6

You can simply do this:

String[] arrayString = yourString.split("\\|");
Community
Ravinath
  • You have to escape the \ in your regex: yourString.split("\\|") is the right form. – mautrok Dec 07 '15 at 13:57