I have the following string:

;Spe \,\:\; cial;;;

and I want to split it using the semicolon as the delimiter. However, a semicolon preceded by "\" should not count as a delimiter. So I would like to get something like

["", "Spe \,\:\; cial", "", "", ""]

Update:

Java representation looks like:

String s = ";Spe \\,\\:\\; cial;;;";
lstipakov
  • Using backslash as an escape character, how do you want to handle "\\;"? (i.e. double-backslash semi-colon) – searlea Oct 26 '11 at 07:45
  • Yep sorry. \\; should be treated as separator - this is (escaped) backslash plus semicolon. – lstipakov Oct 26 '11 at 07:48

2 Answers

Use a negative look-behind:

(?<!\\\\);

(Note that conceptually there is only a single \ in this expression, i.e. it means "a ; that is not preceded by \", but the backslash has to be escaped twice: once for the benefit of the Java compiler, and again for the benefit of the regex engine, which sees (?<!\\);.)
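
As a minimal, self-contained sketch of the above (the class name `SplitOnUnescaped` and the helper `splitUnescaped` are made up for illustration): it passes a negative limit to `String.split`, because with the default limit trailing empty strings are dropped from the result. Note that this pattern also treats the semicolon in `\\;` as escaped, since it only looks at the single character before the semicolon.

```java
import java.util.Arrays;

public class SplitOnUnescaped {
    // Hypothetical helper: splits on every ';' that is not directly preceded by a backslash.
    static String[] splitUnescaped(String s) {
        // Regex as seen by the engine: (?<!\\);
        // A negative limit keeps trailing empty strings in the result.
        return s.split("(?<!\\\\);", -1);
    }

    public static void main(String[] args) {
        String s = ";Spe \\,\\:\\; cial;;;";
        // Prints the five parts, including the empty ones.
        System.out.println(Arrays.toString(splitUnescaped(s)));
    }
}
```

Passing a positive limit such as 5 also keeps the empty strings here, but it caps the number of parts, so `-1` is probably the safer choice when the number of fields is not known in advance.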

LukeH
  • Perhaps I am doing something wrong: Syntax error U_REGEX_MISMATCHED_PAREN near index 6 `(<!\);` – lstipakov Oct 26 '11 at 07:50
  • The code is:`String s = ";Spe \\,\\:\\; cial;;;"; String[] strs = s.split("(<!\\);", 5);` – lstipakov Oct 26 '11 at 07:53
  • Ah, ok, you might need `"(?<!\\\\);"` - escaping the \ once for the Java compiler's benefit, and escaping it again for the regex engine's benefit. – LukeH Oct 26 '11 at 07:56

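Instead of splitting, you can extract the tokens with a regex that treats a backslash plus the character after it as a single unit: \G((?:\\.|[^;\\])*); (written as a Java string literal below). Every token has to end with a semicolon, so append one to the input and search for this pattern as long as a match is found:

// needs java.util.ArrayList, java.util.List, java.util.regex.Matcher, java.util.regex.Pattern
Pattern pattern = Pattern.compile("\\G((?:\\\\.|[^;\\\\])*);");
Matcher matcher = pattern.matcher(yourString + ";"); // the extra ';' closes the last token
List<String> tokens = new ArrayList<String>();
while(matcher.find()){
   tokens.add(matcher.group(1)); // group 1 is the token without its closing ';'
}

String[] yourArray = tokens.toArray(new String[0]); // if you prefer an array 
                                                    // rather than a list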
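
If the regex gets hard to read, the same idea can be sketched without one. The following is only an illustration under the assumption that a backslash always escapes the next character, so `\\;` (escaped backslash plus semicolon) ends a token, as requested in the comments on the question. The names `EscapeAwareSplitter` and `split` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class EscapeAwareSplitter {
    // Hypothetical helper: splits on ';' unless it is escaped by a backslash.
    // Escape sequences are kept verbatim in the tokens (no unescaping is done).
    static List<String> split(String s) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                // Copy the backslash and the escaped character, then skip past both.
                current.append(c).append(s.charAt(++i));
            } else if (c == ';') {
                // Unescaped delimiter: finish the current token.
                tokens.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        tokens.add(current.toString()); // text after the last delimiter (may be empty)
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split(";Spe \\,\\:\\; cial;;;"));
        System.out.println(split("a\\\\;b")); // escaped backslash, then a real delimiter
    }
}
```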
kgautron