35

From what I understand, backslash-dot (\.) matches any single character? And because the backslash is an escape character, it has to be written as backslash-backslash-dot ("\\.").

What does this do to a string? I just saw this in existing code I am working on. From what I understand, it will split the string into individual characters. Why do this instead of String.toCharArray()? So does this split the string into an array of strings, each containing only one character?

Alan Moore
Nap

2 Answers

71

My guess is that you are missing that backslash ('\') characters are escape characters in Java String literals. So when you want to use a '\' escape in a regex written as a Java String you need to escape it; e.g.

Pattern.compile("\.");   // Java syntax error

// A regex that matches any single character
Pattern.compile(".");  

// A regex that matches a literal '.' character
Pattern.compile("\\.");  

// A regex that matches a literal '\' followed by one character
Pattern.compile("\\\\.");

The String.split(String separatorRegex) method splits a String into substrings separated by substrings matching the regex. So str.split("\\.") will split str into substrings separated by a single literal '.' character.
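As a rough illustration of the difference, here is a minimal sketch. The class and method names (SplitOnDotDemo, splitOnLiteralDot, splitOnAnyChar) are made up for this example and do not come from the original code. It compares splitting on the escaped dot with splitting on the bare dot, and with String.toCharArray():

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class SplitOnDotDemo {
    // The regex is \. (a literal dot); as a Java string literal it is written "\\."
    static String[] splitOnLiteralDot(String s) {
        return s.split("\\.");
    }

    // The regex . matches any character, so every character acts as a separator
    static String[] splitOnAnyChar(String s) {
        return s.split(".");
    }

    public static void main(String[] args) {
        String s = "a.b.c";
        System.out.println(Arrays.toString(splitOnLiteralDot(s))); // [a, b, c]
        // Every piece is empty, and split() drops trailing empty strings
        System.out.println(splitOnAnyChar(s).length);              // 0
        // toCharArray() is the usual way to get the individual characters
        System.out.println(Arrays.toString(s.toCharArray()));      // [a, ., b, ., c]
    }
}
```

So split("\\.") does not break a string into single characters. It breaks it wherever a '.' occurs, which is a different job from String.toCharArray().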

Stephen C
  • It works all right! But could you elaborate more on this, like why four backslashes? Shouldn't there be three? – Oliver Dec 09 '14 at 08:59
  • 3
    A literal backslash has to be escaped once in a regex. That gives 2. Those 2 backslashes both need to be escaped in a String literal. That makes 4. Three backslashes will give you a Java compilation error. Try it and see for yourself. – Stephen C Dec 09 '14 at 10:33
  • Why does Pattern.compile("\."); produce a syntax error? – adub3 Dec 24 '14 at 10:25
  • Because `"\."` is a Java string literal, and `\.` is not a legal escape sequence in a Java string literal. – Stephen C Dec 24 '14 at 11:40
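The backslash counting discussed in these comments can be checked with a small sketch. The class and helper names (BackslashCountDemo, matchesWhole) are hypothetical and only serve the illustration. It prints how many characters each string literal actually holds, and what the resulting regex matches:

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class BackslashCountDemo {
    // True if the whole input matches the regex
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String literalDot = "\\.";         // two characters at runtime: \ and .
        String backslashThenAny = "\\\\."; // three characters at runtime: \ \ and .

        System.out.println(literalDot.length());       // 2
        System.out.println(backslashThenAny.length()); // 3

        System.out.println(matchesWhole(literalDot, "."));         // true
        System.out.println(matchesWhole(literalDot, "x"));         // false
        // "\\x" is a backslash followed by 'x'
        System.out.println(matchesWhole(backslashThenAny, "\\x")); // true
    }
}
```

A literal with three backslashes, "\\\.", would not compile: the first two form an escaped backslash, leaving \. which is not a legal escape sequence in a Java string literal.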
7

The regex "." would match any character, as you state. However, an escaped dot "\." matches only a literal dot character. Thus 192.168.1.1 split on "\." (written "\\." in a Java string literal) would result in {"192", "168", "1", "1"}.

Your wording isn't completely clear, but I think this is what you're asking.

nall