190

In Java regex, how do I distinguish between the dot (.) as a metacharacter and a literal dot as used in an ordinary sentence? How do I handle the same situation for other metacharacters too, such as *, +, \d, ...?

Pshemo
JavaUser

9 Answers

338

If you want the dot, or any other character with a special meaning in regexes, to be treated as a literal character, you have to escape it with a backslash. Since regexes in Java are written as ordinary Java string literals, you also need to escape the backslash itself, so you need two backslashes, e.g. \\.
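As a minimal sketch of the difference (the class name DotEscapeDemo and the sample strings are made up for illustration), the following compares an unescaped dot with the escaped form:

```java
class DotEscapeDemo {
    // Unescaped dot: matches any single character (line terminators excluded by default).
    static boolean matchesAnyChar(String s) {
        return s.matches("a.c");
    }

    // "\\." in Java source is the two-character regex \. which matches only a literal dot.
    static boolean matchesLiteralDot(String s) {
        return s.matches("a\\.c");
    }

    public static void main(String[] args) {
        System.out.println(matchesAnyChar("a.c"));    // true
        System.out.println(matchesAnyChar("abc"));    // true
        System.out.println(matchesLiteralDot("a.c")); // true
        System.out.println(matchesLiteralDot("abc")); // false
    }
}
```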

Fabian Steeg
  • 1
    this fix also applies to bash – krivar Aug 13 '14 at 11:15
  • 30
    Be aware that whether you need to escape the backslash depends on how you supply the regex. If it is hardcoded in Java source, you do need "\\."; if it is read from a raw source (e.g. a text file), use only a single backslash: \. – Paul Apr 08 '16 at 14:21
40

The solutions proposed by the other members didn't work for me.

But I found this:

To escape a dot in a Java regexp, write [.]
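A small sketch of the character-class form, assuming a made-up class name (DotClassDemo) and sample input. Inside [...] the dot loses its special meaning, so no backslash is needed:

```java
import java.util.Arrays;

class DotClassDemo {
    // [.] is a character class containing only a literal dot.
    static String[] splitWithClass(String s) {
        return s.split("[.]");
    }

    // The backslash-escaped form, for comparison.
    static String[] splitWithBackslash(String s) {
        return s.split("\\.");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitWithClass("a.b.c")));     // [a, b, c]
        System.out.println(Arrays.toString(splitWithBackslash("a.b.c"))); // [a, b, c]
    }
}
```

For this input both forms give the same result; which one reads better is largely a matter of taste.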

Kael
  • 5
    Same, `\\.` didn't work for me: `\.` complained that `.` doesn't need to be escaped, `\\.` made it think it was `\.` instead of `.`, `\\\.` and the builder threw an error, `[.]` was the only thing that worked. – mithunc Mar 07 '18 at 01:17
  • 1
    @mithunc That's odd, `\\.` inside a string literal gives you `\.` which is what the regex needs to see the dot as a literal dot instead of the any-character matcher. – klaar Sep 04 '18 at 08:08
  • 1
    I have had cases in the past where I had another level of escaping, the result then is \\\\. After the first layer of escaping that gives \\. Then the next layer gives \. And lastly regex converts that to a simple . I don't remember exactly when I needed that but maybe it helps with your problem. – findusl Jul 18 '21 at 14:56
20

Perl-style regular expressions (which the Java regex engine is more or less based upon) treat the following characters as special characters:

.^$|*+?()[{\ have special meaning outside of character classes,

]^-\ have special meaning inside of character classes ([...]).

So you need to escape those (and only those) symbols depending on context (or, in the case of character classes, place them in positions where they can't be misinterpreted).

Needlessly escaping other characters may work, but some regex engines will treat this as syntax errors, for example \_ will cause an error in .NET.

Some others will lead to false results, for example \< is interpreted as a literal < in Perl, but in egrep it means "word boundary".

So write -?\d+\.\d+\$ to match 1.50$, -2.00$ etc. and [(){}[\]] for a character class that matches all kinds of brackets/braces/parentheses.
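Written as Java string literals, the two patterns above might look like the following sketch. The class and method names are hypothetical. One Java-specific assumption worth flagging: as far as I know, an unescaped [ inside a Java character class opens a nested class, so the sketch escapes [ as well as ].

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PriceAndBracketDemo {
    // Regex -?\d+\.\d+\$ with every backslash doubled for the Java literal.
    static final Pattern PRICE = Pattern.compile("-?\\d+\\.\\d+\\$");

    // Character class for brackets; both [ and ] are escaped for Java.
    static final Pattern BRACKET = Pattern.compile("[(){}\\[\\]]");

    static boolean isPrice(String s) {
        return PRICE.matcher(s).matches();
    }

    static int countBrackets(String s) {
        Matcher m = BRACKET.matcher(s);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(isPrice("1.50$"));           // true
        System.out.println(isPrice("-2.00$"));          // true
        System.out.println(isPrice("1x50$"));           // false
        System.out.println(countBrackets("f(a[0]){}")); // 6
    }
}
```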

If you need to transform a user input string into a regex-safe form, use java.util.regex.Pattern.quote.
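A minimal sketch of Pattern.quote in use, assuming a hypothetical helper (containsLiterally) and made-up sample strings. The quoted input is treated as literal text, so its dot and dollar sign are not interpreted as metacharacters:

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Searches text for userInput taken literally, whatever characters it contains.
    static boolean containsLiterally(String text, String userInput) {
        Pattern p = Pattern.compile(Pattern.quote(userInput));
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsLiterally("price: 1.50$", "1.50$")); // true
        System.out.println(containsLiterally("price: 1x50$", "1.50$")); // false
    }
}
```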

Further reading: Jan Goyvaerts' blog RegexGuru on escaping metacharacters

Tim Pietzcker
4

Escape special characters with a backslash: \., \*, \+, \\d, and so on. If you are unsure, you may escape any non-alphabetic character, whether it is special or not. See the javadoc for java.util.regex.Pattern for further information.
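The following sketch shows these escapes as Java string literals. The class name and sample strings are made up for illustration, and the \@ case relies on the java.util.regex.Pattern rule that a backslash may precede any non-alphabetic character:

```java
class EscapeDemo {
    // Regex a\+b\*c matches the literal text a+b*c.
    static boolean matchesPlusStar(String s) {
        return s.matches("a\\+b\\*c");
    }

    // Regex \\d matches a literal backslash followed by the letter d.
    static boolean matchesBackslashD(String s) {
        return s.matches("\\\\d");
    }

    // Regex \@ escapes a character that is not special; it still matches @.
    static boolean matchesAt(String s) {
        return s.matches("\\@");
    }

    public static void main(String[] args) {
        System.out.println(matchesPlusStar("a+b*c")); // true
        System.out.println(matchesPlusStar("aabc"));  // false
        System.out.println(matchesBackslashD("\\d")); // true
        System.out.println(matchesBackslashD("7"));   // false
        System.out.println(matchesAt("@"));           // true
    }
}
```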

Christoffer Hammarström
  • Escaping non-special characters needlessly might work in some languages but might fail in others, so it's better to not get into the habit. – Tim Pietzcker Sep 09 '10 at 08:47
  • 1
    This question is specifically about Java though, and http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html#bs says "A backslash may be used prior to a non-alphabetic character regardless of whether that character is part of an unescaped construct." – Christoffer Hammarström Jul 10 '15 at 09:11
3

Here is code you can copy and paste directly:

String imageName = "picture1.jpg";
// "\\." in a Java string literal is the regex \. which matches a literal dot
String[] imageNameArray = imageName.split("\\.");
for (int i = 0; i < imageNameArray.length; i++) {
    System.out.println(imageNameArray[i]);
}

And what if there are stray spaces before or after the "." by mistake? It is good practice to account for those spaces as well.

String imageName = "picture1  . jpg";
// The dot must stay escaped (\\.); an unescaped dot would match every character
String[] imageNameArray = imageName.split("\\s*\\.\\s*");
for (int i = 0; i < imageNameArray.length; i++) {
    System.out.println(imageNameArray[i]);
}

Here, \\s* consumes any whitespace around the dot, so you get only the split strings you need.
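For a self-contained version, here is a sketch that wraps the split in a hypothetical helper (splitName) and also shows what happens when the dot is left unescaped: every character then counts as a delimiter, and split returns an empty array.

```java
import java.util.Arrays;

class SplitOnDotDemo {
    // Splits on a literal dot, ignoring whitespace on either side of it.
    static String[] splitName(String name) {
        return name.split("\\s*\\.\\s*");
    }

    // Pitfall: an unescaped dot matches any character, so nothing is left over.
    static String[] splitUnescaped(String name) {
        return name.split(".");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitName("picture1.jpg")));     // [picture1, jpg]
        System.out.println(Arrays.toString(splitName("picture1  . jpg")));  // [picture1, jpg]
        System.out.println(splitUnescaped("picture1.jpg").length);          // 0
    }
}
```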

Deva
1

I wanted to match a string that ends with ".*". For this I had to use the following:

"^.*\\.\\*$"

Kinda silly if you think about it :D Here's what it means: at the start of the string there can be any character, zero or more times, followed by a literal dot (.), followed by a literal star (*) at the end of the string.
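A quick sketch of that pattern in action, with a made-up class name and sample strings. Note that String.matches already requires the whole string to match, so the ^ and $ anchors are kept here only to mirror the pattern above:

```java
class EndsWithDotStarDemo {
    // Any characters, then a literal dot, then a literal star at the end.
    static boolean endsWithDotStar(String s) {
        return s.matches("^.*\\.\\*$");
    }

    public static void main(String[] args) {
        System.out.println(endsWithDotStar("java.util.*"));    // true
        System.out.println(endsWithDotStar(".*"));             // true
        System.out.println(endsWithDotStar("java.util.List")); // false
    }
}
```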

I hope this comes in handy for someone. Thanks for the backslash thing to Fabian.

Atspulgs
0

If you want to check whether your sentence ends with ".", then you have to add [\.]$ (or simply \.$) to the end of your pattern.
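As a sketch of an end-of-sentence check, assuming a hypothetical helper name (endsWithDot) and sample sentences, the escaped dot is followed by the $ anchor and find() is used so the rest of the sentence does not have to be described:

```java
import java.util.regex.Pattern;

class EndsWithDotDemo {
    // A literal dot immediately before the end of the input.
    static final Pattern TRAILING_DOT = Pattern.compile("\\.$");

    static boolean endsWithDot(String sentence) {
        return TRAILING_DOT.matcher(sentence).find();
    }

    public static void main(String[] args) {
        System.out.println(endsWithDot("Hello world.")); // true
        System.out.println(endsWithDot("Hello world"));  // false
        System.out.println(endsWithDot("v1.2 is out"));  // false
    }
}
```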

0

I am working with basic arrays in jGRASP and found that, with an accessor method for a char[][] array, you use '.' (a char literal) to place a single dot.

rgm
0

I was trying to split on .folder. For this use case, the suggested \\.folder and [.]folder didn't work for me.

The following code worked for me:

String[] pathSplited = Pattern.compile("([.])(folder)").split(completeFilePath);
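
A runnable sketch of that split, using a made-up path and hypothetical method names. In this sketch the grouped pattern and the simpler \\.folder form produce the same pieces, so the failure described above may have had another cause; that is an assumption, since the original input is not shown.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class FolderSplitDemo {
    // The grouped pattern from the answer: a literal dot, then the text "folder".
    static String[] splitGrouped(String path) {
        return Pattern.compile("([.])(folder)").split(path);
    }

    // The backslash-escaped form, for comparison.
    static String[] splitEscaped(String path) {
        return path.split("\\.folder");
    }

    public static void main(String[] args) {
        String completeFilePath = "projects.folder/src.folder/Main.java"; // hypothetical input
        System.out.println(Arrays.toString(splitGrouped(completeFilePath))); // [projects, /src, /Main.java]
        System.out.println(Arrays.toString(splitEscaped(completeFilePath))); // [projects, /src, /Main.java]
    }
}
```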
Deepak Patankar