The backslash is a special character in string literals: we can use it to create \n or to escape " as \".
But the backslash is also special in the regular expression engine: for instance, we use it to write predefined character classes like \w, \d, and \s.
So if you want to create a string which represents the regex text \w, you need to write it as "\\w".
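A minimal sketch of that point, using hypothetical names (WordClassDemo, WORD_REGEX, isWordChar) chosen only for illustration: the literal "\\w" holds just two characters at runtime, and the regex engine reads them as the \w class.

```java
import java.util.regex.Pattern;

class WordClassDemo {
    // Source literal "\\w" -> runtime text \w (a backslash followed by 'w').
    static final String WORD_REGEX = "\\w";

    // Returns true when the whole input is a single word character.
    static boolean isWordChar(String s) {
        return Pattern.matches(WORD_REGEX, s);
    }

    public static void main(String[] args) {
        System.out.println(WORD_REGEX.length()); // 2
        System.out.println(isWordChar("a"));     // true
        System.out.println(isWordChar("-"));     // false
    }
}
```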
If you want to write a regex which matches a literal \, the text of that regex needs to be \\, which means the string literal representing that text needs to be written as "\\\\".
In other words we need to escape the backslash twice:
- once in the regex: \\
- and once in the string literal: "\\\\"
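The double escaping can be sketched as follows. The class name, the helper splitOnBackslash, and the sample path are assumptions made up for this example, not part of the original question.

```java
class LiteralBackslashDemo {
    // Four backslashes in source -> two characters at runtime (\\)
    // -> one literal backslash as far as the regex engine is concerned.
    static final String BACKSLASH_REGEX = "\\\\";

    // Splits the text on every literal backslash.
    static String[] splitOnBackslash(String text) {
        return text.split(BACKSLASH_REGEX);
    }

    public static void main(String[] args) {
        System.out.println(BACKSLASH_REGEX.length()); // 2

        // Runtime value of this literal: C:\temp\file.txt
        String path = "C:\\temp\\file.txt";
        for (String part : splitOnBackslash(path)) {
            System.out.println(part); // C:  then temp  then file.txt
        }
    }
}
```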
If you want to pass the regex engine a literal tab, you don't need to escape the backslash at all. Java understands the "\t" string literal as a string containing the tab character, and you can pass such a string to the regex engine without problems.
For our convenience, the regex engine in Java interprets the text \t (also \r and \n) the same way string literals interpret "\t". In other words, we can pass the regex engine text consisting of a \ character and a t character and be sure that it will be interpreted as a tab character.
So code like split("\t") or split("\\t") will try to split on a tab.
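A small sketch comparing the two forms; the class and method names here are hypothetical and exist only for this illustration. Both calls should produce the same pieces for tab-separated input.

```java
import java.util.Arrays;

class TabSplitDemo {
    // The compiler turns "\t" into a real tab character before the regex engine sees it.
    static String[] splitWithStringEscape(String text) {
        return text.split("\t");
    }

    // The regex engine receives the two characters \ and t and reads them as a tab.
    static String[] splitWithRegexEscape(String text) {
        return text.split("\\t");
    }

    public static void main(String[] args) {
        String line = "a\tb\tc"; // a, tab, b, tab, c
        System.out.println(Arrays.toString(splitWithStringEscape(line))); // [a, b, c]
        System.out.println(Arrays.toString(splitWithRegexEscape(line)));  // [a, b, c]
    }
}
```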
Code like split("\\\\t") will try to split the text not on a tab character, but on a \ character followed by t. This happens because "\\\\", as explained, represents the text \\, which the regex engine sees as an escaped \ (so it is treated as a literal).
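To illustrate the difference, here is a sketch with a made-up input that contains both a literal backslash followed by t and a real tab. The names BackslashTDemo and splitOnBackslashT are assumptions for this example.

```java
class BackslashTDemo {
    // Regex text is \\t: an escaped (literal) backslash followed by the letter t.
    static String[] splitOnBackslashT(String text) {
        return text.split("\\\\t");
    }

    public static void main(String[] args) {
        // Runtime value: a, backslash, t, b, TAB, c
        String input = "a\\tb\tc";
        String[] parts = splitOnBackslashT(input);

        System.out.println(parts.length);             // 2
        System.out.println(parts[0]);                 // a
        System.out.println(parts[1].contains("\t"));  // true: the real tab was not a delimiter
    }
}
```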