I have the following code in TCL:
"\\*05.|__|##|.T|__|__|"
trying to match the following output:
*05 |__|##| T|__|__|
and it matches.
But if the output is:
*05 |__|##|__|__|__|
it also matches. What is the problem, and how can I fix it?
The character | is special in regular expressions: it means 'or' (alternation). To match a literal |, you need to escape it:
"\\*05.\\|__\\|##\\|.T\\|__\\|__\\|"
Now, to avoid all that double escaping, just use braces!
regexp {\*05.\|__\|##\|.T\|__\|__\|} $string
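As a quick check of the braced version, here is a minimal sketch that runs it against both strings from the question. The variable names pattern, good and bad are just illustrative.

```tcl
# Braced pattern: Tcl passes it to regexp untouched, so \| is a literal pipe.
set pattern {\*05.\|__\|##\|.T\|__\|__\|}

set good "*05 |__|##| T|__|__|"
set bad  "*05 |__|##|__|__|__|"

puts [regexp $pattern $good]   ;# 1
puts [regexp $pattern $bad]    ;# 0
```

The second string is rejected because the pattern now requires a T in the fourth cell.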
If you wanted a more in-depth explanation, you should have asked. I don't bite! xD
When you use:
regexp "\\*05.|__|##|.T|__|__|" "*05 |__|##| T|__|__|"
Tcl first performs backslash substitution on the double-quoted pattern, and only then passes it to the regexp command. What regexp actually receives is:
\*05.|__|##|.T|__|__|
Now, since | means 'or' in a regular expression, regexp reads the pattern as seven alternatives:
a literal *, then 05, then any one character, OR
two _, OR
two #, OR
any one character followed by T, OR
two _, OR
two _, OR
nothing (the empty string).
regexp does not require the whole string to match; it succeeds as soon as any one alternative matches somewhere in the string. So for *05 |__|##| T|__|__|:
Step 1: is there *05. in the string? Yes, "*05 " is in the string, so the first alternative matches and regexp returns 1.
When you compare the pattern to *05 |__|##|__|__|__|, the same thing happens:
Step 1: is there *05. in the string? Yes, "*05 " is in the string, so the first alternative matches and regexp returns 1.
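To see which part of the string is actually responsible for the match, you can give regexp a match variable. The sketch below uses the original double-quoted pattern; the variable names and the extra test string "hello" are only for illustration.

```tcl
# The original double-quoted pattern. After substitution regexp sees
# \*05.|__|##|.T|__|__|  (seven alternatives, the last one empty).
set pattern "\\*05.|__|##|.T|__|__|"

foreach s [list "*05 |__|##| T|__|__|" "*05 |__|##|__|__|__|" "hello"] {
    # The third argument receives the substring that actually matched.
    set found [regexp $pattern $s matched]
    puts "found=$found matched=<$matched>"
}
```

For both strings from the question the matched text is just "*05 ", not the whole line. Even "hello" reports a match, because the empty last alternative matches anywhere.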
With double escaping, the string that reaches regexp after substitution is:
\*05.\|__\|##\|.T\|__\|__\|
regexp then reads it as: a literal *, then 05, then any character, then a literal |, two _, a literal |, two #, a literal |, any character, a T, a literal |, two _, a literal |, two _ and a literal |.
There is now only one alternative, so when regexp compares it to *05 |__|##| T|__|__|, it matches. When it compares it to *05 |__|##|__|__|__|, it reaches the T in the pattern, finds _ in the string instead, and the match fails.
Braces prevent Tcl from performing any substitution on the expression before it is passed to the regexp command, so the expression stays exactly as you typed it. If you put:
{\\*05.\\|__\\|##\\|.T\\|__\\|__\\|}
regexp will receive \\*05.\\|__\\|##\\|.T\\|__\\|__\\| and interpret it as a literal \ zero or more times, then 05, then any character, then a literal \, OR, and so on.
This is why you don't double escape with braces:
{\*05.\|__\|##\|.T\|__\|__\|}
With this, the expression that regexp receives is \*05.\|__\|##\|.T\|__\|__\|, which is exactly what "\\*05.\\|__\\|##\\|.T\\|__\\|__\\|" turned into after substitution earlier.
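If you want to convince yourself that the two quoting styles hand regexp the same pattern, a small sketch like this compares them directly (quoted and braced are illustrative variable names):

```tcl
# Double quotes: Tcl turns each \\ into a single \ before regexp sees it.
set quoted "\\*05.\\|__\\|##\\|.T\\|__\\|__\\|"
# Braces: no substitution, the text is passed through as typed.
set braced {\*05.\|__\|##\|.T\|__\|__\|}

puts $quoted                            ;# shows the pattern regexp receives
puts [string equal $quoted $braced]     ;# 1
```

Printing a pattern with puts before passing it to regexp is a handy way to check what survived substitution.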