I wrote a parser for a custom subset of BBCode in JavaScript and have now translated it to C#. This custom BBCode can be parsed line by line, so I have a regex that lets me "pop" the first line from the BBCode string:
/(^.*$|^.*\r?\n)/
The pattern also matches an empty string. The first part, ^.*$, matches a simple string like "Simple string" (a single line without a line break at the end).
The second part, ^.*\r?\n, matches the first line when it ends with a line break (LF or CRLF).
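To make the intended "pop the first line" usage concrete, here is a minimal JavaScript sketch built around the pattern above. The helper name `popFirstLine` and the sample input are hypothetical and only for illustration; the real parser may be structured differently.

```javascript
// Pattern from the question: either a whole single-line string,
// or the first line including its trailing line break.
const firstLinePattern = /(^.*$|^.*\r?\n)/;

// Hypothetical helper: returns the first line and the remaining text.
function popFirstLine(text) {
  const match = text.match(firstLinePattern);
  if (match === null) {
    // No line could be popped (for example, a lone "\r").
    return { line: null, rest: text };
  }
  const line = match[0];
  return { line, rest: text.slice(line.length) };
}

// Pop lines until the input is consumed.
let rest = "line1\r\nline2\nline3";
const lines = [];
while (rest.length > 0) {
  const popped = popFirstLine(rest);
  if (popped.line === null) break;
  lines.push(popped.line);
  rest = popped.rest;
}
console.log(JSON.stringify(lines)); // ["line1\r\n","line2\n","line3"]
```

Each popped line keeps its line break, which is why losing the \n in the match matters for the remaining text.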
This works perfectly in JavaScript. But while running the unit tests in C#, I noticed a difference.
Assume we have "line1\n" as input.
The regex in JavaScript will match it as follows:
^.*$ won't match, because . matches any character except a line break and we have \n at the end.
^.*\r?\n will match, as we have a string starting with 0 or more characters and \n at the end.
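The reasoning above can be checked piece by piece in JavaScript. This is only a small sketch; the variable names are made up for illustration, and the `m` flag line is included purely as a comparison, since the patterns in this question do not use it.

```javascript
const input = "line1\n";

// .* is greedy, but . does not match the line break, so it stops after "line1".
const dotStar = /^.*/.exec(input);
console.log(JSON.stringify(dotStar[0])); // "line1"

// Without the m flag, $ asserts only at the very end of the input (index 6),
// not before the trailing \n (index 5).
const endOnly = /$/.exec(input);
console.log(endOnly.index, input.length); // 6 6

// For comparison: with the m flag, $ also matches before a line break.
const endMultiline = /$/m.exec(input);
console.log(endMultiline.index); // 5

// So ^.*$ as a whole cannot match "line1\n" in JavaScript.
console.log(/^.*$/.test(input)); // false
```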
Now in C# it works differently:
^.*$ will match (why?), but only the line1. Thus the whole /(^.*$|^.*\r?\n)/ will also match only line1, and the \n goes missing.
Could someone please explain? Is there a way to force a C# regex to behave like the JavaScript regex in the sense described above?
The simplest workaround would be to change the order of the alternatives in the pattern, /(^.*$|^.*\r?\n)/ -> /(^.*\r?\n|^.*$)/, which solves the problem, but I would still like to know the reason behind the difference.
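As a sanity check that the reordered pattern does not change anything on the JavaScript side, here is a short sketch comparing both orderings on a few inputs. The names `original`, `reordered` and `samples` are hypothetical, and this says nothing about the C# behaviour, which has to be verified separately in the C# tests.

```javascript
const original = /(^.*$|^.*\r?\n)/;
const reordered = /(^.*\r?\n|^.*$)/;

// Sample inputs: trailing LF, trailing CRLF, no line break, two lines, empty string.
const samples = ["line1\n", "line1\r\n", "line1", "line1\nline2", ""];

const results = samples.map((s) => ({
  input: s,
  original: s.match(original)[0],
  reordered: s.match(reordered)[0],
}));

for (const r of results) {
  console.log(JSON.stringify(r));
}

// In JavaScript both orderings return the same first line for every sample.
const sameEverywhere = results.every((r) => r.original === r.reordered);
console.log(sameEverywhere); // true
```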
Click here for the C# test code ...
For JavaScript, see below:
// The full pattern and its two alternatives on their own
const first_line_pattern = /(^.*$|^.*\r?\n)/
const single_string_pattern = /^.*$/
const line_pattern = /^.*\r?\n/

const input4 = "line1\n"

// Logs whether the pattern matches input4 and, if so, the matched text
function log(pattern) {
  let m4 = input4.match(pattern)
  console.log('~~~~~~~~' + pattern.toString() + '~~~~~~~~~')
  console.log("'line1\\n':':" + (m4 != null) + ":value: /" + (m4 ? m4[0] : 'no match') + "/")
}

log(first_line_pattern)
log(single_string_pattern)
log(line_pattern)
Thank you for your time!