Can someone explain what MATLAB is doing with nul bytes (x00
) in regular expressions?
Examples:
>> regexp(char([0 0 0 0 0 0 0 1 0 0 10 0 0 0]),char([0 0 0 0 46 0 0 10]))
ans =
1 % current
4 % expected
>> regexp(char([0 0 0 1 0 0 0 1 0 0 10 0 0 0]),char([1 0 0 0 46 0 0 10]))
ans =
4 % current
4 % expected
>> regexp(char([0 0 0 1 0 0 0 1 0 0 10 0 0 0]),char([0 0 0 0 46 0 0 10]))
ans =
[] % current
[] % expected
>> regexp(char([0 0 0 0 10 0 0 1 0 0 10 0 0 0]),char([0 0 0 0 46 0 0 10]))
ans =
1 % current
[] % expected
>> regexp(char([0 0 0 0 0 0 0 1 0 0 10 0 0 0]),char([1 0 0 0 46 0 0 10]))
ans =
[] % current
[] % expected
The answer might simply be, MATLAB regular expression isn't meant to handle non printable characters, but I would assume it would error if this was the case.
EDIT: The 46 is expected to be '.'
as in the regex wildcard.
EDIT2:
>> regexp(char([0 0 0 0 50 0 0 100 0 0 90 0 0 0]),char([0 0 46 0 0 90]))
ans =
1 9
I realized it could have been 10 being a special character so this one has only printable and nul bytes. I would expect this one to only match 9 because the fifth character 50
does not match 0
.