I am fairly new to regular expressions and I am stuck on a problem that I am trying to solve. I have trouble understanding what is going on, and I hope someone can point me in the right direction.
What I am trying to achieve:
To avoid duplicates in the view, I want to check whether an attribute name already contains the respective attribute unit. For example, if $attribute['name'] = "Cutting speed (in m/Min.)" and $attribute['unit'] = "m/min", the attribute unit should not be displayed, as it is already mentioned in the name.
How I am trying to achieve this:
I am checking for the attribute unit with the following regular expression: '~\b' . $attribute['unit'] . '\b~i'
This works well for the example above, but not so well if the unit is a special character, like % or ", for instance.
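For context, here is a minimal sketch of how the check runs in PHP. The helper name nameContainsUnit and the second sample attribute ("Humidity (in %)") are made up for this post and are not taken from my actual code; the unit is concatenated into the pattern as-is.

```php
<?php
// Hypothetical helper for illustration only:
// returns true if the unit already appears in the attribute name.
function nameContainsUnit(string $name, string $unit): bool
{
    // The unit is placed between two word boundaries; the i flag makes it case-insensitive.
    $pattern = '~\b' . $unit . '\b~i';
    return preg_match($pattern, $name) === 1;
}

$attribute = ['name' => 'Cutting speed (in m/Min.)', 'unit' => 'm/min'];
var_dump(nameContainsUnit($attribute['name'], $attribute['unit'])); // bool(true)

// Made-up example with a special-character unit: the unit is not detected.
var_dump(nameContainsUnit('Humidity (in %)', '%')); // bool(false)
```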
The Problems
While testing the special-character issue I came across the following phenomenon: if I use the regex /\b%\b/, it does not behave as I expected. It matches the % in bla%bla but not the % if it is preceded or followed by a space: https://regex101.com/r/56iYEI/3
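The same behavior can be reproduced with a few lines of PHP. This is only a sketch; the sample strings are made up for illustration and are not necessarily the ones in the regex101 link.

```php
<?php
// Same pattern as above; preg_match returns 1 on a match and 0 otherwise.
$pattern = '/\b%\b/';

// Made-up test strings: % between letters, and % surrounded by spaces.
$subjects = ['bla%bla', 'bla % bla'];

foreach ($subjects as $subject) {
    echo $subject, ' => ', preg_match($pattern, $subject), PHP_EOL;
}
// bla%bla => 1
// bla % bla => 0
```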
It seems as if the % inverts the behavior of the regex. I tested other "special characters" as well (" and &), and they seem to have the same effect.
I was directed to this question (Regular Expression Word Boundary and Special Characters) before and read the answers. I now understand that \b checks for word boundaries. But it is still unclear to me why it behaves the way it does as soon as a % or " turns up.
The questions
- How come a % turns this checking for word boundaries by \b around?
- How can I achieve my goal of matching alphanumeric units as well as special-character units, like % or "?
Looking forward to any hints. Thanks in advance!