
Why does the left \b behave like a left \B in this case?

\b\$[0-9]+(\.[0-9][0-9])?\b

I expected this pattern to reject strings such as a$99.99 because of the left \b.

As I understand it, \b matches text that is not adjacent to letters, digits or underscores. But it does not behave that way! I tested it on regex101: it matches strings such as tyh666.8 but does not match $99.

However, the right \b works exactly as expected!

Surprisingly, when I changed the left \b to \B, it worked!

As I understand it, \B matches text that is adjacent to letters, digits or underscores. But here it works the way I expected the left \b to work, and I have no idea why!

With \B, the pattern matches strings that are not adjacent to letters, digits or underscores.

Bohemian
R.Z
  • `\b\$[0-9]+(\.[0-9][0-9])?\b` matches `a$99.99`: the `\b` matches between the `a` and the `$`. `$99` does **not** match because there is no word boundary between a space and `$`. A `\b` matches *between* a word char and a non-word char, and between a non-word char and a word char (where a "word char" is `[0-9a-zA-Z_]`). Read [Difference between `\b` and `\B` in regex](https://stackoverflow.com/a/6664167/256196). – Bohemian Apr 27 '23 at 06:31
  • From [this answer](https://stackoverflow.com/a/7606358/4225384): `The word boundary \b matches on a change from a \w (a word character) to a \W a non word character, or from \W to \w` – qrsngky Apr 27 '23 at 06:33

1 Answer


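Your understanding of \b is incorrect. It matches positions where there is a "word" character on one side and a "non-word" character on the other. $ is not a word character, so the left \b only matches where $ is immediately preceded by a word character (alphanumerics and _, plus in some implementations other characters, e.g. @). That is why a$99.99 matches and $99 does not. \B matches exactly the positions where \b does not, so before $ it succeeds only when the preceding character is a non-word character or when $ is at the start of the string, which is the behaviour you were after.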
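
To make this concrete, here is a minimal sketch in Python's `re` module (my choice of engine; the question used regex101). The sample strings and the helper name `matches` are made up for illustration, and `tyh$666.8` is my assumed variant of the question's example with a `$` in it, since the pattern cannot match without one.

```python
import re

# The pattern from the question, and the same pattern with a leading \B.
with_b = re.compile(r"\b\$[0-9]+(\.[0-9][0-9])?\b")
with_B = re.compile(r"\B\$[0-9]+(\.[0-9][0-9])?\b")


def matches(pattern, text):
    """Return the full text of every match of pattern in text."""
    return [m.group(0) for m in pattern.finditer(text)]


# Illustrative inputs, not taken from the question's regex101 session.
samples = ["a$99.99", "$99", "costs $99.99 today", "tyh$666.8"]

for s in samples:
    print(f"{s!r:24} \\b: {matches(with_b, s)!r:14} \\B: {matches(with_B, s)}")
```

With the left `\b`, only the strings where a word character sits directly before the `$` match. With the left `\B`, only the strings where the `$` follows a non-word character or starts the string match.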

tripleee
  • I'm pretty sure *all* implementations of `\b` agree that a word char is `[0-9a-zA-Z_]`. Can you quote references to any variations to this? – Bohemian Apr 27 '23 at 06:36
  • My memories are vague, but IIRC some programming languages have hacked the "word" class to coincide with their "identifier" or "token" class for their own convenience. – tripleee Apr 27 '23 at 06:37
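
On the question of whether the "word" class varies: one case I am sure of is Python 3, where `\w` (and therefore `\b`) is Unicode-aware by default for `str` patterns and only shrinks to `[a-zA-Z0-9_]` under the `re.ASCII` flag. The sketch below is just an illustration of that one engine, with a made-up input string; it says nothing about `@` or about other languages.

```python
import re

# "\u00e9" is the letter e with an acute accent; the input is illustrative only.
text = "\u00e9$5"

# Default (Unicode) mode: the accented letter counts as a word character,
# so there is a word boundary between it and the "$".
unicode_hits = re.findall(r"\b\$[0-9]+", text)

# ASCII mode: the accented letter is a non-word character, so there is
# no boundary before the "$" and the pattern cannot match.
ascii_hits = re.findall(r"\b\$[0-9]+", text, flags=re.ASCII)

print(unicode_hits, ascii_hits)
```

So the same pattern and input can match or fail depending on how the engine defines a word character.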