What is a non-word boundary in regex (\B), compared to a word boundary?
2 Answers
A word boundary (`\b`) is a zero-width match that can match:
- Between a word character (`\w`) and a non-word character (`\W`), or
- Between a word character and the start or end of the string.
In JavaScript the definition of `\w` is `[A-Za-z0-9_]` and `\W` is anything else.
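One consequence of that ASCII-only definition, as a rough sketch: a letter such as `é` counts as a non-word character, so `\b` can match in the middle of what a human would read as one word. The `markBoundaries` helper is a made-up name for illustration, not a built-in.

```javascript
// Hypothetical helper (not a built-in): insert "|" at every position where \b matches.
function markBoundaries(text) {
  return text.replace(/\b/g, "|");
}

// With no flags, \w covers only [A-Za-z0-9_].
console.log(/\w/.test("_"));      // true
console.log(/\w/.test("\u00e9")); // false: "é" is outside [A-Za-z0-9_]

// So "é" behaves like punctuation as far as \b is concerned.
console.log(markBoundaries("caf\u00e9")); // "|caf|é"
console.log(markBoundaries("cat!"));      // "|cat|!"
```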
The negated version of `\b`, written `\B`, is a zero-width match where the above does not hold. Therefore it can match:
- Between two word characters.
- Between two non-word characters.
- Between a non-word character and the start or end of the string.
- The empty string.
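The cases in this list can be checked with a small sketch. `markNonBoundaries` is a hypothetical helper name, and the empty-string result is what JavaScript does; other regex engines may treat that case differently.

```javascript
// Hypothetical helper (not a built-in): insert "|" at every position where \B matches.
function markNonBoundaries(text) {
  return text.replace(/\B/g, "|");
}

console.log(markNonBoundaries("ab")); // "a|b"    between two word characters
console.log(markNonBoundaries("--")); // "|-|-|"  start/non-word, non-word/non-word, non-word/end
console.log(markNonBoundaries(""));   // "|"      the empty string
console.log("".replace(/\b/g, "|"));  // ""       \b finds nothing in an empty string
```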
For example, if the string is `"Hello, world!"` then `\b` matches in the following places:

     H e l l o ,   w o r l d !
    ^         ^   ^         ^

And `\B` matches those places where `\b` doesn't match:

     H e l l o ,   w o r l d !
      ^ ^ ^ ^   ^   ^ ^ ^ ^   ^
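To double-check the two diagrams, here is a minimal sketch that lists the string indices where each assertion matches. `matchPositions` is an illustrative helper, not a standard function; index 0 is the start of the string and index 13 is the end.

```javascript
// Hypothetical helper: list the string indices where a zero-width assertion matches.
function matchPositions(text, assertion) {
  const re = new RegExp(assertion, "g"); // matchAll requires the global flag
  return Array.from(text.matchAll(re), (m) => m.index);
}

const sample = "Hello, world!";
console.log(matchPositions(sample, "\\b")); // [0, 5, 7, 12]
console.log(matchPositions(sample, "\\B")); // [1, 2, 3, 4, 6, 8, 9, 10, 11, 13]
```

Together the two lists cover every position from 0 to 13 exactly once, which is what "negated version" means here.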

- Nice one. In my experience, *explaining* word boundaries is considerably more difficult than *using* them. – Alan Moore Dec 27 '10 at 23:35
- I have not seen such a lucid explanation of word boundaries before. Great one! – Salil Mar 01 '12 at 23:29
- For the `\B` example, the label between `start of string` and `H` is missing. Nice explanation otherwise. – ericyan3000 May 29 '22 at 05:03
The basic purpose of a non-word boundary is to create a regex that says:
- If we are at the beginning/end of a word char (`\w` = `[a-zA-Z0-9_]`), make sure the previous/next character is also a word char, e.g. `"a\B."` ~ `"a\w"`: `"ab"`, `"a4"`, `"a_"`, ... but not `"a "`, `"a."`
- If we are at the beginning/end of a non-word char (`\W` = `[^a-zA-Z0-9_]`), make sure the previous/next character is also a non-word char, e.g. `"-\B."` ~ `"-\W"`: `"-."`, `"- "`, `"--"`, ... but not `"-a"`, `"-1"`
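The sample strings above can be run through both patterns as a quick check. The variable names are only for illustration.

```javascript
const wordThenSame = /a\B./;    // "a" followed by another word character
const nonWordThenSame = /-\B./; // "-" followed by another non-word character

for (const s of ["ab", "a4", "a_", "a ", "a."]) {
  console.log(JSON.stringify(s), wordThenSame.test(s));
}
// "ab" true, "a4" true, "a_" true, "a " false, "a." false

for (const s of ["-.", "- ", "--", "-a", "-1"]) {
  console.log(JSON.stringify(s), nonWordThenSame.test(s));
}
// "-." true, "- " true, "--" true, "-a" false, "-1" false
```

The equivalence is only approximate: without the `s` flag, `.` does not match a newline, so `/-\B./` rejects `"-\n"` while `/-\W/` accepts it.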
For a word boundary it's similar, but instead of making sure that the adjacent characters are of the same class (word char / non-word char), they need to differ, hence the name word boundary.
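A rough way to see the "same class" versus "different class" split is to classify two-character strings by which assertion holds between the two characters. `assertionBetween` is a hypothetical helper written for this sketch, and it assumes plain two-character input with no newlines.

```javascript
// Hypothetical helper: report which assertion holds between the two characters of `pair`.
function assertionBetween(pair) {
  if (/^.\b.$/.test(pair)) return "\\b"; // classes differ: word next to non-word
  if (/^.\B.$/.test(pair)) return "\\B"; // classes agree: word/word or non-word/non-word
  return null; // not a two-character string without newlines
}

console.log(assertionBetween("ab")); // \B
console.log(assertionBetween("--")); // \B
console.log(assertionBetween("a ")); // \b
console.log(assertionBetween("-a")); // \b
```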
