I'm working on a regex in Java that should capture every decimal number written without a leading zero (for example .75 rather than 0.75).

Example:

The cat is .75 high but the dog id 3.67 high instead. All animals aren't higher that .87.

I expect to capture only .75 and .87, since they are decimals with any number of digits but no leading zero. 3.67 should not be captured.

I tried capturing it with word boundaries on both sides:

\b\.\d+\b

But the word boundary on the left side doesn't work as I expected. Without it, the pattern also matches the .67 in 3.67.

What would be the correct regex syntax to achieve this requirement?

Dharman

1 Answer

You want to match the opposite of \b at the start of the pattern. For this, there is uppercase \B, which matches where \b doesn't.

Basically, you're looking for a decimal point that is not at a word boundary (a digit directly before the decimal point would put a word boundary between that digit and the point), followed by digits, followed by a word boundary.

\B matches at every position where \b does not. Effectively, \B matches at any position between two word characters as well as at any position between two non-word characters.

\B\.\d+\b
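
In Java the backslashes have to be doubled inside a string literal. Here is a minimal sketch of using the pattern on the sentence from the question. The class name `LeadingDotDecimals` and the method `findLeadingDotDecimals` are made up for illustration, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LeadingDotDecimals {
    // \B before the dot: no word character (digit, letter, underscore) directly in front of it.
    // \b after the digits: the run of digits must end at a word boundary.
    private static final Pattern LEADING_DOT = Pattern.compile("\\B\\.\\d+\\b");

    // Hypothetical helper: collects every match of the pattern in the given text.
    public static List<String> findLeadingDotDecimals(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = LEADING_DOT.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "The cat is .75 high but the dog id 3.67 high instead. "
                + "All animals aren't higher that .87.";
        System.out.println(findLeadingDotDecimals(text)); // prints [.75, .87]
    }
}
```

The dot that ends the first sentence is not matched because no digit follows it, and the final `.` after `.87` is left out because `\d+` stops at the last digit.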

See this demo at regex101

Ryan M
bobble bubble
  • Thanks, I've tested it and it does seem to work. But if I want a whole-word match on both the left and the right side, why does \b work on the right side and not on the left, when it looks like the same condition (e.g. a space or the end of a line)? – Stefano Falconetti Sep 07 '22 at 11:10
  • @StefanoFalconetti On the right side there is a digit, which is a word character, so the word boundary after `\d+` matches as long as no other word character follows it. On the left side there is a dot, which is a non-word character, and you don't want a digit (a word character) before it. `\B` matches between two word characters or between two non-word characters, which is what you need there. – bobble bubble Sep 07 '22 at 11:13
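
To make the difference between the left and right side concrete, here is a small sketch that runs three variants of the pattern over the sentence from the question. The class name `BoundaryComparison` and the helper `matches` are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoundaryComparison {
    public static final String SAMPLE =
            "The cat is .75 high but the dog id 3.67 high instead. "
            + "All animals aren't higher that .87.";

    // Hypothetical helper: returns all matches of the given regex in the text.
    public static List<String> matches(String regex, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // \b on the left needs a word character before the dot,
        // so only the ".67" inside "3.67" qualifies.
        System.out.println(matches("\\b\\.\\d+\\b", SAMPLE)); // prints [.67]

        // No assertion on the left: every dot followed by digits matches.
        System.out.println(matches("\\.\\d+\\b", SAMPLE));    // prints [.75, .67, .87]

        // \B on the left rejects a word character before the dot.
        System.out.println(matches("\\B\\.\\d+\\b", SAMPLE)); // prints [.75, .87]
    }
}
```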