24

I need to find all the one-digit numbers in a text.

My code:

$string = 'text 4 78 text 558 my.name@gmail.com 5 text 78998 text';
$pattern = '/ [\d]{1} /';

(result: 4 and 5)

Everything works perfectly; I just wanted to ask whether it is correct to use spaces. Maybe there is some other way to distinguish one-digit numbers?

Thanks

lolalola
  • You are missing some special cases here: when the number is at the beginning of the string, at the end, or when the string consists of only one digit. – abc667 Feb 26 '13 at 21:01

5 Answers

31

First of all, [\d]{1} is equivalent to \d.

As for your question, it would be better to use a zero width assertion like a lookbehind/lookahead or word boundary (\b). Otherwise you will not match consecutive single digits because the leading space of the second digit will be matched as the trailing space of the first digit (and overlapping matches won't be found).
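The overlap problem can be seen in a minimal sketch. The input string here is made up for illustration; the first pattern is the one from the question without the redundant parts, and the second uses a word boundary as one example of a zero-width assertion.

```php
<?php
// Hypothetical input: three single digits separated by single spaces.
$string = 'a 1 2 3 b';

// The spaces are part of the match, so the space after "1" is consumed
// and is no longer available as the leading space of "2".
preg_match_all('/ \d /', $string, $consuming);

// A word boundary is zero-width: nothing around the digit is consumed.
preg_match_all('/\b\d\b/', $string, $zeroWidth);

print_r($consuming[0]); // " 1 " and " 3 " (the 2 is skipped)
print_r($zeroWidth[0]); // "1", "2" and "3"
```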

Here is how I would write this:

(?<!\S)\d(?!\S)

This means "match a digit only if there is not a non-whitespace character before it, and there is not a non-whitespace character after it".

I used the double negative like (?!\S) instead of (?=\s) so that you will also match single digits that are at the beginning or end of the string.
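As a rough illustration of the difference, the sketch below runs both forms on a made-up string that has a single digit at the start and at the end. The variable names and the input are assumptions for the example only.

```php
<?php
// Hypothetical input with single digits at the very start and very end.
$string = '7 text 4 5 text 78 9';

// Negative lookarounds: "no non-whitespace character next to the digit".
// This also succeeds where there is no character at all (string edges).
preg_match_all('/(?<!\S)\d(?!\S)/', $string, $negative);

// Positive lookarounds: "a whitespace character next to the digit".
// These fail at the string edges because no character exists there.
preg_match_all('/(?<=\s)\d(?=\s)/', $string, $positive);

print_r($negative[0]); // 7, 4, 5, 9
print_r($positive[0]); // 4, 5
```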

I prefer this over \b\d\b for your example because it looks like you really only want to match when the digit is surrounded by spaces, and \b\d\b would match the 4 and the 5 in a string like 192.168.4.5.
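A short comparison on an IP address makes the point concrete. The input string is invented for this sketch.

```php
<?php
// Hypothetical input containing an IP address and one standalone digit.
$string = 'host 192.168.4.5 port 8';

// Word boundaries treat "." as a non-word character, so the last two
// octets count as standalone digits.
preg_match_all('/\b\d\b/', $string, $boundary);

// The whitespace-based lookarounds reject digits that touch a ".".
preg_match_all('/(?<!\S)\d(?!\S)/', $string, $whitespace);

print_r($boundary[0]);   // 4, 5, 8
print_r($whitespace[0]); // 8
```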

To allow punctuation at the end, you could use the following:

(?<!\S)\d(?![^\s.,?!])

Add any additional punctuation characters that you want to allow after the digit to the character class (inside of the square brackets, but make sure it is after the ^).
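The following sketch compares the strict pattern with the punctuation-tolerant one. The sentence used as input is an assumption chosen to include a digit before "?", a digit before ".", a digit glued to a letter, and a two-digit number.

```php
<?php
// Hypothetical input with trailing punctuation after some digits.
$string = 'Is it 5? No, 7. Maybe x9, or 42!';

// Strict version: the digit must be followed by whitespace or the end.
preg_match_all('/(?<!\S)\d(?!\S)/', $string, $strict);

// Tolerant version: ".", ",", "?" and "!" are also allowed after the digit.
preg_match_all('/(?<!\S)\d(?![^\s.,?!])/', $string, $tolerant);

print_r($strict[0]);   // empty: every single digit is followed by punctuation
print_r($tolerant[0]); // 5, 7
```

Neither pattern matches the 9 in "x9" (a non-whitespace character precedes it) or the digits of 42 (each one touches another digit).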

Andrew Clark
21

Use word boundaries. Note that the quantifier {1} is redundant (a single \d already matches exactly one digit), and so is the character class [], because it contains only one element.

\b\d\b
kjetilh
7

Search around word boundaries:

\b\d\b

As others have explained, this extracts single digits delimited by word boundaries, so some special characters are not respected: for example, the "." in an IP address counts as a boundary. To address that, see F.J's and Mike Brant's answers.

0

It really depends on where the numbers can appear and whether you care if they are adjacent to other characters (like . at the end of a sentence). At the very least, I would use word boundaries so that you can get numbers at the beginning and end of the input string:

$pattern = '/\b\d\b/';

But you might consider punctuation at the end like:

$pattern = '/\b\d(\b|\.|\?|\!)/';
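A quick way to see how this pattern behaves is to run it on a sample sentence. The input below is made up; the sketch only shows what PHP's preg_match_all returns for it.

```php
<?php
// Hypothetical input: digits followed by a space, a period and a question mark.
$string = 'Pick 4 or 5. Or 6?';

$pattern = '/\b\d(\b|\.|\?|\!)/';
preg_match_all($pattern, $string, $matches);

print_r($matches[0]); // full matches: 4, 5, 6
print_r($matches[1]); // group 1 is empty each time
```

Alternatives are tried from left to right, and \b already succeeds between a digit and a punctuation character, so in this input the punctuation itself is never part of the match.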
Mike Brant
0

If one-digit numbers can be preceded or followed by characters other than digits (e.g., "a1 cat" or "Call agent 7, pronto!") use

(?<!\d)\d(?!\d)

The regular expression reads: match a digit (\d) that is neither preceded nor followed by a digit, (?<!\d) being a negative lookbehind and (?!\d) being a negative lookahead.
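As a minimal sketch, here is that pattern next to \b\d\b on a string built from the two examples above plus a made-up two-digit number.

```php
<?php
// Input assembled from the examples in this answer; "Room 42" is an
// added assumption to show that multi-digit numbers are rejected.
$string = 'a1 cat, Call agent 7, pronto! Room 42';

// Only neighbouring digits disqualify a match.
preg_match_all('/(?<!\d)\d(?!\d)/', $string, $digitOnly);

// For comparison: letters are word characters, so "a1" has no boundary
// between "a" and "1".
preg_match_all('/\b\d\b/', $string, $boundary);

print_r($digitOnly[0]); // 1, 7
print_r($boundary[0]);  // 7
```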

Cary Swoveland
  • Just after posting I noticed my regex was suggested by @js2010 in a comment below. I've retained my answer to provide a rationale for using that regex. – Cary Swoveland Oct 07 '21 at 22:36