1

I currently use this regex:

(\d+)

The problem is that I can get either of two strings:

"2112343 and alot of 4.99"

OR

"4.99 and alot of 2112343 "

I get this from both:

[2112343, 4, 99]

I need to get only the 2112343. How can I achieve this?

Danpe

7 Answers

7

Using lookaround, you can restrict your capturing to only digits which are not surrounded by other digits or decimal points:

(?<![0-9.])(\d+)(?![0-9.])

Alternatively, if you want to only match stand-alone numbers (e.g. if you don't want to match the 123 in abc123def):

(?<!\S)\d+(?!\S)
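
As a quick check, here is a minimal Python sketch applying both patterns to the two sample strings from the question. Python's `re` module supports these lookarounds; the variable names are illustrative and not part of the question.

```python
import re

# Sample inputs from the question.
samples = ["2112343 and alot of 4.99", "4.99 and alot of 2112343 "]

# Digits not adjacent to another digit or a decimal point.
not_decimal = re.compile(r"(?<![0-9.])(\d+)(?![0-9.])")
# Digits delimited by whitespace or the string boundaries.
standalone = re.compile(r"(?<!\S)\d+(?!\S)")

results_not_decimal = [not_decimal.findall(s) for s in samples]
results_standalone = [standalone.findall(s) for s in samples]

print(results_not_decimal)  # [['2112343'], ['2112343']]
print(results_standalone)   # [['2112343'], ['2112343']]

# The two patterns differ on digits embedded in a word.
embedded_not_decimal = not_decimal.findall("abc123def")
embedded_standalone = standalone.findall("abc123def")
print(embedded_not_decimal)  # ['123']
print(embedded_standalone)   # []
```

The first pattern still accepts digits glued to letters, while the second requires whitespace or a string edge on both sides.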
Amber
  • 1
    Would not be matched, intentionally. – Amber Jun 30 '12 at 19:35
  • @pst `1` and `4` are in `[0-9.]` and thus disallowed from being to the left and right of the matched group due to the lookarounds. – Amber Jun 30 '12 at 19:38
  • It matches `hello 1234 world` -> `1234` because whitespace characters are not in `[0-9.]` and thus satisfy the lookarounds. Do you know how lookarounds work? http://www.regular-expressions.info/lookaround.html – Amber Jun 30 '12 at 19:42
  • 1
    RegexPal does not use C# regex; it uses JavaScript regex. The two are not the same. Specifically, JavaScript regex doesn't support negative lookbehind. – Amber Jun 30 '12 at 19:46
  • What if I want a regex that gets 123 from "p123" but does not get 1 and 23 from "1.23"? – Danpe Jul 03 '12 at 17:20
  • To preserve bond-007 and 100-monkeys add dashes next to those dots, and so on. I get it now, the left side says only match if any of these precede the thing of interest \d+, and the right side applies only to what comes after – gseattle Mar 15 '21 at 04:05
1

Try this:

(?<!\S)\d+(?!\S)

This will only match integers that stand alone, i.e. that are delimited by whitespace or by the start/end of the string.

Mayank
1

If I understand you right, you want to match the numbers that contain a decimal point too, but don't want them in the resulting collection.

I would approach this in two steps. First, select all numbers, including those with a dot:

(\d+(?:\.\d+)*)

Then filter out everything that is not purely digits by applying your first regex, anchored so that it must match the whole item, to each item of the resulting collection from the first step:

^(\d+)$
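
A minimal Python sketch of this two-step idea follows. The second step uses `re.fullmatch` so that an item such as 4.99 is rejected as a whole rather than yielding 4. That choice and the function name `extract_integers` are assumptions of this sketch, not part of the answer.

```python
import re

def extract_integers(text):
    """Two-step extraction: grab every number, then keep the pure-digit ones."""
    # Step 1: select all numbers, including dotted ones such as 4.99.
    candidates = re.findall(r"\d+(?:\.\d+)*", text)
    # Step 2: keep only candidates made up entirely of digits.
    return [c for c in candidates if re.fullmatch(r"\d+", c)]

print(extract_integers("2112343 and alot of 4.99"))   # ['2112343']
print(extract_integers("4.99 and alot of 2112343 "))  # ['2112343']
```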
Philip Daubmeier
  • I agree with this approach; no point trying to come up with an overly complicated regular expression... However I would use `[\d.]+` as the initial selector. –  Jun 30 '12 at 19:34
  • @pst: the question is if we want to treat "123." as wanted or not. Your selector would match it, and throw it away in the second step. My selector would match "123" and finally keep it. The op should decide here what fits the problem best... – Philip Daubmeier Jun 30 '12 at 19:36
1

As I posted in my comment:

(?:^| )(\d+)(?:$| )

It will match all "words" that are entirely composed of digits (a word being a string of non-space characters surrounded by space characters and/or the beginning/end of the string).
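
For illustration, a short Python sketch of this pattern on the question's strings. It also shows one caveat that appears to follow from the pattern consuming the delimiting space: numbers separated by a single space are not all found. The variable names are illustrative.

```python
import re

# Pattern from the answer above: digits delimited by a space or a string edge.
word_digits = re.compile(r"(?:^| )(\d+)(?:$| )")

first = word_digits.findall("2112343 and alot of 4.99")
second = word_digits.findall("4.99 and alot of 2112343 ")
print(first)   # ['2112343']
print(second)  # ['2112343']

# Caveat: each match consumes its trailing space, so adjacent numbers that
# share a single space are not all found.
adjacent = word_digits.findall("1 2 3")
print(adjacent)  # ['1', '3']
```

If adjacent numbers matter, the lookaround form `(?<!\S)\d+(?!\S)` avoids this because lookarounds do not consume the delimiter.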

Joel Cornett
0

Try this

(?<![0-9.])\d+(?![0-9.])

It uses the pattern

(?<!prefix)position(?!suffix)

where (?<!prefix)position means: Match position not following prefix.

and position(?!suffix) means: Match position not preceding suffix.

finally [0-9.] means: Any digit or the decimal point.

Olivier Jacot-Descombes
0

The catch is always in what "standalone" means. Here are several solutions depending on that meaning.

  1. Match digit strings not enclosed with other digits: (?<!\d)\d+(?!\d) (note this is equal to \d+, but (?<!\d)\d{4}(?!\d) will start making sense when you need to match only four-digit strings). See the regex demo.

  2. Match digit strings only enclosed with whitespace or at the start/end of the string: (?<!\S)\d+(?!\S). See the regex demo.

  3. Match digit strings as whole words: \b\d+\b (note that word boundaries match in a lot of contexts, and will also match parts of decimal numbers). See the regex demo.

  4. Match whole integers, not parts of decimal numbers (assuming that a dot is used as a decimal separator): (?<!\d\.)(?<!\d)\d+(?!\.?\d). See the regex demo.

  5. Match digit-only strings: ^\d+$. See the regex demo.

There can be more variations of these patterns, just make sure you match the right "standalone" meaning.
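
To compare these meanings side by side, here is a small Python sketch that runs the five patterns above against one of the question's strings. The dictionary keys are informal labels chosen for this sketch.

```python
import re

# The five patterns listed above, keyed by a short description.
patterns = {
    "not adjacent to digits": r"(?<!\d)\d+(?!\d)",
    "whitespace-delimited": r"(?<!\S)\d+(?!\S)",
    "whole words": r"\b\d+\b",
    "not part of a decimal": r"(?<!\d\.)(?<!\d)\d+(?!\.?\d)",
    "entire string": r"^\d+$",
}

sample = "2112343 and alot of 4.99"
results = {name: re.findall(rx, sample) for name, rx in patterns.items()}

for name, found in results.items():
    print(f"{name}: {found}")

# not adjacent to digits: ['2112343', '4', '99']
# whitespace-delimited: ['2112343']
# whole words: ['2112343', '4', '99']
# not part of a decimal: ['2112343']
# entire string: []
```

Only the second and fourth patterns return just 2112343 for this input, which matches what the question asks for.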

Wiktor Stribiżew
-1
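>>> import re
>>> # re.match anchors at the start of the string, so this only works when the integer comes first
>>> r = re.match(r"\d+", "23423 in 3.4")
>>> r.group(0)
'23423'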