I currently use this regex:
(\d+)
The problem is that I can get either of two strings:
"2112343 and alot of 4.99"
OR
"4.99 and alot of 2112343 "
I get this from both:
[2112343, 4, 99]
I need to get only the 2112343. How can I achieve this?
Using lookaround, you can restrict your capturing to only digits which are not surrounded by other digits or decimal points:
(?<![0-9.])(\d+)(?![0-9.])
Alternatively, if you want to match only stand-alone numbers (e.g. if you don't want to match the 123 in abc123def):
(?<!\S)\d+(?!\S)
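To see how the two patterns behave on the strings from the question, here is a minimal Python sketch. The names STANDALONE_INT, WHITESPACE_DELIMITED and the extra "abc123def 456" sample are mine, not from the question:

```python
import re

# First pattern from the answer: digits not adjacent to other digits or a dot.
STANDALONE_INT = re.compile(r"(?<![0-9.])(\d+)(?![0-9.])")
# Second pattern: digits delimited by whitespace or the string edges.
WHITESPACE_DELIMITED = re.compile(r"(?<!\S)\d+(?!\S)")

samples = ["2112343 and alot of 4.99", "4.99 and alot of 2112343 "]
results = [STANDALONE_INT.findall(s) for s in samples]
print(results)  # [['2112343'], ['2112343']]

# Hypothetical extra sample showing where the two patterns differ.
embedded = "abc123def 456"
print(STANDALONE_INT.findall(embedded))        # ['123', '456']
print(WHITESPACE_DELIMITED.findall(embedded))  # ['456']
```

One thing to watch for: because the first pattern rejects digits followed by a dot, a number at the end of a sentence ("costs 42.") would not match either.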
If I understand you right, you want to match the numbers with a point inside too, but don't want to have them in the resulting collection.
I would approach this in two steps. First, select all numbers, including those with a dot:
(\d+(?:\.\d+)*)
Then filter out everything that is not purely digits by applying your first regex to each item of the collection from the first step:
(\d+)
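A rough Python sketch of the two steps could look like this. The function name extract_integers is made up, and I am assuming the second regex has to match the whole item (hence fullmatch), since an unanchored \d+ would still find the 4 and the 99 inside "4.99":

```python
import re

# Step 1: grab every number, including ones with decimal parts.
ANY_NUMBER = re.compile(r"\d+(?:\.\d+)*")
# Step 2: keep only candidates made up entirely of digits.
# Assumption: the whole candidate must match, so fullmatch() is used.
DIGITS_ONLY = re.compile(r"\d+")

def extract_integers(text):
    """Return the digit-only numbers in text, skipping decimals."""
    candidates = ANY_NUMBER.findall(text)  # e.g. ['2112343', '4.99']
    return [c for c in candidates if DIGITS_ONLY.fullmatch(c)]

print(extract_integers("2112343 and alot of 4.99"))   # ['2112343']
print(extract_integers("4.99 and alot of 2112343 "))  # ['2112343']
```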
As I posted in my comment:
(?:^| )(\d+)(?:$| )
It will match all "words" that are composed entirely of digits (a word being a string of non-space characters surrounded by space characters and/or the beginning/end of the string).
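Here is a small Python check of that pattern. The name SPACE_DELIMITED and the "1 2 3" sample are my own additions:

```python
import re

# Pattern from the answer: digits delimited by a space or the string edges.
SPACE_DELIMITED = re.compile(r"(?:^| )(\d+)(?:$| )")

first = SPACE_DELIMITED.findall("2112343 and alot of 4.99")
second = SPACE_DELIMITED.findall("4.99 and alot of 2112343 ")
print(first, second)  # ['2112343'] ['2112343']

# The delimiting space is consumed by the match, so two neighbouring
# numbers cannot share it and the middle one is skipped.
adjacent = SPACE_DELIMITED.findall("1 2 3")
print(adjacent)  # ['1', '3']

# A lookaround version does not consume the delimiter.
print(re.findall(r"(?<!\S)\d+(?!\S)", "1 2 3"))  # ['1', '2', '3']
```

So this works for the two strings in the question, but if several numbers can appear next to each other, the lookaround form is probably the safer choice.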
Try this
(?<![0-9.])\d+(?![0-9.])
It uses the pattern
(?<!prefix)position(?!suffix)
where (?<!prefix)position means: match position not following prefix,
and position(?!suffix) means: match position not preceding suffix.
Finally, [0-9.] means: any digit or the decimal point.
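To illustrate why both halves are needed, this Python sketch applies each assertion on its own and then both together (the variable names are just for illustration):

```python
import re

text = "4.99 and alot of 2112343"

# Lookbehind only: rejects the "99" (it follows a dot) but keeps the "4".
only_lookbehind = re.findall(r"(?<![0-9.])\d+", text)
# Lookahead only: rejects the "4" (a dot follows it) but keeps the "99".
only_lookahead = re.findall(r"\d+(?![0-9.])", text)
# Both together: only the stand-alone integer is left.
both = re.findall(r"(?<![0-9.])\d+(?![0-9.])", text)

print(only_lookbehind)  # ['4', '2112343']
print(only_lookahead)   # ['99', '2112343']
print(both)             # ['2112343']
```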
The catch is always in what "standalone" means. Here are several solutions depending on that meaning.
Match digit strings not enclosed with other digits: (?<!\d)\d+(?!\d) (note this is equivalent to \d+, but (?<!\d)\d{4}(?!\d) starts making sense when you need to match only four-digit strings).
Match digit strings only enclosed with whitespace or at the start/end of the string: (?<!\S)\d+(?!\S)
Match digit strings as whole words: \b\d+\b (note that word boundaries match in a lot of contexts, and will also match parts of decimal numbers).
Match whole integers, not parts of decimal numbers (assuming that a dot is used as a decimal separator): (?<!\d\.)(?<!\d)\d+(?!\.?\d)
Match digit-only strings: ^\d+$
There can be more variations of these patterns; just make sure you match the right "standalone" meaning.
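A quick way to compare these meanings is to run all the patterns over the same input. This is only a sketch in Python; the labels in PATTERNS and the sample text are mine:

```python
import re

# The labels are made up for this comparison; the patterns are from the list.
PATTERNS = {
    "not next to digits": r"(?<!\d)\d+(?!\d)",
    "whitespace-delimited": r"(?<!\S)\d+(?!\S)",
    "word boundaries": r"\b\d+\b",
    "not part of a decimal": r"(?<!\d\.)(?<!\d)\d+(?!\.?\d)",
}

text = "4.99 and alot of 2112343 abc123def"

matches = {name: re.findall(p, text) for name, p in PATTERNS.items()}
for name, found in matches.items():
    print(f"{name}: {found}")
# not next to digits: ['4', '99', '2112343', '123']
# whitespace-delimited: ['2112343']
# word boundaries: ['4', '99', '2112343']
# not part of a decimal: ['2112343', '123']

# ^\d+$ only succeeds when the entire string is digits.
print(bool(re.search(r"^\d+$", "2112343")))  # True
print(bool(re.search(r"^\d+$", text)))       # False
```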
>>> import re
>>> # Note: re.match only looks for a match at the start of the string.
>>> r = re.match(r"\d+", "23423 in 3.4")
>>> r.group(0)
'23423'