32

How do I extract a double value from a string using a regex?

import re

pattr = re.compile(???)    
x = pattr.match("4.5")      
kirelagin
Berlin Brown

5 Answers

62

A regexp from the perldoc perlretut:

import re
re_float = re.compile(r"""(?x)
   ^
      [+-]?\ *      # first, match an optional sign *and space*
      (             # then match integers or f.p. mantissas:
          \d+       # start out with a ...
          (
              \.\d* # mantissa of the form a.b or a.
          )?        # ? takes care of integers of the form a
         |\.\d+     # mantissa of the form .b
      )
      ([eE][+-]?\d+)?  # finally, optionally match an exponent
   $""")
m = re_float.match("4.5")
print(m.group(0))
# -> 4.5

To extract numbers from a bigger string:

s = """4.5 abc -4.5 abc - 4.5 abc + .1e10 abc . abc 1.01e-2 abc 
       1.01e-.2 abc 123 abc .123"""
print(re.findall(r"[+-]? *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?", s))
# -> ['4.5', '-4.5', '- 4.5', '+ .1e10', ' 1.01e-2',
#     '       1.01', '-.2', ' 123', ' .123']
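The matches are still strings, and some of them (such as '- 4.5') contain spaces between the sign and the digits, which float() rejects. A minimal sketch of converting them, assuming it is acceptable to drop those spaces; extract_floats is a hypothetical helper name, not part of the original answer:

```python
import re

# Same pattern as in the findall example above.
FLOAT_RE = re.compile(r"[+-]? *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?")

def extract_floats(text):
    """Return every number found in text as a float (hypothetical helper)."""
    # Remove the spaces the pattern allows around the sign before converting.
    return [float(match.replace(" ", "")) for match in FLOAT_RE.findall(text)]

print(extract_floats("4.5 abc - 4.5 abc + .1e10 abc 123"))
# -> [4.5, -4.5, 1000000000.0, 123.0]
```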
jfs
23

Here's the easy way. Don't use regexes for built-in types.

try:
    x = float(someString)
except ValueError:
    # someString was NOT floating-point, what now?
    x = None  # placeholder: choose whatever fallback suits your program
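One way to package this is a small wrapper that returns a fallback value instead of raising. This is only a sketch; the name parse_float and the default parameter are assumptions for illustration:

```python
def parse_float(text, default=None):
    """Return float(text), or default if text is not a valid float."""
    try:
        return float(text)
    except (TypeError, ValueError):
        # TypeError covers non-string input such as None.
        return default

print(parse_float("4.5"))   # -> 4.5
print(parse_float("0..1"))  # -> None
```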
S.Lott
  • Actually, this is also the safest way. Malformed input such as `0..1` or `0.0.02` is very difficult for a regex to reject. Worse, the regex may silently accept part of it and produce a wrong answer. – dspjm Nov 15 '16 at 06:07
  • Technically correct, but the question explicitly specifies regexp. – villasv Jan 07 '17 at 15:48
20

To parse int and float values (with a point as the decimal separator):

re.findall( r'\d+\.*\d*', 'some 12 12.3 0 any text 0.8' )

result:

['12', '12.3', '0', '0.8']
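This pattern ignores signs and accepts runs of dots. The sketch below compares it with a stricter variant; STRICT is an assumed alternative for illustration, not part of the original answer:

```python
import re

LOOSE = r'\d+\.*\d*'            # pattern from the answer above
STRICT = r'[+-]?\d+(?:\.\d+)?'  # assumed variant: optional sign, at most one dot

text = 'a -1.5 b 1..2 c 7'
loose_matches = re.findall(LOOSE, text)
strict_matches = re.findall(STRICT, text)

print(loose_matches)   # -> ['1.5', '1..2', '7']
print(strict_matches)  # -> ['-1.5', '1', '2', '7']
```

Neither pattern rejects the malformed `1..2` outright; the stricter one just splits it into two integers, so validate with float() when correctness matters.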
iqmaker
1

A float as a regular expression, by brute force. There are small differences from J.F. Sebastian's version:

import re
if __name__ == '__main__':
  x = str(1.000e-123)
  reFloat = r'(^[+-]?\d+(?:\.\d+)?(?:[eE][+-]\d+)?$)'
  print(re.match(reFloat, x))

>>> <re.Match object; span=(0, 6), match='1e-123'>
nuggetier
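To see where the brute-force pattern and the perlretut-style pattern differ, a short comparison can help. This is a sketch; the names BRUTE, PERL and compare are hypothetical, and PERL is the anchored single-line form of the pattern from the earlier answer:

```python
import re

BRUTE = re.compile(r'^[+-]?\d+(?:\.\d+)?(?:[eE][+-]\d+)?$')
PERL = re.compile(r'^[+-]? *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?$')

def compare(s):
    """Return (brute_matches, perl_matches) for the string s."""
    return bool(BRUTE.match(s)), bool(PERL.match(s))

for s in ['1e-123', '1.', '.5', '1e5']:
    print(s, compare(s))
# 1e-123 (True, True)
# 1. (False, True)
# .5 (False, True)
# 1e5 (False, True)
```

The brute-force pattern requires digits on both sides of the decimal point and an explicit sign in the exponent, so it rejects '1.', '.5' and '1e5'.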
0

Just to note that none of these answers cover the interesting edge cases such as "inf", "NaN", "-iNf", "-NaN", "1e-1_2_3_4_5_6", etc.

(Inspired by Eric's answer to "Checking if a string can be converted to float in Python".)
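A quick way to see how float() itself treats these strings is sketched below. It assumes Python 3.6 or later, where underscores between digits are accepted; the results dictionary is just an illustrative name:

```python
import math

edge_cases = ["inf", "NaN", "-iNf", "-NaN", "1e-1_2_3_4_5_6"]

# float() accepts all of these; none of the regexes above would match them all.
results = {s: float(s) for s in edge_cases}

for s, value in results.items():
    print(repr(s), "->", value)
```

The last string parses as 1e-123456, which underflows to 0.0 rather than raising an error.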

Amnon Harel