How to write a regular expression to get floating point numbers in Python?

Question

How do I write a regular expression to get a floating point number in Python? I want to get 55.97 from <td nowrap="nowrap">55.97</td>. So I wrote:

newsecond_row_data = (re.search('(?<=>)\d+|\d+.\d+',second_row_data[a]))
newsecond_row_data.group(0)

print newsecond_row_data.group(0)

but it gave 55, not 55.97. Please help me.

Thank you

Note that you should not parse HTML with regexes and string functions. See http://stackoverflow.com/a/1732454/113586 — wRAR, Feb 09 '12 at 08:49
Are all of these legal for you: `1.`, `.1`, `1`, `-1.1`, `1e-1`? — Tim Pietzcker, Feb 09 '12 at 08:52

score 7 · Answer 1 · answered Feb 09 '12 at 08:53

If you want to extract data from HTML or XML, take a look at the various parsers available. For this particular case, you can extract the number very easily:

>>> from xml.etree import ElementTree
>>> element = ElementTree.fromstring('<td nowrap="nowrap">55.97</td>')
>>> element.text
'55.97'
>>>
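
ElementTree only accepts well-formed XML, and real HTML pages often are not. As a rough sketch of the "various parsers" idea, here is the same extraction with the standard library's html.parser under Python 3. The class name CellText and the sample row are made up for illustration, not taken from the question.

```python
from html.parser import HTMLParser


class CellText(HTMLParser):
    """Collect the text found inside <td> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_td = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; start a new, empty cell.
        if tag == 'td':
            self.in_td = True
            self.cells.append('')

    def handle_endtag(self, tag):
        if tag == 'td':
            self.in_td = False

    def handle_data(self, data):
        # Only keep text that appears inside a <td>.
        if self.in_td:
            self.cells[-1] += data


parser = CellText()
# Sample row invented for this example.
parser.feed('<tr><td nowrap="nowrap">55.97</td><td>12</td></tr>')
parser.close()

values = [float(text) for text in parser.cells]
print(values)
```

float() does the number parsing here, so no regular expression is needed once the cell text is isolated.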

score 0 · Answer 2 · answered Feb 09 '12 at 08:47

newsecond_row_data = re.search(r'\d+\.?\d*', second_row_data[a])
print(newsecond_row_data.group(0))
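
This pattern covers forms like 55.97, 1 and 1., but not a leading sign, a leading dot, or an exponent, which are the cases raised in the comment on the question. If those matter, something along these lines might do. It is only a sketch: the name FLOAT_RE and the exact pattern are my own assumptions, and fullmatch needs Python 3.4 or later.

```python
import re

# Hypothetical broader pattern: optional sign, then either digits with an
# optional fraction or a fraction with a leading dot, then an optional exponent.
FLOAT_RE = re.compile(r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?')

# The forms listed in the comment on the question.
samples = ['1.', '.1', '1', '-1.1', '1e-1']
for text in samples:
    # fullmatch succeeds only if the whole string is a number.
    print(text, bool(FLOAT_RE.fullmatch(text)))

# search finds the number inside the table cell from the question.
cell = '<td nowrap="nowrap">55.97</td>'
print(FLOAT_RE.search(cell).group(0))
```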

Niclas Nilsson

doug · Answer 3 · 2012-02-09T09:28:31.510

import re

ptn = r'[-+]?([0-9]*\.?[0-9]+)'
pat_obj = re.compile(ptn)

m = pat_obj.search(some_str)
if m:
    print(m.group(0))

If you have more than one floating point number per string, use findall instead of search:

>>> s = '3dfrtg45.2trghyui8erdftgy77.431dser'
>>> pat_obj = re.compile(ptn)
>>> v = pat_obj.findall(s)
>>> v
['3', '45.2', '8', '77.431']
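
One thing to watch for: when a pattern contains a single capturing group, findall returns the group's text rather than the whole match, so the optional sign in front of the group is dropped. A small sketch of the difference, using a non-capturing group (?:...) as the alternative. The sample string and the variable names are invented for illustration.

```python
import re

s = 'a-1.5b+2c.75'

# Same pattern as above: the sign sits outside the capturing group.
capturing = re.compile(r'[-+]?([0-9]*\.?[0-9]+)')
# Non-capturing variant: findall now returns the whole match.
non_capturing = re.compile(r'[-+]?(?:[0-9]*\.?[0-9]+)')

print(capturing.findall(s))      # signs are lost
print(non_capturing.findall(s))  # signs are kept
```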

edited Feb 09 '12 at 09:28

answered Feb 09 '12 at 08:48

score 0 · Accepted Answer · answered Feb 09 '12 at 08:55

# Float alternative first; the dot is escaped so it matches only a literal '.'
newsecond_row_data = re.search(r'(?<=>)\d+\.\d+|\d+', second_row_data[a])
newsecond_row_data.group(0)

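Your pattern doesn't work because the alternatives are tried from left to right: `(?<=>)\d+` matches '55' first, so the search returns that match and never tries the floating point alternative. Putting the longer alternative first fixes it.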
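
To see the effect of the ordering, here is a quick comparison run on the cell from the question. The variable names are just for illustration.

```python
import re

cell = '<td nowrap="nowrap">55.97</td>'

# Original order: the integer alternative comes first and wins at '55'.
original = re.search(r'(?<=>)\d+|\d+.\d+', cell)
print(original.group(0))  # 55

# Float alternative first, with the dot escaped so it matches only '.'.
fixed = re.search(r'(?<=>)\d+\.\d+|\d+', cell)
print(fixed.group(0))  # 55.97
```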

Then again, I would advise against using regex here; use an XML processing library to extract text from HTML tags (see Sudhir's answer).

rubayeet

Would you please be so kind to put in bold your advice for the *"I want to parse xml with regexp"* to come? – Rik Poggi Feb 09 '12 at 10:25
