
I am trying to extract data from a simple HTML page that shows a temperature reading from an Arduino. I have managed to get as far as the string containing the temperature reading, which is:

'Temperature in Celsius: \r\n 23.20\r\n*C'

But I cannot work out how to extract the temperature as a float from the string. Any suggestions? Please bear in mind that the temperature value changes, since the Arduino takes live readings.

from lxml import html
import requests

# Fetch the page served by the Arduino
page = requests.get('http://192.168.1.103:180')
tree = html.fromstring(page.content)
# xpath() returns a list of matching text nodes
extract = tree.xpath('/html/body/h3[1]/text()')
print(extract)
Moinuddin Quadri

1 Answer


One option would be to apply a regular expression:

In [1]: import re

In [2]: s = 'Temperature in Celsius: \r\n 23.20\r\n*C'

In [3]: re.search(r"\d+\.\d+", s).group(0)
Out[3]: '23.20'

Here, \d+ matches one or more consecutive digits and \. matches a literal dot.
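Since the goal is a float rather than a string, the match still has to be converted. The following is only a sketch: the helper name parse_temperature is hypothetical, and the widened pattern (optional minus sign, optional decimal part) is an assumption about what the sensor might report, not something stated in the question.

```python
import re

def parse_temperature(text):
    """Return the first number found in text as a float, or None if there is none."""
    # -?           optional minus sign (assumed: readings may go below zero)
    # \d+          one or more digits
    # (?:\.\d+)?   optional decimal part (assumed: readings may be whole numbers)
    match = re.search(r"-?\d+(?:\.\d+)?", text)
    return float(match.group(0)) if match else None

reading = 'Temperature in Celsius: \r\n 23.20\r\n*C'
print(parse_temperature(reading))
```

Returning None when nothing matches avoids an AttributeError from calling .group() on a failed search, which can matter if the page is occasionally served without a reading.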

Alternatively, you can split on : and strip the unneeded characters:

In [4]: s.split(":")[-1].strip(" \r\n*C")
Out[4]: '23.20'
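One detail worth keeping in mind is that the argument to strip() is treated as a set of characters to remove from both ends, not as a literal substring. A short sketch of the same approach carried through to a float (the variable names value_text and temperature are only illustrative):

```python
s = 'Temperature in Celsius: \r\n 23.20\r\n*C'

# Take everything after the last colon, then remove any leading/trailing
# characters that belong to the set {' ', '\r', '\n', '*', 'C'}.
value_text = s.split(":")[-1].strip(" \r\n*C")

# Convert the remaining text to a number.
temperature = float(value_text)
print(temperature)
```

This relies on the reading always following the last colon, so it is probably more fragile than the regular expression if the page layout changes.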

Note that the xpath() method in lxml returns a list, so don't forget to take the desired string from it (for example, its first element) before applying either approach.
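As a rough sketch of that last step, the list can be checked and indexed before the string is parsed. The function name temperature_from_nodes is hypothetical, and the extract list below is a hard-coded stand-in for the result of tree.xpath(...) so that the example runs without the Arduino or lxml.

```python
import re

def temperature_from_nodes(nodes):
    """Take a list like the one xpath() returns and give back the reading as a float.

    Returns None if the list is empty or the first item contains no number.
    """
    if not nodes:
        return None
    match = re.search(r"\d+\.\d+", nodes[0])
    return float(match.group(0)) if match else None

# Stand-in for: extract = tree.xpath('/html/body/h3[1]/text()')
extract = ['Temperature in Celsius: \r\n 23.20\r\n*C']
print(temperature_from_nodes(extract))
```

Because the Arduino reports live readings, calling this on each freshly fetched page should give the current value every time.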

alecxe