I have scoured these pages for days without success, so I hope this is not a duplicate; if it is, I apologize. I have a device on a local network that provides a data readout in HTML that is updated live. So far my attempts at parsing this data with BeautifulSoup and urllib2 have been unsuccessful. Any help would be appreciated.

This is the source code, with the data of interest circled:

Image1

This is the resulting output:

Image2

from bs4 import BeautifulSoup
import urllib2  # Python 2 only

# '#home-view' is a client-side anchor; the view it names is presumably built by JavaScript.
url = 'http://192.168.1.2/index.html#home-view'

# Download the raw HTML as the device serves it (no JavaScript is executed).
usock = urllib2.urlopen(url)
data = usock.read()
usock.close()

# Look for the <p class="gas-conc"> elements that should hold the readings.
soup = BeautifulSoup(data, "html.parser")
result = soup.find_all('p', {'class': 'gas-conc'})
print(result)

SOLVED: Thank you for the assistance. With Selenium I was able to scrape out this data, although painfully: I had to run BeautifulSoup's `prettify()` function on the source code and manually count out which characters to slice out.
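
Once the rendered source is available as a string (for example from Selenium's `driver.page_source`), the values could probably be picked out by tag and class rather than by character position. The following is only a sketch under assumptions: the real markup was posted as an image, so the sample HTML, the `GasConcParser` class and the `extract_gas_conc` helper are hypothetical, and only the `gas-conc` class name comes from the code above. It uses the standard library's `html.parser` (Python 3) to stay dependency-free; BeautifulSoup's `find_all` on the same string should work equally well.

```python
from html.parser import HTMLParser


class GasConcParser(HTMLParser):
    """Collect the text inside every <p> whose class list contains 'gas-conc'."""

    def __init__(self):
        super().__init__()
        self.values = []       # one string per matching <p>
        self._inside = False   # True while inside a matching <p>

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "p" and "gas-conc" in classes:
            self._inside = True
            self.values.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._inside = False

    def handle_data(self, data):
        # Text of nested tags (e.g. a <span> with the unit) is included too.
        if self._inside:
            self.values[-1] += data


def extract_gas_conc(rendered_html):
    """Return the stripped text of each <p class="gas-conc"> in the HTML."""
    parser = GasConcParser()
    parser.feed(rendered_html)
    parser.close()
    return [value.strip() for value in parser.values]


# Hypothetical markup, standing in for the rendered page source.
sample = (
    '<div><p class="gas-conc big">12.5 <span>ppm</span></p>'
    '<p class="other">x</p><p class="gas-conc"> 7 </p></div>'
)
print(extract_gas_conc(sample))
```

Matching on the class name means the extraction keeps working when the surrounding markup shifts by a few characters, which fixed slice offsets would not.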

R.Arch
  • Could you please post the source code as text instead of an image? It's better for people trying to help. – Vinícius Figueiredo Jul 28 '17 at 21:13
  • By the way, if the information changes "live", it is probably being generated by JavaScript, and neither `urllib2` nor `requests` runs JS. Have a look at [`selenium`](http://selenium-python.readthedocs.io/); I [answered a question](https://stackoverflow.com/questions/45305595/how-to-extract-the-span-tag-contents-using-the-beautiful-soup/45315393#45315393) a few days ago that could help if you go this way. – Vinícius Figueiredo Jul 28 '17 at 21:18
  • Please [edit] your question and include source code and output in text form. – phd Jul 28 '17 at 21:37
  • Welcome to the site: you may want to read [help/on-topic], [ask] and [mcve], and re-word your question accordingly. – boardrider Jul 30 '17 at 21:52

1 Answer

I'm 90% sure that you won't get this data unless you manage to render the JavaScript somehow.

Check out this post for more info on how to make that happen.

In a nutshell, you can use:

  1. selenium
  2. PyQt5
  3. dryscrape
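
As an illustration of the first option, here is a minimal sketch of how Selenium might be used to read the values after the page's JavaScript has run. It is not a tested solution for this device: the helper names `read_gas_conc` and `main` are hypothetical, the `p.gas-conc` selector is taken from the question's code, and Firefox with geckodriver is assumed to be installed. The helper polls because a live-updating page may fill in the values some time after the initial load.

```python
import time

CSS_SELECTOR = "css selector"  # same string value as selenium's By.CSS_SELECTOR


def read_gas_conc(driver, url, timeout=10.0, poll=0.5):
    """Open url in the browser and return the text of every p.gas-conc element.

    Polls until at least one element has non-empty text, or raises
    RuntimeError once `timeout` seconds have passed.
    """
    driver.get(url)
    deadline = time.monotonic() + timeout
    while True:
        elements = driver.find_elements(CSS_SELECTOR, "p.gas-conc")
        texts = [element.text.strip() for element in elements]
        if any(texts):
            return texts
        if time.monotonic() >= deadline:
            raise RuntimeError("no populated p.gas-conc element after %s s" % timeout)
        time.sleep(poll)


def main():
    # Imported here so the helper above can be exercised without selenium installed.
    from selenium import webdriver

    driver = webdriver.Firefox()  # assumption: Firefox and geckodriver are available
    try:
        print(read_gas_conc(driver, "http://192.168.1.2/index.html#home-view"))
    finally:
        driver.quit()  # always close the browser, even on errors
```

Calling `main()` would open a browser window, load the device page and print whatever text the matching elements contain. Reading `element.text` from the live DOM avoids parsing the page source by hand.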
0xMH