
I'm new to Stack Overflow. I was using Beautiful Soup to extract data from an article on techcrunch.com for some independent research. I could extract most of the data with ease, but I ran into trouble with the small bubbles above the social-networking icons, which show how many times the article has been shared on each network.

Regardless of the actual number of shares on any social network, the value returned to me is 0.

# Python 2 with BeautifulSoup 3
from BeautifulSoup import BeautifulSoup
import urllib2

url = "http://techcrunch.com/2015/10/11/the-other-ag-sector-problem-that-big-data-can-solve/"
page = urllib2.urlopen(url)
soup = BeautifulSoup(page.read())

# Look up the Facebook share-count bubble by its class attribute
data = soup.find('div', {'class': 'bubble total-facebook'})
print data.text

Result in cmd: 0 (but the article currently has 171 shares on Facebook). Please help!

ap17

1 Answer


That's because the number is loaded dynamically by JavaScript. If you view the page source in a browser, you will see that the div with class "bubble total-facebook" really does contain the text "0", which is also what BeautifulSoup sees.
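To illustrate, here is a minimal sketch in Python 3 using only the standard library. The `STATIC_HTML` string is a made-up stand-in for what the server sends (not the real TechCrunch markup), and `BubbleText` is a hypothetical helper name. Any parser that does not execute scripts can only return the placeholder that is already in the markup:

```python
from html.parser import HTMLParser

# Hypothetical markup: the placeholder "0" is in the HTML itself, and a
# script (never executed by a parser) would overwrite it in a real browser.
STATIC_HTML = """
<div class="social-share">
  <div class="bubble total-facebook">0</div>
  <script>/* would replace the 0 with the real share count */</script>
</div>
"""


class BubbleText(HTMLParser):
    """Collect the text inside the first div whose classes include
    both 'bubble' and 'total-facebook'."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # > 0 while inside the target div
        self.done = False   # True once the target div has closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div" or self.done:
            return
        if self.depth:
            self.depth += 1  # nested div inside the target
        else:
            classes = (dict(attrs).get("class") or "").split()
            if {"bubble", "total-facebook"} <= set(classes):
                self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


parser = BubbleText()
parser.feed(STATIC_HTML)
static_count = "".join(parser.chunks).strip()
print(static_count)  # the placeholder from the markup, not the live count
```

The script element is treated as opaque text and never run, so the value in the div stays whatever the server put there.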

A headless browser with JavaScript support may help. See this question:

Headless Browser for Python (Javascript support REQUIRED!)
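As one possible way to do this, here is a rough Python 3 sketch using Selenium with headless Chrome. Selenium is my own choice here, not something the linked question prescribes, and it assumes the `selenium` package and a Chrome driver are installed. The function names, the CSS selector (derived from the class names in the question), and the 10-second timeout are all assumptions; the page may also have changed since this was asked.

```python
def counter_ready(text):
    """True once the bubble shows something other than the empty/"0" placeholder."""
    return text.strip() not in ("", "0")


def rendered_share_count(url, selector="div.bubble.total-facebook", timeout=10):
    """Load the page in headless Chrome and return the bubble text after
    scripts have had a chance to run. Selector and timeout are assumptions."""
    # Imported here so counter_ready() stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        try:
            # Poll until the placeholder has been replaced by a real count.
            WebDriverWait(driver, timeout).until(
                lambda d: counter_ready(
                    d.find_element(By.CSS_SELECTOR, selector).text
                )
            )
        except TimeoutException:
            # The count may genuinely be 0, or the widget never loaded.
            pass
        # Raises NoSuchElementException if the element is not on the page.
        return driver.find_element(By.CSS_SELECTOR, selector).text.strip()
    finally:
        driver.quit()
```

You would call it as `rendered_share_count(url)` with the article URL from the question. Waiting for the text to change matters because the count is filled in some time after the initial page load, so reading the element immediately could still give 0.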

Flickerlight