Issue parsing a number on webpage via BeautifulSoup

Question

I would like to parse a number from this website dashboard. The number is located beneath "Organic Search"

Using a simple cmd-F on the soup, I eventually realized that my soup doesn't contain this number at all. It would be great to hear suggestions on why this is.

宏杰李 · Accepted Answer · 2017-02-02T04:53:42.567

1

This page is rendered by JavaScript, so the number is not in the HTML that `requests` receives.

The real data is at this URL:

import requests

r = requests.get('https://us.backend.semrush.com/?key=adb79c4ec6282f461fb0e2e67aa50949&action=report&type=url_organic&currency=usd&url=https%3A%2F%2Fwww.yelp.com%2Fbiz%2Fplayground-2-0-santa-ana-3&_=1486008342774')
data = r.json()
data['organic']['traffic']
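If you want to reuse this for other pages, a minimal sketch along these lines may help. It rebuilds the same URL from its query parameters instead of hard-coding the long string. The function names (`build_report_url`, `organic_traffic`, `fetch_organic_traffic`) are made up for illustration, the parameter names are copied from the URL above, and the trailing `_=...` parameter is left out on the assumption that the backend does not require it.

```python
from urllib.parse import urlencode

BACKEND = "https://us.backend.semrush.com/"  # host taken from the URL in this answer


def build_report_url(page_url, key):
    """Build the report URL; urlencode percent-escapes page_url for us."""
    params = {
        "key": key,
        "action": "report",
        "type": "url_organic",
        "currency": "usd",
        "url": page_url,
    }
    return BACKEND + "?" + urlencode(params)


def organic_traffic(data):
    """Return data['organic']['traffic'], or None if either key is missing."""
    return data.get("organic", {}).get("traffic")


def fetch_organic_traffic(page_url, key):
    """Fetch the report and pull out the traffic number (needs network access)."""
    import requests  # third-party; imported here so the helpers above need only the stdlib

    r = requests.get(build_report_url(page_url, key), timeout=10)
    r.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error page
    return organic_traffic(r.json())
```

Calling `fetch_organic_traffic('https://www.yelp.com/biz/playground-2-0-santa-ana-3', key)` with the key from the URL above should be equivalent to the snippet in this answer, assuming the backend still responds the same way.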

edited Feb 02 '17 at 04:53

answered Feb 02 '17 at 04:01


You can call it without `jsoncallback=jQuery21407727922755626626_1486008342773`; then you get pure JSON and can use the `json` module. – furas Feb 02 '17 at 04:45
@furas WoW, Thanks. – 宏杰李 Feb 02 '17 at 04:51
`jsoncallback` is a very popular way to send data and automatically execute the function assigned to `jsoncallback`. It is called `JSONP`; see [What is JSONP all about?](http://stackoverflow.com/questions/2067472/what-is-jsonp-all-about) – furas Feb 02 '17 at 04:55
Thanks, your answer helped me understand the problem more clearly. However, what is the proper way to parse this particular page: 'https://us.backend.semrush.com/?key=adb79c4ec6282f461fb0e2e67aa50949&action=report&type=url_organic&currency=usd&url=https%3A%2F%2Fwww.yelp.com%2Fbiz%2Fplayground-2-0-santa-ana-3&_=1486008342774'? I am considering using PhantomJS because you mentioned JS, and there seems to be some way to do it using BeautifulSoup 4 as well. I am still a beginner with DOM and web parsing in general, thanks. – wip Feb 04 '17 at 02:25
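
On the JSONP point raised in the comments: if a request does include `jsoncallback=...`, the body comes back wrapped in a function call rather than as plain JSON. A rough sketch of unwrapping it by hand is below. `parse_jsonp` is a hypothetical helper, not part of `requests` or `json`, and the sample payload is invented for illustration. Since the endpoint returns JSON directly, neither PhantomJS nor BeautifulSoup should be needed for this particular URL.

```python
import json
import re

# Matches "callbackName( ... )" with an optional trailing semicolon.
_JSONP = re.compile(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def parse_jsonp(text):
    """Strip a JSONP wrapper if one is present, then decode the JSON inside."""
    match = _JSONP.match(text)
    payload = match.group(1) if match else text  # plain JSON passes through unchanged
    return json.loads(payload)


# Made-up response body, shaped like the callback name quoted above.
wrapped = 'jQuery21407727922755626626_1486008342773({"organic": {"traffic": 12}});'
print(parse_jsonp(wrapped)["organic"]["traffic"])
```

The same function also accepts an unwrapped body, so it can be used whether or not the callback parameter was sent.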
