I am trying to understand how this website works. It has an input form where you can provide a URL, and the form returns information retrieved from another site (YouTube). So:
My first and more interesting question: does anybody have any idea how this site retrieves the entire corpus of statements?
Alternatively, I am currently using the following code:
import json
import urllib2
from BeautifulSoup import BeautifulSoup

# videoId and npage are assumed to be defined earlier (not shown here)
urlstr = 'http://www.sandracires.com/en/client/youtube/comments.php?v=' + videoId + '&page=' + str(npage)
url = urllib2.urlopen(urlstr)
content = url.read()
soup = BeautifulSoup(content)
# parse json
newDictionary = json.loads(str(soup))
# print example
print newDictionary['list'][1]['username']
However, I cannot iterate over all pages (this does not happen when I do it manually). I have placed
timer.sleep(30)
below the json call, but without success. Why is that happening?
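To make the intended iteration concrete, here is a minimal sketch of the page loop. It rests on several assumptions that are not confirmed anywhere above: that pages are numbered from 1, that every response is a JSON object with a 'list' key, and that an empty 'list' marks the last page. The names iter_comments, fetch_page, start_page, max_pages and delay are hypothetical, introduced only for illustration.

```python
import json
import time

try:
    from urllib2 import urlopen            # Python 2
except ImportError:
    from urllib.request import urlopen     # Python 3

BASE_URL = 'http://www.sandracires.com/en/client/youtube/comments.php'


def fetch_page(video_id, npage):
    """Download one page of comments and return the raw JSON text."""
    url = '%s?v=%s&page=%d' % (BASE_URL, video_id, npage)
    return urlopen(url).read().decode('utf-8')


def iter_comments(fetch, video_id, start_page=1, max_pages=50, delay=1.0):
    """Yield comment dicts page by page.

    Assumption: an empty or missing 'list' means there are no more pages.
    max_pages is a safety cap so the loop always terminates.
    """
    for npage in range(start_page, start_page + max_pages):
        data = json.loads(fetch(video_id, npage))
        comments = data.get('list') or []
        if not comments:
            break
        for comment in comments:
            yield comment
        time.sleep(delay)  # pause between requests to be polite to the server
```

It would be used roughly like this, with videoId defined as before:

for comment in iter_comments(fetch_page, videoId): print(comment['username'])

Passing the fetch function in as an argument keeps the loop independent of the network, so the stopping condition can be checked against canned responses before hitting the real site.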
Thanks!
Python 2.7.8