Crawling a page using LazyLoader with Python BeautifulSoup

Question

I am toying around with BeautifulSoup and I like it so far.

The problem is that the site I am trying to scrape uses a lazy loader, so my script only scrapes one part of the page.

Can I have a hint as to how to proceed? Must I look at how the lazyloader is implemented and parametrize anything else?

How are you currently downloading the content of the webpage? You can look at this question for answers to scraping pages with javascript: http://stackoverflow.com/questions/3362859/scraping-websites-with-javascript-enabled — Joe, Feb 15 '13 at 03:29

score 1 · Accepted Answer · answered Apr 27 '13 at 11:37

It turns out that the problem wasn't BeautifulSoup but the way the page loads its content, at least in this specific scenario.

The server returns only part of the page per request, so the request headers need to be analysed and sent to the server accordingly. This isn't a BeautifulSoup problem as such.

Therefore, it is important to take a look at how the data is loaded on a specific site. It's not always a "Load a whole page, process the whole page" paradigm. In some cases, you need to load part of the page and send a specific parameter to the server in order to keep loading the rest of the page.
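As a rough illustration of that "load a part, send a parameter, load the next part" loop, here is a minimal sketch using only the standard library for the HTTP side. The `page` query parameter, the `X-Requested-With` header, and the helper names are all assumptions made for the example. The real parameter and headers depend on the site and have to be read from the requests the browser makes (for instance in the developer tools' network tab).

```python
import urllib.parse
import urllib.request


def build_chunk_request(base_url, page):
    """Build the request for one chunk of a lazily loaded page.

    Both the "page" query parameter and the X-Requested-With header are
    assumptions; copy the real ones from what the browser actually sends.
    """
    query = urllib.parse.urlencode({"page": page})
    return urllib.request.Request(
        f"{base_url}?{query}",
        headers={"X-Requested-With": "XMLHttpRequest"},
    )


def fetch_chunk(base_url, page):
    """Download one chunk and return it as text."""
    request = build_chunk_request(base_url, page)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_all_chunks(fetch, max_pages=50):
    """Call fetch(page) for page 1, 2, ... until a chunk comes back empty.

    max_pages is a safety cap so a misbehaving server cannot keep the
    loop running forever.
    """
    chunks = []
    for page in range(1, max_pages + 1):
        html = fetch(page)
        if not html.strip():
            break
        chunks.append(html)
    return chunks


# Usage (needs network access and the bs4 package; the URL is a placeholder):
#   from functools import partial
#   from bs4 import BeautifulSoup
#   chunks = fetch_all_chunks(partial(fetch_chunk, "http://example.com/items"))
#   soup = BeautifulSoup("".join(chunks), "html.parser")
```

Once all the chunks are collected, BeautifulSoup can parse the combined HTML as usual. Some sites signal the end of the content differently (an error status, a flag in a JSON response), so the "empty chunk" stop condition may need adjusting.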
