How can I download the content of a dynamic page?

Question

I'm using Scrapy to download the content of this page:

http://www.bbb.org/atlanta/business-reviews/fence-contractors/summit-fence-in-acworth-ga-27501223/customer-reviews?cacheit=y

but when I look in

response.body

the content of the reviews isn't there. I mean the text under 'Negative experience (1 review)', which says: "Good luck using this company. Brian was surly and rude to me and my husband. After much discussion about what we wanted for..."

scrapy shell 'http://www.bbb.org/central-texas/business-reviews/concrete-stamped-and-decorative/artistic-impressions-concrete-staining-in-new-braunfels-tx-90080290/Customer-Reviews' -s USER_AGENT='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36'

content = response.body

content.find(b'Good luck using this company')

It returns: -1

How can I get that data?

Maybe helpful: http://stackoverflow.com/questions/30345623/scraping-dynamic-content-using-python-scrapy — erip, Dec 31 '15 at 18:33

alecxe · Accepted Answer · 2015-12-31T18:42:19.313

Reviews are separately loaded by requesting the /ReadReviews endpoint and providing the page and the type of experience. For instance, in the provided example, it would be:

http://www.bbb.org/central-texas/business-reviews/concrete-stamped-and-decorative/artistic-impressions-concrete-staining-in-new-braunfels-tx-90080290/ReadReviews?page=1&exp=-1
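The endpoint URL differs from the Customer-Reviews page URL only in its last path segment and query string, so it can be derived rather than hard-coded. Here is a minimal sketch using only the standard library. The helper name read_reviews_url is hypothetical, and the only parameter values seen in this example are page=1 and exp=-1; which other values the site accepts is an assumption you would need to check in the browser's network tab.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def read_reviews_url(reviews_url, page=1, exp=-1):
    """Build the ReadReviews URL from a Customer-Reviews page URL.

    Hypothetical helper: it assumes the endpoint sits next to the
    Customer-Reviews page, as in the example URL above.
    """
    parts = urlsplit(reviews_url)
    # Drop the last path segment (e.g. "Customer-Reviews").
    base_path = parts.path.rstrip("/").rsplit("/", 1)[0]
    # Any query string on the original URL (e.g. "cacheit=y") is discarded.
    query = urlencode({"page": page, "exp": exp})
    return urlunsplit(
        (parts.scheme, parts.netloc, base_path + "/ReadReviews", query, "")
    )
```

Calling read_reviews_url with the Customer-Reviews URL from the question should produce the ReadReviews URL shown above.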

What you would need to do in your spider is to yield/return a scrapy.Request to this endpoint and parse the reviews in the callback.

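For example, this is how you can extract the review details in the callback:

# Each review sits in its own table row of the ReadReviews response.
for review in response.css("tr"):
    # Rows without a td.complaint-detail cell yield None.
    review_detail = review.css("td.complaint-detail::text").extract_first()
    print(review_detail)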
