How do you access the 101st page of an Amazon category list?

Question

I would like to access all of the items in a given category on Amazon, but it seems that the category pages are generated via search. Bumping the page parameter in the search URL only takes you as far as the 100th page. Is there any way to get past that? Here's a sample URL for books

score 1 · Accepted Answer · edited May 23 '17 at 12:07


The content is loaded dynamically via an AJAX (XHR) call.

Long story short:

Open the browser dev tools.
Open the network tab.
Click on a page link on Amazon.
See that the XHR request goes to http://www.amazon.com/mn/search/ajax/ref=sr_pg_3... - this is the URL you should call in your Scrapy spider (it returns JSON).

So, basically, you just need to make this XHR request 100 times, once per page (or find out whether you can get all the results in a single request).
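The "one request per page" idea can be sketched with the standard library alone. This is a minimal sketch, not Amazon's documented interface: `AJAX_BASE` is only the visible prefix of the truncated URL above and must be replaced with the full URL copied from the network tab, `page_urls` is a hypothetical helper, and the `sort` value is a placeholder. Only the `page` and `sort` parameter names come from this thread.

```python
from urllib.parse import urlencode

# Placeholder: replace with the full XHR URL (minus query string) copied
# from the network tab. This is only the visible prefix from the answer.
AJAX_BASE = "http://www.amazon.com/mn/search/ajax/"

# Amazon limits search results to 100 pages, so never ask for more.
MAX_PAGES = 100


def page_urls(base_url, extra_params=None, first=1, last=MAX_PAGES):
    """Yield one URL per result page, clamped to the 100-page limit."""
    last = min(last, MAX_PAGES)
    for page in range(first, last + 1):
        # Copy the caller's parameters so each URL gets its own dict.
        params = dict(extra_params or {})
        params["page"] = page
        yield f"{base_url}?{urlencode(params)}"


# "example-sort" is a made-up value; use whatever the real request sends.
urls = list(page_urls(AJAX_BASE, {"sort": "example-sort"}))
```

In a Scrapy spider, these URLs would be the ones to issue requests for, one per page.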

Useful links:

Notes:

Amazon limits search results to 100 pages.
You can try the Amazon API instead of scraping the website directly. See "Amazon API library for Python?".

Hope that helps.


answered Apr 24 '13 at 13:53

alecxe


Thanks for the tip, that was helpful. I'm taking a look at those two links you shared. As for the XHR request, it looks pretty nasty, as the JSON results actually contain the page's HTML. I tried bumping up the two parameters to page=101 and ref=sr_pg_100, but the results are then empty. Any idea what the rest of the parameters are for? – Andres Apr 24 '13 at 23:55
It's something specific to this AJAX data provider; you probably need just `page`, and maybe `sort`. I've added some notes to the answer, see if it helps. – alecxe Apr 25 '13 at 08:26
Haven't looked at it in a while. Do you have anything? – Andres Oct 28 '14 at 20:49
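Since the comments note that the JSON results contain the page's HTML, one way to handle such a response is to decode the JSON, pull out every string that looks like markup, and parse those fragments separately. The sketch below assumes nothing about the real response layout, which is not shown in this thread: it walks the whole decoded structure, and `LinkCollector`, `html_fragments`, and `links_from_response` are hypothetical names. The sample body is invented for illustration.

```python
import json
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def html_fragments(node):
    """Recursively yield every string in decoded JSON that looks like HTML."""
    if isinstance(node, str):
        # Crude heuristic (an assumption): markup contains angle brackets.
        if "<" in node and ">" in node:
            yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from html_fragments(value)
    elif isinstance(node, list):
        for item in node:
            yield from html_fragments(item)


def links_from_response(body):
    """Decode a JSON body and return the links found in its HTML fragments."""
    links = []
    for fragment in html_fragments(json.loads(body)):
        collector = LinkCollector()  # fresh parser per fragment
        collector.feed(fragment)
        collector.close()
        links.extend(collector.links)
    return links


# Invented sample body, only to show the shape of the call.
sample = json.dumps({"results": {"html": '<div><a href="/dp/X1">Book</a></div>'}})
found = links_from_response(sample)
```

In a Scrapy spider, the same fragments could instead be handed to Scrapy's own selectors; the standard-library parser is used here only to keep the example self-contained.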
