
The HTML snippet is given below. I want a response.xpath(......) expression that gets the pagination links. I tried

    response.xpath('//*[@class="ui2-pagination-pages"]/a/@href').extract()

but it does not return anything. What am I doing wrong here? Thanks.

    <div class="ui2-pagination-pages">
        <a href="javascript:void(0)" class="prev" data-role="prev">Prev</a>
        <a rel="nofollow" href="//www.alibaba.com/showroom/acrylic-wine-box_1.html">1</a>
        <a rel="nofollow" href="//www.alibaba.com/showroom/acrylic-wine-box_2.html">2</a>
        <a rel="nofollow" href="//www.alibaba.com/showroom/acrylic-wine-box_3.html">3</a>
        <span class="current">4</span>
        <a rel="nofollow" href="//www.alibaba.com/showroom/acrylic-wine-box_5.html">5</a>
        <a rel="nofollow" href="//www.alibaba.com/showroom/acrylic-wine-<span class="interim">...</span>
        <a rel="nofollow" href="//www.alibaba.com/showroom/acrylic-wine-box_103.html">103</a>
        <a href="javascript:void(0)" class="next" data-role="next">Next</a>
    </div>

I want to scrape all the paginated links and loop through them. How do I do that?

Uchiha AJ
  • Possible duplicate of [Scraping dynamic content using python-Scrapy](https://stackoverflow.com/questions/30345623/scraping-dynamic-content-using-python-scrapy) – Andersson Aug 19 '18 at 06:33
  • I didn't understand it from the link you provided – Uchiha AJ Aug 19 '18 at 07:18
  • 3
  • What @Andersson is saying is that the next-page link is most likely JavaScript-generated. Try disabling JavaScript in your browser and loading the page: is the URL still there? If it's not, you need to reverse engineer how the page builds the URL and replicate it; see the related question for that. – Granitosaurus Aug 19 '18 at 10:43
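
To make Granitosaurus's suggestion concrete: the pasted HTML already exposes the URL pattern (`acrylic-wine-box_<n>.html`) and the last page number (103), so one alternative is to skip the JavaScript entirely and generate the paginated URLs yourself. A minimal sketch, assuming the pattern holds for every page; the spider name is illustrative, and the hard-coded page count comes from the snippet (a live spider would read it from the last numeric link instead):

    import scrapy


    class ShowroomPagesSpider(scrapy.Spider):
        # Illustrative name; the URL pattern and the page count (103)
        # come straight from the HTML pasted in the question.
        name = 'showroom_pages'

        def start_requests(self):
            # Generate every paginated URL directly instead of extracting
            # JS-rendered links. Hard-coding 103 assumes the snippet's
            # last page link is accurate.
            for page in range(1, 104):
                url = ('https://www.alibaba.com/showroom/'
                       'acrylic-wine-box_{}.html'.format(page))
                yield scrapy.Request(url, callback=self.parse)

        def parse(self, response):
            # Parse each listing page here.
            self.logger.info('parsed %s', response.url)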

1 Answer


This problem is happening because the website you want to scrape uses JavaScript to render its content, and spiders can't execute JavaScript code; they don't have an engine that can interpret it. For that purpose the Scrapinghub team created a plugin called scrapy-splash.

You can check it out on their official GitHub page: GitHub
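
For illustration, a minimal sketch of a scrapy-splash spider, assuming a Splash instance is running locally on port 8050 (e.g. started with `docker run -p 8050:8050 scrapinghub/splash`); the spider name, start URL, and two-second wait are illustrative, while the middleware settings follow the scrapy-splash README:

    import scrapy
    from scrapy_splash import SplashRequest


    class PaginationSpider(scrapy.Spider):
        name = 'pagination'  # illustrative name

        # Settings from the scrapy-splash README: route requests through
        # the Splash HTTP API so the page's JavaScript gets executed.
        custom_settings = {
            'SPLASH_URL': 'http://localhost:8050',
            'DOWNLOADER_MIDDLEWARES': {
                'scrapy_splash.SplashCookiesMiddleware': 723,
                'scrapy_splash.SplashMiddleware': 725,
                'scrapy.downloadermiddlewares.httpcompression.'
                'HttpCompressionMiddleware': 810,
            },
            'SPIDER_MIDDLEWARES': {
                'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
            },
            'DUPEFILTER_CLASS': 'scrapy_splash.SplashAwareDupeFilter',
        }

        def start_requests(self):
            # First page taken from the question's snippet.
            yield SplashRequest(
                'https://www.alibaba.com/showroom/acrylic-wine-box_1.html',
                self.parse,
                args={'wait': 2},  # give the JS time to build the pagination
            )

        def parse(self, response):
            # Skip the javascript:void(0) Prev/Next links by requiring
            # rel="nofollow"; urljoin resolves the protocol-relative hrefs.
            links = response.xpath(
                '//*[@class="ui2-pagination-pages"]'
                '/a[@rel="nofollow"]/@href').extract()
            for href in links:
                yield SplashRequest(response.urljoin(href), self.parse,
                                    args={'wait': 2})

Because Splash returns the rendered HTML, the XPath from the question matches without modification.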

RastaCode