
The HTML snippet is given below. I want a response.xpath(......) expression that gets the pagination links. I tried

    response.xpath('//*[@class="ui2-pagination-pages"]/a/@href').extract()

but it does not return anything. What am I doing wrong here? Thanks.

    <div class="ui2-pagination-pages">
        <a href="javascript:void(0)" class="prev" data-role="prev">Prev</a>
        <a rel="nofollow" href="//www.alibaba.com/showroom/acrylic-wine-box_1.html">1</a>
        <a rel="nofollow" href="//www.alibaba.com/showroom/acrylic-wine-box_2.html">2</a>
        <a rel="nofollow" href="//www.alibaba.com/showroom/acrylic-wine-box_3.html">3</a>
        <span class="current">4</span>
        <a rel="nofollow" href="//www.alibaba.com/showroom/acrylic-wine-box_5.html">5</a>
        <a rel="nofollow" href="//www.alibaba.com/showroom/acrylic-wine-<span class="interim">...</span>
        <a rel="nofollow" href="//www.alibaba.com/showroom/acrylic-wine-box_103.html">103</a>
        <a href="javascript:void(0)" class="next" data-role="next">Next</a>
    </div>

I want to scrape all the paginated links and loop through them. How do I do that?

Uchiha AJ
  • Possible duplicate of [Scraping dynamic content using python-Scrapy](https://stackoverflow.com/questions/30345623/scraping-dynamic-content-using-python-scrapy) – Andersson Aug 19 '18 at 06:33
  • I didn't understand it from the link you provided – Uchiha AJ Aug 19 '18 at 07:18
  • 3
  • What @Andersson is saying is that the next-page link is most likely JavaScript-generated. Try disabling JavaScript in your browser and loading the page: is the URL still there? If it's not, you need to reverse engineer how the page builds the URL and replicate it; see the related question for that. – Granitosaurus Aug 19 '18 at 10:43
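
To make Granitosaurus's suggestion concrete: the pasted HTML already exposes the URL pattern (`acrylic-wine-box_<n>.html`) and the last page number (103), so one alternative is to skip the JavaScript entirely and generate the paginated URLs yourself. A minimal sketch, assuming the pattern holds for every page; the spider name is illustrative, and the hard-coded page count comes from the snippet (a live spider would read it from the last numeric link instead):

    import scrapy


    class ShowroomPagesSpider(scrapy.Spider):
        # Illustrative name; the URL pattern and the page count (103)
        # come straight from the HTML pasted in the question.
        name = 'showroom_pages'

        def start_requests(self):
            # Generate every paginated URL directly instead of extracting
            # JS-rendered links. Hard-coding 103 assumes the snippet's
            # last page link is accurate.
            for page in range(1, 104):
                url = ('https://www.alibaba.com/showroom/'
                       'acrylic-wine-box_{}.html'.format(page))
                yield scrapy.Request(url, callback=self.parse)

        def parse(self, response):
            # Parse each listing page here.
            self.logger.info('parsed %s', response.url)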

1 Answer


This problem is happening because the website you want to scrape uses JavaScript to render its content, and spiders can't execute JavaScript code; they don't have an engine that can interpret it. For that purpose the Scrapinghub team created a plugin called scrapy-splash.

You can check it out on their official GitHub page: GitHub
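
For illustration, a minimal sketch of a scrapy-splash spider, assuming a Splash instance is running locally on port 8050 (e.g. started with `docker run -p 8050:8050 scrapinghub/splash`); the spider name, start URL, and two-second wait are illustrative, while the middleware settings follow the scrapy-splash README:

    import scrapy
    from scrapy_splash import SplashRequest


    class PaginationSpider(scrapy.Spider):
        name = 'pagination'  # illustrative name

        # Settings from the scrapy-splash README: route requests through
        # the Splash HTTP API so the page's JavaScript gets executed.
        custom_settings = {
            'SPLASH_URL': 'http://localhost:8050',
            'DOWNLOADER_MIDDLEWARES': {
                'scrapy_splash.SplashCookiesMiddleware': 723,
                'scrapy_splash.SplashMiddleware': 725,
                'scrapy.downloadermiddlewares.httpcompression.'
                'HttpCompressionMiddleware': 810,
            },
            'SPIDER_MIDDLEWARES': {
                'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
            },
            'DUPEFILTER_CLASS': 'scrapy_splash.SplashAwareDupeFilter',
        }

        def start_requests(self):
            # First page taken from the question's snippet.
            yield SplashRequest(
                'https://www.alibaba.com/showroom/acrylic-wine-box_1.html',
                self.parse,
                args={'wait': 2},  # give the JS time to build the pagination
            )

        def parse(self, response):
            # Skip the javascript:void(0) Prev/Next links by requiring
            # rel="nofollow"; urljoin resolves the protocol-relative hrefs.
            links = response.xpath(
                '//*[@class="ui2-pagination-pages"]'
                '/a[@rel="nofollow"]/@href').extract()
            for href in links:
                yield SplashRequest(response.urljoin(href), self.parse,
                                    args={'wait': 2})

Because Splash returns the rendered HTML, the XPath from the question matches without modification.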

RastaCode