Scrapy: next button uses WebForm_DoPostBackWithOptions()

Question

I am trying to scrape some information from https://seminovos.localiza.com/Paginas/resultado-busca.aspx?&yr=2014_2019&pc=25000_500000

On this page, the next-page button has an href with the following value: 'javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("ctl00$ctl42$g_f221d036_75d3_4ee2_893d_0d7b40180245$ProximaPaginaSuperior", "", true, "", "", false, true))'

I could do this easily with Selenium, but how can I go to the next page using Scrapy?

I tried something like:

next_page = response.xpath('.//a[@class="item option next"]/@href').extract_first()

if next_page:
    self.log(next_page)
    scrapy.http.FormRequest(
        response.url,
        formdata={
            "eventTarget": "ctl00$ctl42$g_f221d036_75d3_4ee2_893d_0d7b40180245$ProximaPaginaSuperior",
            "eventArgument": "",
            "validation": "true",
            "validationGroup": "",
            "actionUrl": "",
            "trackFocus": "false",
            "clientSubmit": "true",
        },
        callback=self.parse,
    )

What is the proper way to navigate to the next page in this situation?

score 1 · Answer 1 · answered Apr 13 '19 at 11:08

Find out the details of the request that your web browser performs when you click that button, and try to reproduce it based on the available data.

The answers to "Can scrapy be used to scrape dynamic content from websites that are using AJAX?" should give you an idea of how to approach this. There is also a pull request for the Scrapy documentation that covers this type of scenario, which you might find useful.
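
As a rough illustration of what "the available data" usually means on an ASP.NET WebForms page: the postback request carries the form's hidden inputs (such as __EVENTTARGET, __EVENTARGUMENT and __VIEWSTATE) along with it. The sketch below lists those hidden inputs using only the standard library. The HTML snippet and the HiddenInputCollector class are made up for the example; they are not taken from the real page.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            self.fields[attributes["name"]] = attributes.get("value") or ""


# Hypothetical, heavily shortened page; the real __VIEWSTATE is a long opaque string.
sample_html = """
<form method="post" action="./resultado-busca.aspx">
  <input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
  <input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="hypothetical-viewstate" />
  <input type="text" name="q" value="" />
</form>
"""

collector = HiddenInputCollector()
collector.feed(sample_html)
print(collector.fields)
```

Comparing a list like this with the form data shown in the browser's network tab after clicking the button should reveal which fields the click changes and which are simply copied from the page.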

score 0 · Accepted Answer · answered Apr 15 '19 at 13:58

The site uses ASP.NET. After more searching and analyzing the page, I found what I was looking for.

The final code has this form:

if next_page:
    # Override only __EVENTTARGET; from_response() copies the form's other fields from the page.
    yield FormRequest.from_response(response, formdata={'__EVENTTARGET': 'ctl00$ctl42$g_f221d036_75d3_4ee2_893d_0d7b40180245$ProximaPagina'}, callback=self.parse, dont_click=True)

It works now. Thanks.
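
If hard-coding the control name feels fragile, one possible variation is to read the event target out of the button's href. The sketch below assumes the href always has the WebForm_PostBackOptions("target", "argument", ...) shape quoted in the question; postback_formdata is a hypothetical helper name, not part of Scrapy. It needs only the standard library.

```python
import re

# Captures the first two string arguments of WebForm_PostBackOptions(...):
# the event target and the event argument.
_POSTBACK_RE = re.compile(
    r'WebForm_PostBackOptions\(\s*"(?P<target>[^"]*)"\s*,\s*"(?P<argument>[^"]*)"'
)


def postback_formdata(href):
    """Return the postback fields encoded in a javascript: href, or None."""
    match = _POSTBACK_RE.search(href or "")
    if match is None:
        return None
    return {
        "__EVENTTARGET": match.group("target"),
        "__EVENTARGUMENT": match.group("argument"),
    }


# The href quoted in the question.
href = (
    'javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions('
    '"ctl00$ctl42$g_f221d036_75d3_4ee2_893d_0d7b40180245$ProximaPaginaSuperior", '
    '"", true, "", "", false, true))'
)
print(postback_formdata(href))
```

The returned dictionary could then be passed as formdata to FormRequest.from_response(response, formdata=..., callback=self.parse, dont_click=True), in place of the literal dictionary used above.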
