Web scraping: iterating over the pages of a site when the URL cannot be edited, with Python and Requests

Question

I'm extracting data from this car reseller site, but I cannot find a way to iterate over the pages. I usually iterate by changing a page index in the URL, but this site's URL contains no page index.

Here is an example of what I usually do when I can iterate over the pages by editing the URL:

import requests as req

# The page index is substituted into the "page:{}" part of the URL
url = "https://www.seminovosunidas.com.br/veiculos/page:{}?utm_source=afilio&utm_medium=display&utm_campaign=maio&utm_content=ron_ambos&utm_term=120x600_promocaomaio_performance_-_-"
indice_pagina = 1  # page index
dados = {}  # will hold the scraped data (unused in this snippet)
r = req.get(url.format(indice_pagina))
print(r.text)

score 0 · Accepted Answer · answered May 19 '18 at 23:52

I think you are new to scraping. Each result div contains a link, which you can find at this path and iterate over for more pages:

#resultadoPesquisa > div:nth-child(1) > a

Then get the href attribute, which holds a link like

/Paginas/detalhes-do-carro.aspx?o=fmKOUbLvWxA%3d

which you can append to the site URL to request the product page.

So it would look like this:

complete_url = 'https://seminovos.localiza.com' + '/Paginas/detalhes-do-carro.aspx?o=fmKOUbLvWxA%3d'
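
A minimal sketch of the step above, using only the Python standard library: it collects the href of every link inside the element with id resultadoPesquisa and joins each one with the site root. The names ResultLinkParser and extract_car_urls are hypothetical helpers, and the assumption that all result links sit inside that container comes from the selector above, not from a check against the live page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "https://seminovos.localiza.com"

# Tags that never have a closing tag, so they must not affect nesting depth
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ResultLinkParser(HTMLParser):
    """Collect href values of <a> tags nested inside a container element."""

    def __init__(self, container_id="resultadoPesquisa"):
        super().__init__()
        self.container_id = container_id
        self.depth = 0   # 0 means we are outside the container
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self.depth == 0:
            # Start tracking once the container element opens
            if attrs.get("id") == self.container_id:
                self.depth = 1
            return
        self.depth += 1
        if tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth -= 1


def extract_car_urls(html, base_url=BASE_URL):
    """Return absolute URLs for every link found in the results container."""
    parser = ResultLinkParser()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.hrefs]
```

Passing r.text from a Requests response to extract_car_urls should give one absolute URL per listed car, provided the results are present in the initial HTML rather than loaded later by JavaScript.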

Comment if you have any questions.

Usama Jamil

This way I reach the URL of a specific car, but what I need is to iterate over the search pages – Rafael Ribeiro May 20 '18 at 00:01
Yes, that can be done by calling the appropriate JavaScript function; in this case it is javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("ctl00$ctl42$g_f221d036_75d3_4ee2_893d_0d7b40180245$PaginacaoCarrosSuperior$ctl01$NumeroPagina", "", true, "", "", false, true)) – Usama Jamil May 20 '18 at 00:06
Well, I don't know JavaScript. My area is data science, not web development. Could you explain this better? – Rafael Ribeiro May 20 '18 at 04:20
You can follow this link: https://stackoverflow.com/questions/8284765/how-do-i-call-a-javascript-function-from-python. If you don't understand it, you can move on to Python Selenium; in that case you won't need to load the JavaScript yourself, it will be loaded automatically – Usama Jamil May 20 '18 at 04:23
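
For readers who would rather stay with Requests, the postback named in the comments can probably be emulated without running any JavaScript. In ASP.NET Web Forms, this kind of pager link normally submits the page's form with the hidden field __EVENTTARGET set to the control name, alongside the other hidden fields (such as __VIEWSTATE) already present in the HTML. The sketch below rests on several assumptions: that the site accepts a plain POST of those fields, that the control name from the comment is still valid, and that search_url is a placeholder for the search page, which is not given in this thread. HiddenFieldParser, build_postback_payload and fetch_next_page are hypothetical helper names.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of all <input type="hidden"> fields."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_postback_payload(html, event_target, event_argument=""):
    """Build the form data a Web Forms postback would submit (assumed layout)."""
    parser = HiddenFieldParser()
    parser.feed(html)
    payload = dict(parser.fields)
    payload["__EVENTTARGET"] = event_target
    payload["__EVENTARGUMENT"] = event_argument
    return payload


def fetch_next_page(session, url, current_html, event_target):
    """POST the postback payload with a requests.Session and return the HTML."""
    payload = build_postback_payload(current_html, event_target)
    response = session.post(url, data=payload)
    response.raise_for_status()
    return response.text


# Hypothetical usage (search_url is a placeholder, not taken from the thread):
#   import requests as req
#   session = req.Session()
#   html = session.get(search_url).text
#   target = "ctl00$ctl42$g_f221d036_75d3_4ee2_893d_0d7b40180245$PaginacaoCarrosSuperior$ctl01$NumeroPagina"
#   html = fetch_next_page(session, search_url, html, target)
```

The hidden fields change with every response, so each request should build its payload from the most recently received page. If the server rejects the POST or the listing is rendered client-side, Selenium as suggested above remains the fallback.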
