I'm extracting data from this car reseller site, but I cannot find a way to iterate over the result pages. I usually iterate by changing a page index in the URL, but this site's URL contains no page index.

Here is an example of what I usually do when I can iterate over the pages by editing the URL:

import requests as req

# The page index is part of the URL path ("page:{}"), so format() can fill it in.
url = "https://www.seminovosunidas.com.br/veiculos/page:{}?utm_source=afilio&utm_medium=display&utm_campaign=maio&utm_content=ron_ambos&utm_term=120x600_promocaomaio_performance_-_-"
indice_pagina = 1  # page index
dados = {}  # will hold the scraped data
r = req.get(url.format(indice_pagina))
print(r.text)

1 Answer

I think you are new to scraping. Each div in the search results contains a link. You can find it at this path and iterate over those links to reach more pages:

#resultadoPesquisa > div:nth-child(1) > a

Then get the href attribute, which holds a link like

/Paginas/detalhes-do-carro.aspx?o=fmKOUbLvWxA%3d

which you can append to the site's base URL to request that car's page.

The complete URL would look like this:

complete_url = 'https://seminovos.localiza.com' + '/Paginas/detalhes-do-carro.aspx?o=fmKOUbLvWxA%3d'
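As a minimal sketch of the steps above, the snippet below collects the href of every link inside the #resultadoPesquisa container and joins each one with the base URL. It uses only the standard library; the names ResultLinkParser and extract_car_urls are made up for illustration, and it gathers all links nested in the container rather than matching the exact selector. The real page markup may differ, so treat this as a starting point.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "https://seminovos.localiza.com"

# Tags that never get a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ResultLinkParser(HTMLParser):
    """Collect href values of <a> tags nested inside #resultadoPesquisa."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # 0 means we are outside the results container
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth == 0:
            # Start tracking once the results container opens.
            if attrs.get("id") == "resultadoPesquisa":
                self.depth = 1
            return
        if tag in VOID_TAGS:
            return
        self.depth += 1
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1


def extract_car_urls(html, base_url=BASE_URL):
    """Return absolute URLs for every link found in the results container."""
    parser = ResultLinkParser()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.links]
```

In practice you would pass it the text of a response, for example extract_car_urls(r.text), and then request each returned URL in a loop.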

Comment if you have any questions.

Usama Jamil
  • This way I enter the specific url of a car, but what I need is to iterate over the search pages – Rafael Ribeiro May 20 '18 at 00:01
  • Yes, that can be done by calling the appropriate JavaScript function. In this case it is javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("ctl00$ctl42$g_f221d036_75d3_4ee2_893d_0d7b40180245$PaginacaoCarrosSuperior$ctl01$NumeroPagina", "", true, "", "", false, true)) – Usama Jamil May 20 '18 at 00:06
  • Well, I don't know JavaScript. My area is data science, not web development. Could you explain this better? – Rafael Ribeiro May 20 '18 at 04:20
  • You can follow this link: https://stackoverflow.com/questions/8284765/how-do-i-call-a-javascript-function-from-python. If you don't understand it, you can move on to Python Selenium. In that case you won't need to call the JavaScript yourself, because the browser loads and runs it automatically. – Usama Jamil May 20 '18 at 04:23
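For readers who would rather stay with plain HTTP requests than run JavaScript, here is a hedged sketch of another common approach for ASP.NET WebForms pages: the postback function usually just submits the page's form with the control name in the __EVENTTARGET field. The helper below (build_postback_payload is a made-up name) copies the hidden form fields from the current page and sets the target. Whether this particular site accepts such a request, and whether the control name stays the same for every page number, are assumptions that have not been checked.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_postback_payload(html, event_target, event_argument=""):
    """Build the form data a WebForms postback would submit.

    Hidden fields such as __VIEWSTATE are copied from the page, then the
    postback target and argument are filled in.
    """
    parser = HiddenFieldParser()
    parser.feed(html)
    payload = dict(parser.fields)
    payload["__EVENTTARGET"] = event_target
    payload["__EVENTARGUMENT"] = event_argument
    return payload
```

The resulting dictionary could then be sent with something like req.post(search_url, data=payload), where search_url is the address of the search page (a placeholder here), ideally through a requests.Session so cookies carry over between pages. If the site rejects the request, Selenium remains the fallback.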