
I am new to Scrapy and Python and want to scrape data from a search result. When the page loads, the default content appears, but what I need to scrape is the filtered result, while also following the pagination. How can I do that?

Here's the URL: https://teslamotorsclub.com/tmc/post-ratings/6/posts. I need to scrape the items from the Time Filter: "Today" result.

I tried different approaches, but none of them works.

This is what I have so far, though it is mostly just the layout of the spider.

import scrapy

class TmcnfSpider(scrapy.Spider):
    name = 'tmcnf'
    allowed_domains = ['teslamotorsclub.com']
    start_urls = ['https://teslamotorsclub.com/tmc/post-ratings/6/posts']

    def start_requests(self):
        # Show form from a filtered search result (not implemented yet)
        pass

    def parse(self, response):
        # some code scraping item
        # Yield url for pagination
        pass

1 Answer


To get the posts for the "Today" filter, you need to send a POST request to https://teslamotorsclub.com/tmc/post-ratings/6/posts along with a payload. The following should fetch the results you are interested in.

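import scrapy

class TmcnfSpider(scrapy.Spider):
    name = "teslamotorsclub"
    start_urls = ["https://teslamotorsclub.com/tmc/post-ratings/6/posts"]

    def parse(self, response):
        # time_chooser '4' is the value sent here for the "Today" filter; the token is left empty
        payload = {'time_chooser': '4', '_xfToken': ''}
        # Re-request the same URL as a POST carrying the filter payload
        yield scrapy.FormRequest(response.url, formdata=payload, callback=self.parse_results)

    def parse_results(self, response):
        # Each post title is the text of the link inside h3.title
        for title in response.css("h3.title > a::text").getall():
            yield {"title": title.strip()}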
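
If the empty `_xfToken` ever gets rejected, one option is to reuse whatever token the page itself carries. The sketch below is only an illustration using the standard library. It assumes the page exposes `_xfToken` as a hidden `<input>`, which I have not confirmed for this site. `HiddenInputParser`, `build_filter_payload`, and the sample HTML are hypothetical names and data, not part of Scrapy or of the site.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> in a page."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.hidden[attr["name"]] = attr.get("value") or ""


def build_filter_payload(html, time_choice="4"):
    """Return the POST payload, reusing the page's _xfToken when present."""
    parser = HiddenInputParser()
    parser.feed(html)
    return {
        # "4" is the value the spider above sends for the "Today" filter.
        "time_chooser": time_choice,
        # Assumption: the token lives in a hidden input; otherwise stay empty.
        "_xfToken": parser.hidden.get("_xfToken", ""),
    }


# Hypothetical page fragment, only for illustration.
sample_html = '<form><input type="hidden" name="_xfToken" value="abc123"></form>'
payload = build_filter_payload(sample_html)
print(payload)
```

Inside the spider's `parse` method, this could replace the hard-coded dictionary, for example `payload = build_filter_payload(response.text)`, before passing it to `scrapy.FormRequest` as above.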
SIM