I want to scrape race results from this website: https://www.racingpost.com/results
I already have a crawler that scrapes and follows the links on the results page, but I cannot go further back than the six or seven days that are displayed on the site. The older results are available via the "resultsfinder", which is sadly JavaScript-driven, as are other sources of the older races, like the form of the horses.
I already tried to learn how to scrape JavaScript-rendered pages to get the links, and while it is very interesting, I am wondering if there is an easier way, as the result page addresses are designed in a very convenient way:
It's simply https://www.racingpost.com/results/ + a date in YYYY-MM-DD form, such as 1990-02-08 or 2021-02-11.
So I thought it might be easier to design the spider so that it gets its links from a loop or a predefined list of links instead of following them on the page.
How could I design a loop that runs from 1990-01-01 up to today in Scrapy, or is it better to create a predefined list of links for this?
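
To make the idea concrete, here is a minimal sketch of the kind of loop I have in mind, using only the standard library. The names `BASE_URL` and `daily_result_urls` are my own placeholders, and I am assuming that every date maps to a results page under the pattern described above, which I have not verified for all dates.

```python
from datetime import date, timedelta

# Assumption: every results page lives at BASE_URL + YYYY-MM-DD.
BASE_URL = "https://www.racingpost.com/results/"


def daily_result_urls(start, end):
    """Yield one results URL per calendar day from start to end, inclusive."""
    current = start
    while current <= end:
        # date.isoformat() returns YYYY-MM-DD, which matches the URL pattern.
        yield BASE_URL + current.isoformat()
        current += timedelta(days=1)


if __name__ == "__main__":
    # Show only a few days here; the full range would be
    # daily_result_urls(date(1990, 1, 1), date.today()).
    for url in daily_result_urls(date(1990, 1, 1), date(1990, 1, 3)):
        print(url)
```

In Scrapy, I assume this could be plugged into the spider's `start_requests` method by yielding `scrapy.Request(url, callback=self.parse)` for each generated URL, so no predefined list would need to be stored.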