
Is it possible to create a proxy failover within Scrapy, so that when one proxy fails the other takes over scraping the rest of the requests? I would have thought it would be done using the retry middleware, but I don't really have a clue how to create one.

I have found a few examples online, but none using two API SDK proxies. From the research I have done, I think it can be done using the retry middleware, but I don't quite understand how to do it.

FYI - The proxies I'm using are Zyte Smart Proxy Manager and ScrapeOps.io

  • Can you clarify the meaning of _using two API SDK proxies_? Do you aim to use multiple proxy providers in a single job and disable "bad" ones during runtime? – Georgiy Mar 06 '23 at 11:47

1 Answer


That is exactly what middlewares are for.

YOUR REQUEST -> RETRY_MIDDLEWARE -> CUSTOM_HEADER_MIDDLEWARE -> SERVER

So, when you make a request, it should pass through all middlewares before being sent to the server.
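For reference, that chain is just the `DOWNLOADER_MIDDLEWARES` setting; here is a sketch matching the diagram above (the `CustomHeaderMiddleware` path is a hypothetical example, not a built-in):

```python
# settings.py -- lower numbers run first on the way out to the server.
DOWNLOADER_MIDDLEWARES = {
    "scrapy.downloadermiddlewares.retry.RetryMiddleware": 550,  # built-in
    "myproject.middlewares.CustomHeaderMiddleware": 560,        # hypothetical
}
```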

A middleware can, however, do whatever it wants with the request; for example, the retry middleware can return a result itself instead of passing the request on to the next middleware.

So you'll need to write your own middleware that switches the proxy when the current one fails; see the sketch below.
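Here is a minimal sketch of such a failover middleware, assuming both providers are exposed as plain HTTP proxy endpoints (both Zyte Smart Proxy Manager and ScrapeOps offer this mode). The `PROXY_POOL` setting and the endpoint URLs in the comments are hypothetical placeholders, not official names:

```python
from scrapy.downloadermiddlewares.retry import RetryMiddleware
from scrapy.utils.response import response_status_message
from twisted.internet.error import ConnectError, TimeoutError


class ProxyFailoverMiddleware(RetryMiddleware):
    """Retries failed requests, rotating to the next proxy in the pool."""

    # Network-level failures that should trigger a proxy switch.
    FAILOVER_EXCEPTIONS = (ConnectError, TimeoutError, IOError)

    def __init__(self, settings):
        super().__init__(settings)
        # Hypothetical setting holding your proxy endpoints, e.g.
        # PROXY_POOL = [
        #     "http://<zyte-apikey>:@<zyte-proxy-host>:<port>",
        #     "http://<scrapeops-apikey>@<scrapeops-proxy-host>:<port>",
        # ]
        self.proxies = settings.getlist("PROXY_POOL")

    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler.settings)

    def _next_proxy(self, current):
        # Pick the proxy after the one that just failed, wrapping around.
        if current in self.proxies:
            return self.proxies[(self.proxies.index(current) + 1) % len(self.proxies)]
        return self.proxies[0]

    def process_request(self, request, spider):
        # Give fresh requests the first proxy; retries keep the proxy
        # already reassigned by the handlers below.
        request.meta.setdefault("proxy", self.proxies[0])

    def process_response(self, request, response, spider):
        # On a retryable HTTP status (RETRY_HTTP_CODES), switch proxy
        # and let the stock retry logic reschedule the request.
        if response.status in self.retry_http_codes and not request.meta.get("dont_retry"):
            request.meta["proxy"] = self._next_proxy(request.meta.get("proxy"))
            reason = response_status_message(response.status)
            return self._retry(request, reason, spider) or response
        return response

    def process_exception(self, request, exception, spider):
        # On a connection-level failure, do the same switch-and-retry.
        if isinstance(exception, self.FAILOVER_EXCEPTIONS) and not request.meta.get("dont_retry"):
            request.meta["proxy"] = self._next_proxy(request.meta.get("proxy"))
            return self._retry(request, str(exception), spider)
```

Enable it in `DOWNLOADER_MIDDLEWARES` (as in the snippet above) in place of the stock retry middleware, i.e. set `"scrapy.downloadermiddlewares.retry.RetryMiddleware": None`, so the two don't both retry the same request.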

Karen Petrosyan