
I want to scrape one site, but when I iterate over the result pages, after a few requests (about 30 at most) requests.get throws this error:

requests.exceptions.TooManyRedirects: Exceeded 30 redirects

The search URL gets redirected to the main page URL, and every subsequent URL behaves the same until I connect to a different VPN. Even when I spoof the user agent and rotate proxies from a list of free proxies, it still gets redirected after a few requests. I have never had a problem like this while web scraping before. What is the best way to bypass this "redirect block"? allow_redirects=False doesn't work here either.

import requests
import random
import time

agents = [...]  # List of user agents

for i in range(1, 100):
    url = "https://panoramafirm.pl/odpady/firmy,{}.html".format(i)
    r = requests.get(url, headers={"User-Agent": random.choice(agents)})
    print(r.status_code)
    time.sleep(random.randint(10, 15))  # pause between requests
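For reference, the proxy rotation mentioned above (which the snippet doesn't show) can be sketched like this. The helper names and the `proxies` entries are hypothetical; only the URL pattern comes from the code above.

```python
import random

import requests


def build_page_url(page):
    # Result-page URLs follow the pattern used in the loop above.
    return "https://panoramafirm.pl/odpady/firmy,{}.html".format(page)


def fetch(url, agents, proxies):
    """Fetch a page through a randomly chosen proxy and user agent.

    `proxies` entries are plain "host:port" strings (placeholder values,
    not real proxies).
    """
    proxy = random.choice(proxies)
    return requests.get(
        url,
        headers={"User-Agent": random.choice(agents)},
        proxies={"http": "http://" + proxy, "https": "http://" + proxy},
        timeout=10,
    )
```

Note that free proxies are frequently already blacklisted, which may be why rotating them didn't help here.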

1 Answer


Since you are using requests, you could make use of the allow_redirects=False option.
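A minimal sketch of what that looks like: with `allow_redirects=False`, requests returns the raw 30x response instead of following it, so you can at least inspect where the server is sending you. The helper names here are illustrative, not part of the question's code.

```python
import requests


def fetch_no_follow(url, user_agent):
    """Issue the request but do not follow any redirect."""
    return requests.get(
        url,
        headers={"User-Agent": user_agent},
        allow_redirects=False,
        timeout=10,
    )


def redirect_target(response):
    """Return the Location header if the response is a redirect, else None."""
    if response.is_redirect:
        return response.headers.get("Location")
    return None
```

This won't stop the server from answering with a 302, but it keeps requests from chasing the redirect loop until it hits the 30-redirect limit, and lets you detect the block and back off.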

game0ver
  • I tried it, but it didn't help. After about 20 requests it starts redirecting back to the home page; the status code is 302 for the redirected ones. – druidmaciek Mar 17 '18 at 23:26
  • For some reason I get a `This page isn’t working: panoramafirm.pl redirected you too many times` error even just visiting the page, so I can't check right now. I'll try later, and if I find a solution I'll post it! – game0ver Mar 18 '18 at 12:56
  • That would be great. If you have a VPN, you could try accessing it through one located in Poland (that worked for me when I first saw the "this page isn't working" error). – druidmaciek Mar 18 '18 at 22:43