
I am trying to capture the HTTP status code 3XX/302 for a redirecting URL, but I always get a 200 status code instead.

Here is the code:

import requests
r = requests.get('http://goo.gl/NZek5')
print(r.status_code)

I expected this to return either 301 or 302 because it redirects to another page. I tried a few redirecting URLs (e.g. http://fb.com), but they also return 200. What should be done to capture the redirection code properly?

Martijn Pieters
Bishwash

4 Answers


requests handles redirects for you; see the "Redirection and History" section of its documentation.

Set allow_redirects=False if you don't want requests to handle redirections, or you can inspect the redirection responses contained in the r.history list.

Demo:

>>> import requests
>>> url = 'https://httpbin.org/redirect-to'
>>> params = {"status_code": 301, "url": "https://stackoverflow.com/q/22150023"}
>>> r = requests.get(url, params=params)
>>> r.history
[<Response [301]>, <Response [302]>]
>>> r.history[0].status_code
301
>>> r.history[0].headers['Location']
'https://stackoverflow.com/q/22150023'
>>> r.url
'https://stackoverflow.com/questions/22150023/http-redirection-code-3xx-in-python-requests'
>>> r = requests.get(url, params=params, allow_redirects=False)
>>> r.status_code
301
>>> r.url
'https://httpbin.org/redirect-to?status_code=301&url=https%3A%2F%2Fstackoverflow.com%2Fq%2F22150023'

So if allow_redirects is True, the redirects are followed and the response returned is the final page at the end of the redirect chain. If allow_redirects is False, the first response is returned, even if it is a redirect.
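The behaviour described above can be sketched as a small loop. This is a simplified illustration, not the actual requests implementation: `follow_redirects`, `REDIRECT_CODES`, the `fetch` callable, the `max_redirects` default, and the `example.test` URLs are all names and values assumed for the example. The fake site stands in for real HTTP calls so the sketch runs without a network connection.

```python
from types import SimpleNamespace
from urllib.parse import urljoin

# Status codes treated as redirects in this sketch (an assumption for illustration).
REDIRECT_CODES = {301, 302, 303, 307, 308}


def follow_redirects(fetch, url, max_redirects=30):
    """Fetch url, following redirects; return (final_response, history)."""
    history = []
    response = fetch(url)
    while response.status_code in REDIRECT_CODES and "Location" in response.headers:
        if len(history) >= max_redirects:
            raise RuntimeError("too many redirects")
        # Each intermediate redirect response is kept, like r.history.
        history.append(response)
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, response.headers["Location"])
        response = fetch(url)
    return response, history


# Hypothetical stand-in for a site: /a -> /b -> /c.
fake_site = {
    "http://example.test/a": SimpleNamespace(status_code=301, headers={"Location": "/b"}),
    "http://example.test/b": SimpleNamespace(
        status_code=302, headers={"Location": "http://example.test/c"}
    ),
    "http://example.test/c": SimpleNamespace(status_code=200, headers={}),
}

final, history = follow_redirects(fake_site.__getitem__, "http://example.test/a")
print(final.status_code, [r.status_code for r in history])  # 200 [301, 302]
```

With the real library, a comparable `fetch` could be `lambda u: requests.get(u, allow_redirects=False)`, although in practice the default `requests.get(url)` already does this for you.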

Martijn Pieters
  • When we run the request with allow_redirects=False, redirects are not allowed and the page won't go to the redirect target. So why does it show 301 instead of 200? – Bishwash Mar 03 '14 at 15:44
  • @user2789099: Sorry, I am not following you. `301` is the redirect status code. `requests` always first gets the first URL; if that is a 301 redirect and `allow_redirects` is `True`, the response is added to the history list and `requests` makes another GET request to retrieve the new location, and so on. If `allow_redirects` is `False`, the first `301` is returned directly. – Martijn Pieters Mar 03 '14 at 15:47
  • @user2789099: If `allow_redirects` is `True`, what is returned is the final response. So the `200` is because `requests` followed the redirect and fetched the next page too. – Martijn Pieters Mar 03 '14 at 15:48
  • I stumbled upon this while having the same issue using C#'s HttpWebRequest. All I had to do was: `request.AllowAutoRedirect = false;` and now I get the 301 I would expect. – Ben Mar 21 '17 at 13:53
  • @Ben that's... interesting, but C# and Python are quite separate beasts. – Martijn Pieters Mar 21 '17 at 13:55
  • yeah I know, but it's not too surprising that they implemented similar logic in this case: "make a web request, if get 3XX status code, do redirect unless I am told not to". Just thought I would leave it here in case anyone had the same issue as me :-) – Ben Mar 21 '17 at 17:20
  • @Ben: sounds like you were looking for [How to get a redirection response](//stackoverflow.com/q/3836784) then. – Martijn Pieters Mar 21 '17 at 18:03
  • @Martijn You are quite right! I stopped looking as your answer got me where I needed to go, but hopefully the link will help anyone else who ends up here! – Ben Mar 22 '17 at 12:41

requests.get allows for an optional keyword argument allow_redirects which defaults to True. Setting allow_redirects to False will disable automatically following redirects, as follows:

In [1]: import requests
In [2]: r = requests.get('http://goo.gl/NZek5', allow_redirects=False)
In [3]: print(r.status_code)
301
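To try both modes without relying on an external URL, here is a minimal sketch that starts a throwaway local server and requests it twice. The `RedirectHandler` class and the `/old` and `/new` paths are hypothetical, made up for this demo; only `allow_redirects`, `history`, `status_code`, and `headers` come from requests itself.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical demo server: /old answers 302 -> /new, anything else answers 200."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"final page"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), RedirectHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables for this local demo

followed = session.get(base + "/old")  # default: redirects are followed
not_followed = session.get(base + "/old", allow_redirects=False)

print(followed.status_code, [h.status_code for h in followed.history])  # 200 [302]
print(not_followed.status_code, not_followed.headers["Location"])  # 302 /new

server.shutdown()
server.server_close()
```

The first request ends on the 200 page and keeps the 302 in `history`; the second stops at the 302 itself.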
George Bahij

This solution will identify the redirect and display the history of redirects, and it will handle common errors. This will ask you for your URL in the console.

import requests

def init():
    console = input("Type the URL: ")
    get_status_code_from_request_url(console)


def get_status_code_from_request_url(url, do_restart=True):
    try:
        r = requests.get(url)
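        if not r.history:
            print("Status Code: " + str(r.status_code))
        else:
            # Report each redirect's real status code instead of assuming 301.
            print("Final Status Code: " + str(r.status_code) + ". Below are the redirects")
            for i, resp in enumerate(r.history):
                print("  " + str(i) + " - " + str(resp.status_code) + " - URL " + resp.url)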
        if do_restart:
            init()
    except requests.exceptions.MissingSchema:
        print("You forgot the protocol, e.g. http:// or https://")
    except requests.exceptions.ConnectionError:
        print("Sorry, but I couldn't connect. There was a connection problem.")
    except requests.exceptions.Timeout:
        print("Sorry, but I couldn't connect. I timed out.")
    except requests.exceptions.TooManyRedirects:
        print("There were too many redirects.  I can't count that high.")


init()
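If only the list of hops is needed, the history handling above can be reduced to a small helper. This is a sketch under assumptions: `redirect_chain` is a hypothetical name, and the `SimpleNamespace` objects below are stand-ins for real responses so the example runs offline. It relies only on the `history`, `status_code`, and `url` attributes that a requests response provides.

```python
from types import SimpleNamespace


def redirect_chain(response):
    """Return (status_code, url) for every hop, ending with the final response."""
    hops = list(response.history) + [response]
    return [(hop.status_code, hop.url) for hop in hops]


# Stand-in objects shaped like a response that went through one redirect.
first = SimpleNamespace(status_code=301, url="http://example.com/a", history=[])
final = SimpleNamespace(status_code=200, url="http://example.com/b", history=[first])

for status, url in redirect_chain(final):
    print(status, url)
```

With the real library, the same helper could be called as `redirect_chain(requests.get(url))`.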
Wes

Does anyone have a PHP version of this code?

    r = requests.get(url)
    if len(r.history) < 1:
        print("Status Code: " + str(r.status_code))
    else:
        print("Status Code: 301. Below are the redirects")
        h = r.history
        i = 0
        for resp in h:
            print("  " + str(i) + " - URL " + resp.url + " \n")
            i += 1
    if do_restart:
  • If you have a new question, please ask it by clicking the [Ask Question](https://stackoverflow.com/questions/ask) button. Include a link to this question if it helps provide context. - [From Review](/review/late-answers/31345836) – geanakuch Mar 24 '22 at 10:52