import requests
x = requests.get("https://www.pap.fr/annonce/annonce-vente-france-g25-23")
print(x.url)

In the browser, this URL redirects to "https://www.pap.fr/annonce/vente-immobiliere-france-g25", but the response URL x.url always shows the original "https://www.pap.fr/annonce/annonce-vente-france-g25-23".

Checking response.history doesn't work either. I tried running this code against the URL:

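import requests

response = requests.get("https://www.pap.fr/annonce/annonce-vente-france-g25-23")

# response.history holds the intermediate redirect responses, oldest first
if response.history:
    print("Request was redirected")
    for resp in response.history:
        print(resp.status_code, resp.url)
    print("Final destination:")
    print(response.status_code, response.url)
else:
    print("Request was not redirected")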

But it only ever shows the URL I sent...

1 Answer

This is because the server treats the request from your browser (the one where you first noticed the redirect to the new URL) completely differently from the request sent by Python. In the browser, the URL returns a 301 redirect as expected:

[Screenshot: 301 redirect in Chrome browser]

However, when you send a GET to the same URL from Python with no additional configuration, the server returns a 403 Forbidden status code:

import requests
x = requests.get("https://www.pap.fr/annonce/annonce-vente-france-g25-23")
print(x.url)
print(x.status_code) # 403

Live example on Repl.it.

It's difficult to say exactly why the server returns this Forbidden code, but it may be related to the large number of headers your browser sends that your Python request does not, among other differences:

[Screenshot: headers sent in the browser's request]

The only way to definitively know why this is occurring would be to review the way the server is implemented in the back-end, which it doesn't seem like you'd have access to. Your best bet is probably to emulate the headers your browser sends in your Python request as closely as possible and go from there.
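As a starting point, here is a minimal sketch of what emulating the browser's headers could look like. The header values and the helper names (BROWSER_HEADERS, build_browser_session, redirect_chain, fetch) are my own assumptions for illustration; copy the real values from your browser's developer tools. There is no guarantee the server will accept the request even with these headers, since it may be checking something else entirely.

```python
import requests

# Hypothetical header values, for illustration only. Copy the real ones from
# your browser's developer tools (Network tab) for the request that redirects.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "fr-FR,fr;q=0.9,en;q=0.8",
}


def build_browser_session(headers=BROWSER_HEADERS):
    """Return a Session that sends the given headers with every request."""
    session = requests.Session()
    session.headers.update(headers)
    return session


def redirect_chain(response):
    """List (status_code, url) for each redirect hop, ending with the final response."""
    hops = [(r.status_code, r.url) for r in response.history]
    hops.append((response.status_code, response.url))
    return hops


def fetch(url):
    """Send a GET with browser-like headers; requests follows redirects on GET by default."""
    return build_browser_session().get(url, timeout=10)
```

Calling redirect_chain(fetch("https://www.pap.fr/annonce/annonce-vente-france-g25-23")) would list every hop if the server answers with a redirect. If the list contains a single entry with status 403, the server is still rejecting the request and the headers need further adjustment.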

esqew