Given three old (raw) URLs:
raw_url = 'https://digi.kansalliskirjasto.fi/aikakausi/binding/498491?term=1864&term=SUOMI&language=fi'
raw_url = 'https://twitter.com/i/user/2274951674'
raw_url = 'https://youtu.be/dQw4w9WgXcQ'
I am using this code snippet on my Linux machine (taken from a different question than this post), which contains both the GET and HEAD methods of the requests library, to obtain the updated URLs:
import requests

#r = requests.get(raw_url)
r = requests.head(raw_url, allow_redirects=True)
r.raise_for_status()
print(f"HTTP family: {r.status_code}\tExists: {r.ok}\thistory:{r.history}")
updated_url = r.url
print(f"Updated URL: {updated_url}") # works only for 3rd raw_url
It seems that it only follows redirects and updates URLs that answer with <Response [3XX]> (my third raw_url), not the others.
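To see which hops were actually followed, a small helper along these lines could list the redirect chain. This is only a sketch: `redirect_chain` and `was_http_redirected` are hypothetical helper names, and the stand-in objects and example.com URLs are made up for illustration so that it runs offline. It relies only on the `.history`, `.status_code` and `.url` attributes that a requests response exposes.

```python
from types import SimpleNamespace


def redirect_chain(response):
    """Return [(status_code, url), ...] for every hop, ending with the final response.

    Works on any object exposing .history, .status_code and .url,
    which is the shape of the response returned by requests.
    """
    hops = [(hop.status_code, hop.url) for hop in response.history]
    hops.append((response.status_code, response.url))
    return hops


def was_http_redirected(response):
    """True if at least one HTTP-level (3XX) redirect was followed."""
    return len(response.history) > 0


# Offline stand-ins (hypothetical values, not real server output).
first_hop = SimpleNamespace(status_code=301, url="https://example.com/old", history=[])
final = SimpleNamespace(status_code=200, url="https://example.com/new", history=[first_hop])
no_redirect = SimpleNamespace(status_code=200, url="https://example.com/page", history=[])

for status, url in redirect_chain(final):
    print(status, url)
print("redirected:", was_http_redirected(final))
print("redirected:", was_http_redirected(no_redirect))
```

With the snippet above, the same helper could be called as `redirect_chain(r)`. If the history is empty and the status is 200, then no HTTP redirect happened for that URL, so the address a browser ends up showing may have been changed some other way, for example by JavaScript running in the page, which requests does not execute.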
The updated URLs I expect, as shown in a web browser, are:
https://digi.kansalliskirjasto.fi/aikakausi/binding/498491?term=1864&term=SUOMI&page=1
https://twitter.com/ozanbayram01
https://www.youtube.com/watch?v=dQw4w9WgXcQ # still different from requests updated url
How can I get the updated URLs in Python in such scenarios?
Cheers,