import requests

r = requests.get('http://techtv.mit.edu/videos/1585-music-session-02/download.source')
# Print the URL of each intermediate (redirect) response
for i in r.history:
    print(i.url)

I expected this to print the redirect history, but it doesn't. The URL above points to a video, but I cannot get its final address. Can anyone help? Thank you.

1a1a11a

1 Answer


To get the final URL after you've been redirected, use r.url.

import requests

r = requests.get('https://youtu.be/dQw4w9WgXcQ')
print(r.url)  # https://www.youtube.com/watch?v=dQw4w9WgXcQ&feature=youtu.be

r.history holds the responses that came before the final one, so it contains only your original URL because you were redirected only once.
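To see how r.history and r.url fit together, here is a minimal sketch that joins them into one list of visited URLs. The helper names `redirect_chain` and `fetch_redirect_chain` are made up for illustration and are not part of requests; the only requests features assumed are `requests.get`, `Response.history`, and `Response.url`.

```python
def redirect_chain(response):
    """Return every URL visited, oldest first, ending with the final one.

    `response` is expected to look like a requests.Response: a `.history`
    list of earlier responses and a `.url` attribute for the final URL.
    """
    return [hop.url for hop in response.history] + [response.url]


def fetch_redirect_chain(url, timeout=10):
    """Hypothetical helper: GET `url`, following redirects, and return the chain."""
    # Third-party dependency, imported here so redirect_chain() works without it.
    import requests

    response = requests.get(url, timeout=timeout)
    return redirect_chain(response)
```

Calling `fetch_redirect_chain('https://youtu.be/dQw4w9WgXcQ')` should list the short link first and the final URL last, provided the site still redirects the way it did when this answer was written. If no redirect happens, the list contains only the URL you requested.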

Turun
Aaron Christiansen
  • 1
    It really helps me with Google News scraping – Rajesh Wolf Aug 12 '18 at 13:50
  • failed for this "https://www.google.com/url?rct=j&sa=t&url=http://news.cmlviz.com/2018/11/12/the-repeating-pattern-in-wipro-limited-that-triggers-right-after-an-earnings-beat-and-the-option-trade-thatfollows.html&ct=ga&cd=CAIyGmMyMzNjMDg2NDBlY2NhMDE6Y29tOmVuOlVT&usg=AFQjCNF14hJZaV0rXun1oZV4RJlOVF6eaA" – brainLoop Nov 13 '18 at 10:42
  • 8
    Correct me if I am wrong, wouldn't this make a request to the redirected URL unnecessarily (waste bandwidth and time) when we only want the redirect URL string without actually making a request to the redirect URL? – ritiek Apr 07 '19 at 08:47
  • 3
    @ritiek It won't actually read any data from the socket so it won't waste bandwidth (beyond reading the header). You could do `requests.head` instead to do a HEAD request, but this doesn't always work, as it depends on if the server is set up properly. – Artyer Apr 12 '19 at 19:42
  • 15
    This doesn't work for me, it just prints out the original MIT link. Has something in `requests` changed in the last 3 years ... ? – Max von Hippel Aug 14 '19 at 23:27
  • @MaxvonHippel Does it not work at all, or only on the example MIT link? The MIT link relies on JavaScript to do anything now and therefore doesn't redirect when used with requests. It still works with links that redirect via the header/HTTP status code 30X. I'll edit the answer with a working link. – Turun Feb 07 '21 at 11:47
  • @TurunAmbartanen I'm really sorry but I don't remember anymore .. maybe one of the people who up-voted my comment can explain? (Sorry to be so unhelpful!) – Max von Hippel Feb 07 '21 at 21:58
  • It still works. Thank you so much! – DANIEL ROSAS PEREZ Dec 21 '21 at 17:18
  • Check my solution using the webbrowser library here: https://stackoverflow.com/questions/62503861/to-get-redirected-url-with-requests/70869177#70869177 – Shahin Shirazi Jan 26 '22 at 19:39
  • @ritiek You're right, this method always "wastes" one extra request, but that request is needed to check for further redirections. If you know the exact number of redirects, make the first request and then use `r.next.url` to get the redirect URL. If needed, create a session and send `r.next` n-1 times. – freshpasta Mar 08 '22 at 10:31
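Several comments above ask how to get the redirect target without actually requesting it. One possible sketch: send a HEAD request with `allow_redirects=False` and read the `Location` header yourself. The function name `redirect_target`, the `http` parameter, and the `REDIRECT_CODES` set are assumptions made for this example, not part of requests. As noted in the comments, some servers handle HEAD poorly, and JavaScript-based redirects will not show up at all.

```python
from urllib.parse import urljoin

# HTTP status codes that carry a redirect target in the Location header.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def redirect_target(url, http=None, timeout=10):
    """Return where `url` redirects to, or None if the server does not redirect.

    Only `url` itself is requested; the target is never contacted.
    `http` is any object with a requests-style `head()` method (the requests
    module or a requests.Session); it defaults to the requests module.
    """
    if http is None:
        import requests  # third-party, imported lazily
        http = requests

    response = http.head(url, allow_redirects=False, timeout=timeout)
    location = response.headers.get("Location")
    if response.status_code not in REDIRECT_CODES or not location:
        return None
    # Location may be relative, so resolve it against the requested URL.
    return urljoin(url, location)
```

To follow a chain one hop at a time, call `redirect_target` again on its own result until it returns None. Passing a `requests.Session()` as `http` reuses the connection between hops.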