Here is my code to get the redirect URL. It's for educational purposes. I suspect the request is being detected as a bot, because the website responds with a reCAPTCHA even though I use a fake User-Agent and a proxy. Instead of getting the redirect target, I get back the same URL I passed to requests.get. Any idea how to solve this?

import requests
from fake_useragent import UserAgent

ua = UserAgent()
hdr = {'User-Agent': ua.random,
       'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
       'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
       'Accept-Encoding': 'none',
       'Accept-Language': 'en-US,en;q=0.8',
       'Connection': 'keep-alive'}
PROXY = {"http": "http://X.X.X.X:YYYY"}
url = "https://avxhm.se/go/6074475/0/"
response = requests.get(url, allow_redirects=True, headers=hdr, proxies=PROXY)
print(response.url)
Thanh Huy Le
  • Does this answer your question? [Python Requests library redirect new url](https://stackoverflow.com/questions/20475552/python-requests-library-redirect-new-url) – will-hedges Apr 21 '21 at 20:36
  • Sorry, that's not the answer. I'm trying to find a way to bypass the bot detection. My code works well with other sites, just not with this specific site. – Thanh Huy Le Apr 21 '21 at 20:41

1 Answer

One trick is to make the call through a requests.Session(). When a plain request with headers does not work, a session can come in handy, since it persists cookies across requests.

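import requests

url = 'https://avxhm.se/go/6074475/0/'

# Minimal custom User-Agent header
user_agent = {'User-agent': '14.0.3 Safari'}

# A Session keeps cookies between requests
session = requests.Session()
r1 = session.get(url, headers=user_agent)
print(r1.url)  # final URL after any redirects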
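
One thing worth checking, independent of the session trick: the question's `PROXY` dict only has an `"http"` key, while the URL is `https://`. requests picks a proxy by the URL's scheme, so that request would likely not go through the proxy at all. Below is a minimal sketch, assuming a proxy that can handle both schemes. `build_proxies`, `proxy_for`, and `fetch_final_url` are hypothetical helper names (not part of requests), and `proxy_for` is only a simplified illustration of the lookup, not the library's actual code.

```python
from urllib.parse import urlsplit


def build_proxies(proxy_url):
    """Map both URL schemes to the same proxy (proxy_url is a placeholder)."""
    return {"http": proxy_url, "https": proxy_url}


def proxy_for(url, proxies):
    """Simplified illustration of a scheme-keyed proxy lookup."""
    return proxies.get(urlsplit(url).scheme)


def fetch_final_url(url, proxy_url, headers=None):
    """Follow redirects through the proxy; return (final URL, list of hops)."""
    import requests  # third-party; imported here so the helpers above stay stdlib-only

    with requests.Session() as session:
        response = session.get(
            url,
            headers=headers,
            proxies=build_proxies(proxy_url),  # covers http:// and https://
            timeout=10,
        )
        # response.history holds the intermediate redirect responses, in order
        hops = [r.url for r in response.history]
        return response.url, hops
```

With the question's dict, `proxy_for(url, PROXY)` returns `None` for the `https://` URL. If `hops` comes back empty from `fetch_final_url`, no redirect was followed, which matches the "same URL" symptom described in the question.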
simpleApp
  • I wonder whether you have tested it? It does not work for me. Honestly, I had already tried session.get, with no change in the outcome. For example, if you run this code 10 times and your IP ends up on the blacklist, they turn on reCAPTCHA. I'm aiming for a snippet that can get the final URL via a proxy every time, without any reCAPTCHA. – Thanh Huy Le Apr 21 '21 at 20:46
  • For sure, it gives me "https://icerbox.com/l8R23pmO/B0882YW2CW.epub" – simpleApp Apr 21 '21 at 20:47
  • Please try running it about 10 or 20 times; after that, if you don't have a proxy, you will get stuck forever. Even with a proxy, you still get stuck. – Thanh Huy Le Apr 21 '21 at 20:49
  • If I use Selenium to simulate the browser, it works well. But Selenium has very poor performance. That's why I tried switching to the requests.get approach. – Thanh Huy Le Apr 21 '21 at 20:53
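
The symptom described in the comments (the final URL equals the requested URL once the IP is flagged) can at least be detected in code, so the script can slow down instead of hammering the site. The sketch below does not bypass reCAPTCHA. It only assumes that a blocked response either stays on the same URL or mentions "recaptcha" in its body; that marker string, the delays, and the names `looks_blocked` and `fetch_with_backoff` are all assumptions for illustration.

```python
import time


def looks_blocked(requested_url, final_url, body):
    """Heuristic: no redirect happened, or the page mentions a CAPTCHA."""
    same_url = final_url.rstrip("/") == requested_url.rstrip("/")
    return same_url or "recaptcha" in body.lower()


def fetch_with_backoff(fetch, url, attempts=3, base_delay=5.0, sleep=time.sleep):
    """Call fetch(url) -> (final_url, body); wait longer after each blocked try.

    Returns the final URL, or None if every attempt looked blocked.
    """
    for attempt in range(attempts):
        final_url, body = fetch(url)
        if not looks_blocked(url, final_url, body):
            return final_url
        if attempt < attempts - 1:
            # Exponential backoff: base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** attempt)
    return None
```

Here `fetch` could be a small wrapper around `session.get` that returns `(response.url, response.text)`. Keeping it as a parameter makes it easy to swap in a different proxy or client per attempt.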