I need to parse the HTML of a page from a website. When I run the script on localhost, the scraping works normally; only in the deployed version do I get a 403 Forbidden. I have already tried setting the User-Agent and Referer headers, as shown below.
Note: both the site and I are in Brazil, and my code is deployed on Heroku.
Code:
import requests

# Browser-like headers I tried in order to get past the 403
header = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36",
    "referer": "https://www.guichevirtual.com.br",
}
url = "https://www.guichevirtual.com.br/passagem-de-onibus/campo-grande-ms/sao-paulo-todas-sp"
r = requests.get(url, headers=header)
print(r.text)
Output (on Heroku):
<html>
<head><title>403 Forbidden</title></head>
<body bgcolor="white">
<center><h1>403 Forbidden</h1></center>
</body>
</html>
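In case it helps with diagnosing this, here is a minimal sketch of how the two environments could be compared. It is not part of my original script: the helper names `summarize_response` and `fetch_summary` are hypothetical, and the choice of headers to inspect (`Server`, `Via`, `Content-Type`) is only my guess at what might reveal which layer is returning the 403. The idea is to run the same call on localhost and on Heroku and compare the two summaries.

```python
def summarize_response(status_code, headers, body, body_chars=200):
    """Return a small dict describing a response, for side-by-side comparison."""
    # Header names are case-insensitive, so normalize before looking them up.
    lowered = {name.lower(): value for name, value in headers.items()}
    interesting = ("Server", "Via", "Content-Type")  # assumption: useful hints
    return {
        "status": status_code,
        "blocked": status_code == 403,
        "headers": {name: lowered.get(name.lower()) for name in interesting},
        "body_start": body[:body_chars],
    }


def fetch_summary(url, headers):
    """Fetch `url` with the given request headers and summarize the response."""
    # Imported here so summarize_response can be used without requests installed.
    import requests

    r = requests.get(url, headers=headers, timeout=10)
    return summarize_response(r.status_code, r.headers, r.text)
```

Calling `fetch_summary(url, header)` with the `url` and `header` from my code above, once locally and once in the deployed app, should show whether the status and response headers differ between the two.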
Sorry for any mistakes in my English; I'm still learning.