
I have a little issue: when I want to crawl a site, I get an error like "HTTP Error 404: Not Found". I tried several ways to fix it, but none of them worked. I can't connect to the site to get the data.

from bs4 import BeautifulSoup as soup
from urllib.request import urlopen as uReq
import urllib.request

my_url = "https://tabletennis.setkacup.com/en/schedule?date=2021-08-29&hall=4&period=1"
user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0'
headers = {'User-Agent': user_agent}

# Build the request with a browser User-Agent, then open it
request = urllib.request.Request(my_url, headers=headers)
uClient = uReq(request)

1 Answer


It seems like an SSL error; if so, look here.
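If it really is an SSL verification problem, one common workaround with urllib is to pass an explicit ssl.SSLContext to urlopen. A minimal sketch (note: disabling certificate verification is for debugging only, and the actual fetch is left commented out):

```python
import ssl
import urllib.request

my_url = "https://tabletennis.setkacup.com/en/schedule?date=2021-08-29&hall=4&period=1"
user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0'
headers = {'User-Agent': user_agent}

# Context that skips certificate verification -- debugging only,
# do not use this in production code
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

request = urllib.request.Request(my_url, headers=headers)
# urlopen accepts the context via its `context` keyword:
# html = urllib.request.urlopen(request, context=ctx).read()
```

If this makes the error go away, the real fix is usually updating the system's certificate store (or the certifi package) rather than keeping verification disabled.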

Alternatively, you can try the requests library:

pip install requests

import requests

my_url = "https://tabletennis.setkacup.com/en/schedule?date=2021-08-29&hall=4&period=1"
user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0'
headers = {'User-Agent': user_agent}

# requests.get is shorthand for requests.request(method="GET", ...)
response = requests.get(my_url, headers=headers)
print(response.content)

  • That definitely works, but the body is just a reference to a JavaScript file. This SO question seems relevant: https://stackoverflow.com/questions/16157719/how-to-follow-a-redirect-with-urllib – Marcello Romani Aug 29 '21 at 12:43