I am trying to get the players' fixtures from this website, but when I fetch the page with requests.get, my code returns None.

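import requests
from bs4 import BeautifulSoup

def get_fixtures():
    r = requests.get("http://www.fplstatistics.co.uk/")
    # `compiled` is a regex pattern compiled elsewhere in my code
    soup = BeautifulSoup(compiled.sub("", r.text), "lxml")
    return soup.find("span", {"class": "dtr-data"})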

1 Answer

The information you need is not contained in the HTML returned from your URL. The browser makes a separate call to fetch it using JavaScript, which requests does not execute.
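One quick way to confirm this is to check whether the class you are searching for appears anywhere in the raw HTML. The following is only a minimal sketch using the standard library: `ClassFinder` and `html_has_class` are hypothetical names, and the sample markup is invented to stand in for `r.text`.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Records whether any start tag carries the target CSS class."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.found = False

    def handle_starttag(self, tag, attrs):
        # The class attribute may be missing or empty, hence the `or ""`
        classes = (dict(attrs).get("class") or "").split()
        if self.target in classes:
            self.found = True


def html_has_class(html, class_name):
    finder = ClassFinder(class_name)
    finder.feed(html)
    return finder.found


# Invented markup: an empty table shell, roughly what a page can look like
# before its JavaScript has filled in the rows.
static_html = '<table class="display"><tbody></tbody></table>'
print(html_has_class(static_html, "dtr-data"))
```

If this prints False for the real `r.text`, no parser will find the element there, and the data has to come from a separate request.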

Using your browser's developer tools, you can see the request being made and the data coming back as JSON.

Unfortunately, the URL for that request needs a key and a value that are buried inside one of the script sections of the HTML. They are stored under property names written as hex escapes: if you search the HTML you will find \x6E\x61\x6D\x65 (which spells "name") and \x76\x61\x6C\x75\x65 (which spells "value").
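To see what such a hex-escaped string says, each `\xHH` pair can be converted back to its character. A small sketch, where `decode_hex_escapes` is a hypothetical helper and not part of any library:

```python
import re


def decode_hex_escapes(text):
    """Replace each \\xHH escape in text with the character it encodes."""
    return re.sub(
        r"\\x([0-9A-Fa-f]{2})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )


print(decode_hex_escapes(r"\x6E\x61\x6D\x65"))      # name
print(decode_hex_escapes(r"\x76\x61\x6C\x75\x65"))  # value
```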

A regular expression can be used to extract the key and value needed to make the call. With this, a second requests call can be made to get the JSON (the same way a browser would). I suggest you print this out so you can see the structure of all the information that is returned.
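As a rough illustration of the extraction step on its own, the sketch below applies the same two patterns as this answer's code to a string and fails loudly if either is missing. `build_json_url` is a hypothetical helper, and the sample fragment is invented; the real key and value will differ.

```python
import re

# The hex escapes spell "name" and "value"
NAME_RE = re.compile(r'"\\x6E\\x61\\x6D\\x65":"(.*?)"')
VALUE_RE = re.compile(r'"\\x76\\x61\\x6C\\x75\\x65":(.*?)}')
BASE_URL = "http://www.fplstatistics.co.uk/Home/AjaxPricesIHandler"


def build_json_url(page_html):
    """Extract the key/value pair from the page source and build the JSON URL."""
    name = NAME_RE.search(page_html)
    value = VALUE_RE.search(page_html)
    if name is None or value is None:
        raise ValueError("key/value pair not found; the page may have changed")
    return f"{BASE_URL}?{name.group(1)}={value.group(1)}&pyseltype=0"


# Invented fragment for illustration only
sample = r'{"\x6E\x61\x6D\x65":"examplekey","\x76\x61\x6C\x75\x65":12345}'
print(build_json_url(sample))
```

Checking for a failed match up front gives a clearer error than the AttributeError you would get from calling .group(1) on None.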

The following should work:

import re
import requests

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36"}
s = requests.Session()
# Fetch the main page; the key and value are embedded in one of its script sections
req_main = s.get("http://www.fplstatistics.co.uk/", headers=headers)

# "\x6E\x61\x6D\x65" and "\x76\x61\x6C\x75\x65" are the hex-escaped strings "name" and "value"
k = re.search(r'"\\x6E\\x61\\x6D\\x65":"(.*?)"', req_main.text).group(1)
v = re.search(r'"\\x76\\x61\\x6C\\x75\\x65":(.*?)}', req_main.text).group(1)

# Request the JSON the same way the browser does
url_json = f"http://www.fplstatistics.co.uk/Home/AjaxPricesIHandler?{k}={v}&pyseltype=0"
req_json = s.get(url_json, headers=headers)
# The fixtures are the last item in each row of "aaData"
fixtures = [fixture[-1] for fixture in req_json.json()["aaData"]]

for fixture in fixtures:
    print(fixture)

This gives you output starting:

Aston Villa(H) Leicester(A) Watford(H) Liverpool(A) 
Aston Villa(H) Leicester(A) Watford(H) Liverpool(A) 
Aston Villa(H) Leicester(A) Watford(H) Liverpool(A)
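
If you want each fixture as structured data instead of one string, the line can be split into (team, venue) pairs. This is only a sketch: `parse_fixtures` is a hypothetical helper, and it assumes every entry looks like the output above, a team name followed by (H) or (A). Other formats the site may use are not handled.

```python
import re

# A team name (possibly containing spaces) followed by (H) or (A)
FIXTURE_RE = re.compile(r"\s*(.+?)\((H|A)\)")


def parse_fixtures(line):
    """Split 'Team(H) Team(A) ...' into a list of (team, venue) pairs."""
    return [(team.strip(), venue) for team, venue in FIXTURE_RE.findall(line)]


print(parse_fixtures("Aston Villa(H) Leicester(A) Watford(H) Liverpool(A) "))
# [('Aston Villa', 'H'), ('Leicester', 'A'), ('Watford', 'H'), ('Liverpool', 'A')]
```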
Martin Evans