Python requests.get not returning all elements from website

Question

I am trying to get the players' fixtures from this website, but when I use requests.get, my code returns None.

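import requests
from bs4 import BeautifulSoup

# Excerpt from inside a function; `compiled` is presumably a compiled regex defined elsewhere (not shown).
r = requests.get("http://www.fplstatistics.co.uk/")
soup = BeautifulSoup(compiled.sub("", r.text), 'lxml')
allFixtures = soup.find("span", {"class": "dtr-data"})
return allFixtures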

1. I found the tag using the text you provided but there is no _class_. 2. What data exactly do you want? the entire table or just the Names? — MendelG, Oct 19 '21 at 21:04
Just the names of the fixtures. I want it to return "Leicester(H) Burnley(A) Norwich(H) Newcastle(A)" — louis besson, Oct 19 '21 at 21:11
The site appears to be rendered dynamically, so it might require `selenium` or the API endpoints. — kite, Oct 20 '21 at 06:58

Martin Evans · Answer 1 · 2021-10-20T14:29:32.167

The information you need is not contained in the HTML returned from your URL. The browser makes a further call via JavaScript to fetch it, and requests does not execute JavaScript.

Using your browser's developer tools, you can see the request that returns the data as JSON.

Unfortunately, the URL for that request needs a key and a value that are buried inside one of the script sections of the HTML. Both are stored under hex-escaped property names: search the HTML for \x6E\x61\x6D\x65 ("name") and \x76\x61\x6C\x75\x65 ("value") and you will find them.
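To see what those hex sequences spell, here is a minimal sketch that decodes literal `\xNN` escapes. The helper name `decode_hex_escapes` is hypothetical and is not part of the site or of requests; the two inputs are the sequences searched for in the answer's code.

```python
import re


def decode_hex_escapes(text: str) -> str:
    """Replace each literal \\xNN sequence with the character it encodes."""
    return re.sub(
        r"\\x([0-9A-Fa-f]{2})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )


# Raw strings keep the backslashes literal, as they appear in the page source.
print(decode_hex_escapes(r"\x6E\x61\x6D\x65"))      # name
print(decode_hex_escapes(r"\x76\x61\x6C\x75\x65"))  # value
```

Decoding is only useful for reading the page source. The extraction itself matches the escaped text as it stands.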

A regular expression can be used to extract the key and value needed to make the call. With this, a second requests call can be made to get the JSON (the same way a browser would). I suggest you print this out so you can see the structure of all the information that is returned.
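The extraction step can be tried offline first. The sketch below runs the same two patterns against a made-up stand-in for the script section. `sample_html`, its `abc123` and `42` contents, and the `extract_key_value` helper are all assumptions for illustration; the real page's key and value will differ.

```python
import re
from typing import Tuple

# Made-up stand-in for the script section; the real key and value differ.
sample_html = r'var p = {"\x6E\x61\x6D\x65":"abc123","\x76\x61\x6C\x75\x65":42};'


def extract_key_value(html: str) -> Tuple[str, str]:
    """Return the text stored under the hex-escaped "name" and "value" keys."""
    key_match = re.search(r'"\\x6E\\x61\\x6D\\x65":"(.*?)"', html)
    value_match = re.search(r'"\\x76\\x61\\x6C\\x75\\x65":(.*?)}', html)
    if key_match is None or value_match is None:
        # Fail loudly instead of raising AttributeError on .group(1).
        raise ValueError("key/value pair not found; the page may have changed")
    return key_match.group(1), value_match.group(1)


key, value = extract_key_value(sample_html)
print(key, value)  # abc123 42
```

Checking for a failed match before calling `.group(1)` gives a clearer error if the site changes how it embeds these values.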

The following should work:

import re

import requests

# Send a browser-like User-Agent with every request.
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36"}
s = requests.Session()
req_main = s.get("http://www.fplstatistics.co.uk/", headers=headers)

# The page's script stores the query key and value under hex-escaped names:
# \x6E\x61\x6D\x65 is "name" and \x76\x61\x6C\x75\x65 is "value".
k = re.search(r'"\\x6E\\x61\\x6D\\x65":"(.*?)"', req_main.text).group(1)
v = re.search(r'"\\x76\\x61\\x6C\\x75\\x65":(.*?)}', req_main.text).group(1)

# Request the JSON the same way the browser does.
url_json = f"http://www.fplstatistics.co.uk/Home/AjaxPricesIHandler?{k}={v}&pyseltype=0"
req_json = s.get(url_json, headers=headers)

# The fixtures string is the last item of each row in "aaData".
fixtures = [fixture[-1] for fixture in req_json.json()["aaData"]]

for fixture in fixtures:
    print(fixture)

This gives output starting with:

Aston Villa(H) Leicester(A) Watford(H) Liverpool(A) 
Aston Villa(H) Leicester(A) Watford(H) Liverpool(A) 
Aston Villa(H) Leicester(A) Watford(H) Liverpool(A)
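If you need each fixture separately rather than as one string, splitting on spaces would break team names such as "Aston Villa". A minimal sketch, assuming every fixture ends in `(H)` or `(A)` as in the output above; the `split_fixtures` helper is hypothetical:

```python
import re
from typing import List


def split_fixtures(fixture_text: str) -> List[str]:
    """Split 'Team(H) Team(A) ...' into one entry per fixture."""
    # Match everything up to and including each (H) or (A) marker.
    parts = re.findall(r"[^()]+\([HA]\)", fixture_text)
    return [part.strip() for part in parts]


line = "Aston Villa(H) Leicester(A) Watford(H) Liverpool(A) "
print(split_fixtures(line))
# ['Aston Villa(H)', 'Leicester(A)', 'Watford(H)', 'Liverpool(A)']
```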
