I am learning how to use the `json` module in Python from this tutorial: http://www.fantasyfutopia.com/python-for-fantasy-football-apis-and-json-data/

When I run the code copied from the tutorial (changing only the URL), I get the error shown below the code:

import pandas as pd
import json
import requests
from pandas.io.json import json_normalize


# Define a function to get info from the FPL API and save to the specified file_path
# It might be a good idea to navigate to the link in a browser to get an idea of what the data looks like
def get_json(file_path):
    response = requests.get("https://understat.com/league/EPL")
    jsonResponse = response.json()
   
    with open(file_path, 'w') as outfile:
        json.dump(jsonResponse, outfile)
        
# Run the function and choose where to save the json file
get_json('C:\\Python Advanced/python-for-fantasy-football-master\\python-for-fantasy-football-master\\4 - APIs and JSON Data\\EPL_team.json')
 
# Open the json file and print a list of the keys
with open('C:\\Python Advanced/python-for-fantasy-football-master\\python-for-fantasy-football-master\\4 - APIs and JSON Data\\EPL_team.json') as json_data:
    data = json.load(json_data)
    print(list(data.keys()))
---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
<ipython-input-29-6322b86c0ee6> in <module>
     15 
     16 # Run the function and choose where to save the json file
---> 17 get_json('C:\\Python Advanced/python-for-fantasy-football-master\\python-for-fantasy-football-master\\4 - APIs and JSON Data\\EPL_team.json')
     18 
     19 # Open the json file and print a list of the keys

<ipython-input-29-6322b86c0ee6> in get_json(file_path)
      9 def get_json(file_path):
     10     response = requests.get("https://understat.com/league/EPL")
---> 11     jsonResponse = response.json()
     12 
     13     with open(file_path, 'w') as outfile:

C:\ProgramData\Anaconda3\lib\site-packages\requests\models.py in json(self, **kwargs)
    896                     # used.
    897                     pass
--> 898         return complexjson.loads(self.text, **kwargs)
    899 
    900     @property

C:\ProgramData\Anaconda3\lib\json\__init__.py in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    355             parse_int is None and parse_float is None and
    356             parse_constant is None and object_pairs_hook is None and not kw):
--> 357         return _default_decoder.decode(s)
    358     if cls is None:
    359         cls = JSONDecoder

C:\ProgramData\Anaconda3\lib\json\decoder.py in decode(self, s, _w)
    335 
    336         """
--> 337         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    338         end = _w(s, end).end()
    339         if end != len(s):

C:\ProgramData\Anaconda3\lib\json\decoder.py in raw_decode(self, s, idx)
    353             obj, end = self.scan_once(s, idx)
    354         except StopIteration as err:
--> 355             raise JSONDecodeError("Expecting value", s, err.value) from None
    356         return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I am still new to JSON. My aim is to get the team data from this URL. Any help would be appreciated.

  • Maybe try printing `r.content` to see what your request is actually returning. – Chris Dec 04 '20 at 21:21
  • Look at the data, Luke. Clearly the content you are receiving _isn’t_ JSON, so examine the response content; it probably contains a message from the website. – DisappointedByUnaccountableMod Dec 04 '20 at 21:22
  • The content I am receiving is HTML, so should I be using BeautifulSoup? The data I want is in `var datesData = JSON.parse('\x5B\x7B\x22id\x22\x3........`. Any idea how I should go about it? – Tales_of_SS Dec 04 '20 at 21:34
  • I think I solved the issue by using this code:
```
response = requests.get("https://understat.com/league/EPL")
playersData = re.search(r"playersData\s+=\s+JSON\.parse\('([^']+)", response.text)
#print(response.content)
#xg_soup = BeautifulSoup(driver.page_source, 'lxml')
decoded_string = bytes(playersData.groups()[0], 'utf-8').decode('unicode_escape')
playerObj = json.loads(decoded_string)
print(playerObj)
```
    This gives me all the data, which I still have to format into a table. – Tales_of_SS Dec 04 '20 at 21:51
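
The comments above suggest checking what the server actually returned before calling `response.json()`. A minimal sketch of that check follows. `describe_body` is a hypothetical helper (it is not part of `requests`), and it takes the `Content-Type` header and the body text as plain strings so it can be tried without a network call:

```python
import json


def describe_body(content_type, text):
    """Return ('json', parsed) if the body parses as JSON, else ('other', preview).

    Hypothetical helper for illustration; with requests you would pass
    response.headers.get("Content-Type", "") and response.text.
    """
    try:
        return "json", json.loads(text)
    except json.JSONDecodeError:
        # Not JSON (for example an HTML page): return the declared content
        # type and the start of the body so they can be inspected.
        preview = text[:80]
        return "other", f"{content_type or 'unknown type'}: {preview}"


# An HTML body, like the one the question's URL returns, is reported as 'other'
kind, detail = describe_body("text/html; charset=UTF-8", "<!DOCTYPE html><html>...</html>")
print(kind, detail)

# A real JSON body is parsed and returned
kind_ok, parsed = describe_body("application/json", '{"teams": []}')
print(kind_ok, parsed)
```

If the result is `'other'` and the preview starts with HTML, the data has to be extracted from the page rather than read with `response.json()`.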

1 Answer

Found the solution here: Issue with scraping Understat chart data using Selenium

import json
import re

import pandas as pd
import requests

# Understat does not serve JSON directly; the data sits in the page's HTML inside JSON.parse('...') calls.
# The code below pulls out the players data and loads it into a pandas DataFrame.

response = requests.get("https://understat.com/league/EPL")  # fetch the HTML page

# Capture the escaped string passed to JSON.parse() for playersData
playersData = re.search(r"playersData\s+=\s+JSON\.parse\('([^']+)", response.text)
decoded_string = bytes(playersData.groups()[0], 'utf-8').decode('unicode_escape')  # turn \x5B-style escapes into characters
playerObj = json.loads(decoded_string)  # json.loads() takes a JSON string and returns Python objects

playersDataDF = pd.json_normalize(playerObj)  # flatten the parsed records into a pandas DataFrame

#print(playersDataDF)
playersDataDF
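
The same pattern can be wrapped in a function so that other variables embedded in the page can be pulled out by name, which may help with the team data the question asks for. This is a sketch only: `extract_embedded_json` is a hypothetical helper, the sample HTML is made up for illustration, and the variable name `teamsData` is an assumption about the Understat page (the question only shows `datesData` and `playersData`), so check the page source for the actual name.

```python
import json
import re


def extract_embedded_json(html, var_name):
    """Find `var_name = JSON.parse('...')` in html and return the parsed data."""
    pattern = re.escape(var_name) + r"\s+=\s+JSON\.parse\('([^']+)'"
    match = re.search(pattern, html)
    if match is None:
        raise ValueError(f"{var_name} not found in the page")
    # The page stores the JSON with \xNN escapes; unicode_escape turns them
    # back into ordinary characters before json.loads() parses the string.
    decoded = match.group(1).encode("utf-8").decode("unicode_escape")
    return json.loads(decoded)


# Made-up sample standing in for response.text (not real Understat data)
sample_html = r"""<script>
var teamsData = JSON.parse('\x7B\x2289\x22\x3A\x7B\x22title\x22\x3A\x22Example FC\x22\x7D\x7D');
</script>"""

teams = extract_embedded_json(sample_html, "teamsData")
print(teams)
```

With the live page, `extract_embedded_json(response.text, "playersData")` should return the same data as `playerObj` above. Raising `ValueError` when the variable is missing avoids the less helpful `AttributeError` that `re.search(...)` returning `None` would otherwise cause.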