I was playing with some data from the Minor League Cricket site and wrote a quick script to grab some basic player data and assign it to columns.
Starting with this DataFrame (testdf):
              Player
    0  Liam Plunkett
    1   Jonathan Foo
    2   Milind Kumar
    3  Carmi Le Roux
    4  Derick Narine
I then wrote the following quick script to grab the data:
    def getPlayerBasicInfo(row):
        print('running function')
        data = requests.post("https://api.cricclubs.com/MiLC/searchPlayer.do",data={'firstName':row['Player']}) #make request
        elements = soupify(data.text)
        try: #if player exists then set player data else return nones
            table = elements.find(id="playersData").find('tbody').find('tr')
            found = True
        except:
            found = False
            playerURL = None
            Role = None
            Team = None
        if found == True:
            playerURL = table.find_all('th')[1].find('a')['href'].strip()
            Role = table.find_all('th')[2].text.strip()
            if table.find_all('td')[1].text.strip() != "":
                Team = table.find_all('td')[1].text.strip()
            else:
                Team = 'Blank'
        toReturn = [playerURL,Role,Team] #variables to return
        print(f"For {row['Player']} sending {len(toReturn)} items :{toReturn}")
        return toReturn

    testdf[['URL','Role','Team']] = testdf.apply(getPlayerBasicInfo,axis='columns')
    testdf
This then returns the following:
    running
    For Liam Plunkett sending 3 items :['/MiLC/viewPlayer.do?playerId=2712978&clubId=18036', 'All Rounder', 'Blank']
    running
    For Jonathan Foo sending 3 items :['/MiLC/viewPlayer.do?playerId=2179460&clubId=18036', 'Batsman', 'The Philadelphians']
    running
    For Milind Kumar sending 3 items :['/MiLC/viewPlayer.do?playerId=2720813&clubId=18036', 'Batsman', 'The Philadelphians']
    running
    For Carmi Le Roux sending 3 items :['/MiLC/viewPlayer.do?playerId=2611097&clubId=18036', 'All Rounder', 'East Bay Blazers']
    running
    For Derick Narine sending 3 items :['/MiLC/viewPlayer.do?playerId=2175608&clubId=18036', 'Batsman', 'The Philadelphians']
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    Untitled-1.ipynb Cell 4 in <cell line: 36>()
         33     print(f"For {row['Player']} sending {len(toReturn)} items :{toReturn}")
         34     return toReturn
    ---> 36 testdf[['URL','Role','Team']] = testdf.apply(getPlayerBasicInfo,axis='columns')
         38 testdf
    [SOME OTHER ERROR LINES]
    ValueError: Columns must be same length as key
My understanding is that the "Columns must be same length as key" error occurs when the number of columns returned doesn't match the number of columns I'm assigning to, but I don't see how that is the case here. The function reports that it returns three items on every call, yet the assignment still raises the error at the end.
Thanks for any help, I'm still getting used to Pandas.
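For anyone trying to reproduce this without hitting the API, here is a minimal sketch of what appears to be happening. It assumes a reasonably recent pandas, and fake_player_info is a hypothetical stand-in for getPlayerBasicInfo that returns fixed values instead of making a request. With the default settings, apply with axis='columns' seems to keep each returned list as a single object, giving one Series of lists rather than three columns, which would explain the length mismatch. Passing result_type='expand' is one documented way to spread the list items over separate columns.

```python
import pandas as pd

# Hypothetical stand-in for getPlayerBasicInfo: no network call, fixed values.
def fake_player_info(row):
    return ['/some/url', 'Batsman', 'Blank']

testdf = pd.DataFrame({'Player': ['Liam Plunkett', 'Jonathan Foo', 'Milind Kumar',
                                  'Carmi Le Roux', 'Derick Narine']})

# Default behaviour: each returned list is stored as one object in a Series,
# so the result has one entry per row and no separate columns.
as_series = testdf.apply(fake_player_info, axis='columns')
print(type(as_series).__name__, as_series.shape)

# result_type='expand' turns each list into one row of a DataFrame,
# with one column per list item.
expanded = testdf.apply(fake_player_info, axis='columns', result_type='expand')
print(type(expanded).__name__, expanded.shape)

# The three-column result can now be assigned to three new column names.
testdf[['URL', 'Role', 'Team']] = expanded
print(testdf)
```

Returning a pd.Series from the function instead of a list should also expand into columns, but that is a design choice rather than a requirement.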