I was playing with some data from the Minor League Cricket site and wrote a quick script to grab some basic player data and assign it to columns.

Starting with the df

    Player
0   Liam Plunkett
1   Jonathan Foo
2   Milind Kumar
3   Carmi Le Roux
4   Derick Narine

I then wrote the following quick script to grab the data

import requests

# soupify: helper not shown here, assumed to return a BeautifulSoup object

def getPlayerBasicInfo(row):
    print('running function')
    # search for the player by name
    data = requests.post("https://api.cricclubs.com/MiLC/searchPlayer.do",data={'firstName':row['Player']})
    elements = soupify(data.text)
    try: # if the player exists, take the first result row; otherwise return Nones
        table = elements.find(id="playersData").find('tbody').find('tr')
        found = True
    except AttributeError: # a .find() returned None, so nothing was found
        found = False
        playerURL = None
        Role = None
        Team = None
    if found:
        playerURL = table.find_all('th')[1].find('a')['href'].strip()
        Role = table.find_all('th')[2].text.strip()
        if table.find_all('td')[1].text.strip() != "":
            Team = table.find_all('td')[1].text.strip()
        else:
            Team = 'Blank'
    toReturn = [playerURL,Role,Team] # values to return
    print(f"For {row['Player']} sending {len(toReturn)} items :{toReturn}")
    return toReturn

testdf[['URL','Role','Team']] = testdf.apply(getPlayerBasicInfo,axis='columns')

testdf

This then returns the following:

running
For Liam Plunkett sending 3 items :['/MiLC/viewPlayer.do?playerId=2712978&clubId=18036', 'All Rounder', 'Blank']
running
For Jonathan Foo sending 3 items :['/MiLC/viewPlayer.do?playerId=2179460&clubId=18036', 'Batsman', 'The Philadelphians']
running
For Milind Kumar sending 3 items :['/MiLC/viewPlayer.do?playerId=2720813&clubId=18036', 'Batsman', 'The Philadelphians']
running
For Carmi Le Roux sending 3 items :['/MiLC/viewPlayer.do?playerId=2611097&clubId=18036', 'All Rounder', 'East Bay Blazers']
running
For Derick Narine sending 3 items :['/MiLC/viewPlayer.do?playerId=2175608&clubId=18036', 'Batsman', 'The Philadelphians']
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Untitled-1.ipynb Cell 4 in <cell line: 36>()
     33     print(f"For {row['Player']} sending {len(toReturn)} items :{toReturn}")
     34     return toReturn
---> 36 testdf[['URL','Role','Team']] = testdf.apply(getPlayerBasicInfo,axis='columns')
     38 testdf
[SOME OTHER ERROR LINES]
ValueError: Columns must be same length as key

My understanding is that the `Columns must be same length as key` error occurs when the number of columns returned doesn't match the number of columns I'm assigning to, but I don't see how that is the case here. The script says it's returning three items, and has done so every time, yet it still produces an error at the end.
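One way to check what is actually being assigned is to look at the object `apply` hands back before it reaches the column assignment. This is only a minimal sketch: `fake_lookup` is a hypothetical stand-in that returns a three-item list like `getPlayerBasicInfo` does, but makes no request.

```python
import pandas as pd

def fake_lookup(row):
    # Hypothetical stand-in for getPlayerBasicInfo: same return shape, no HTTP request.
    return [f"/player/{row['Player']}", "Batsman", "Blank"]

testdf = pd.DataFrame({"Player": ["Liam Plunkett", "Jonathan Foo", "Milind Kumar",
                                  "Carmi Le Roux", "Derick Narine"]})

# With the default result_type, list results are not expanded into columns.
result = testdf.apply(fake_lookup, axis="columns")

print(type(result).__name__)  # Series
print(result.shape)           # (5,)
print(len(result.iloc[0]))    # 3
```

So each row does return three items, but they arrive as a single Series whose elements are lists, not as three columns.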

Thanks for any help, I'm still getting used to Pandas.

gdhp
  • Does this answer your question? [Return multiple columns from pandas apply()](https://stackoverflow.com/questions/23586510/return-multiple-columns-from-pandas-apply) You can try returning a Series or setting `result_type="expand"`. – Shaido Aug 17 '22 at 06:44
  • Hi, thanks. This worked with the expand method. It also made me realize one other quirk of pandas: it ran the function for all rows before trying to actually set the values, which is why it appeared to run fine until the last entry, even though it was actually failing on all of them. – gdhp Aug 17 '22 at 15:08
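
For reference, a minimal sketch of the two options mentioned in the comments. It assumes a hypothetical stand-in, `fake_lookup`, in place of the real request function, and the `other` frame and `fake_lookup_series` wrapper are likewise only for illustration.

```python
import pandas as pd

def fake_lookup(row):
    # Hypothetical stand-in for getPlayerBasicInfo: same return shape, no HTTP request.
    return [f"/player/{row['Player']}", "Batsman", "Blank"]

testdf = pd.DataFrame({"Player": ["Liam Plunkett", "Jonathan Foo"]})

# Option 1: result_type="expand" turns each returned list into one column per item.
testdf[["URL", "Role", "Team"]] = testdf.apply(
    fake_lookup, axis="columns", result_type="expand"
)

# Option 2: return a Series, which apply expands into columns by default.
def fake_lookup_series(row):
    return pd.Series(fake_lookup(row), index=["URL", "Role", "Team"])

other = pd.DataFrame({"Player": ["Milind Kumar", "Carmi Le Roux"]})
other[["URL", "Role", "Team"]] = other.apply(fake_lookup_series, axis="columns")
```

Option 1 leaves the function untouched, which is probably the smaller change here. Option 2 names the columns inside the function, which can be handy if it is reused elsewhere.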
