
I have a DataFrame filled with different types: string, int, list. I'm trying to convert every element of each column that contains a list no longer than one element to a string. I am aware of the apply method in combination with a lambda function, yet I don't seem to fully grasp it. The data was imported from a JSON file with json.load(XXX) and split into different DataFrames, also using json_normalize.

DataFrame "infos":
name     town     link                             number
Bob      NY       ["https://www.bobsite.com"]      00184747328859
Alice    MI       ["https://www.alicesite.com"]    00198309458093

Python code:

infos = infos.apply(lambda x: x[0])

# or if just accessing one column
infos = infos.link.apply(lambda x: x[0])

In general, this doesn't seem to be the right way to handle it.

I would expect this to be the new DataFrame:

DataFrame "infos":
name     town     link                          number
Bob      NY       https://www.bobsite.com       00184747328859
Alice    MI       https://www.alicesite.com     00198309458093
Valentino
    Possible duplicate of [Apply function to each cell in DataFrame](https://stackoverflow.com/questions/39475978/apply-function-to-each-cell-in-dataframe) – Valentino Jul 29 '19 at 10:56

2 Answers


Looks like you need df.applymap with a custom function.

Ex:

import pandas as pd

df = pd.DataFrame({'name':["Bob", "Alice"],
                   'town': ["NY", "MI"],
                   'link': [["https://www.bobsite.com"], ["https://www.alicesite.com"]],
                   'number': ["00184747328859", "00198309458093"]})


def cust_func(value):
    # applymap calls this once per cell, so `value` is a single element
    if isinstance(value, list) and len(value) == 1:   # one-element list
        return value[0]
    return value


df = df.applymap(cust_func) 
print(df)

Output:

                        link   name          number town
0    https://www.bobsite.com    Bob  00184747328859   NY
1  https://www.alicesite.com  Alice  00198309458093   MI
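
A note for newer pandas: from version 2.1 on, DataFrame.applymap is deprecated in favour of DataFrame.map. Below is a minimal sketch of the same idea that works with either version. The names unwrap_single, elementwise and result are only illustrative and are not part of the question.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Bob", "Alice"],
    "town": ["NY", "MI"],
    "link": [["https://www.bobsite.com"], ["https://www.alicesite.com"]],
    "number": ["00184747328859", "00198309458093"],
})


def unwrap_single(value):
    """Return the sole item of a one-element list; leave anything else untouched."""
    if isinstance(value, list) and len(value) == 1:
        return value[0]
    return value


# DataFrame.map exists from pandas 2.1 on; older versions only offer applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap
result = elementwise(unwrap_single)
print(result)
```

Because the function checks the type of every cell, the string columns pass through unchanged and only the one-element lists are unwrapped.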

If it is just one column and each list holds only one value, use .str[0]:

df = pd.DataFrame({'name':["Bob", "Alice"],
                   'town': ["NY", "MI"],
                   'link': [["https://www.bobsite.com"], ["https://www.alicesite.com"]],
                   'number': ["00184747328859", "00198309458093"]})


df["link"] = df["link"].str[0]
print(df)
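
Note that .str[0] takes the first item of every list, however long it is. If some lists in the column may hold more than one element and should stay as they are, a guarded per-column version might look like the sketch below. The "Carol" row and the example.com URLs are made up for illustration and do not come from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Bob", "Alice", "Carol"],  # "Carol" is a hypothetical extra row
    "link": [
        ["https://www.bobsite.com"],
        ["https://www.alicesite.com"],
        ["https://a.example.com", "https://b.example.com"],
    ],
})


def unwrap_single(value):
    """Unwrap one-element lists; return longer lists and non-lists unchanged."""
    if isinstance(value, list) and len(value) == 1:
        return value[0]
    return value


# Series.apply calls the function once per element of the column.
df["link"] = df["link"].apply(unwrap_single)
print(df)
```

This also shows the difference from the question's attempt: Series.apply works element by element, whereas DataFrame.apply hands each whole column to the function.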
Rakesh

Use eval if the lists are stored as strings; otherwise you can omit it.

out['link'] = out.link.apply(eval).apply(' '.join)

Output

    name town                       link        number
0    Bob   NY    https://www.bobsite.com  184747328859
1  Alice   MI  https://www.alicesite.com  198309458093
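
As a possible variation, ast.literal_eval from the standard library parses Python literals only, so it is usually a safer choice than eval when the data comes from a file. The sketch below assumes the link column holds the string representation of each list, which the question does not confirm.

```python
import ast

import pandas as pd

# Assumption: each list is stored as its string representation.
out = pd.DataFrame({
    "name": ["Bob", "Alice"],
    "link": ['["https://www.bobsite.com"]', '["https://www.alicesite.com"]'],
})

# Parse each string into a real list, then join its items into one string.
out["link"] = out["link"].apply(ast.literal_eval).apply(" ".join)
print(out)
```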
iamklaus