0

I'm trying to parse a data frame column with values that look like the below. This was the result after I ran json_normalize on the original data set. The goal is to get the {'name':'Org Lvl 4'} value so I can then parse to just get the actual 'Org Lvl 4' name.

index org [dtype: Object]
0 [{'name': 'Org Lvl 1'}, {'name': 'Org Lvl 2'}, {'name': 'Org Lvl 3'}, {'name': 'Org Lvl 4'}]

I read that Pandas stores this as a string not a list so I tried what others have suggested i.e. split but I'm getting the following error AttributeError: Can only use .str accessor with string values!

Code:

df['org'] = df['org'].str.split(',').str[3]

DataFrame:

df = pd.DataFrame({'org [dtype: Object]': {0: "[{'name': 'Org Lvl 1'}, {'name': 'Org Lvl 2'}, {'name': 'Org Lvl 3'}, {'name': 'Org Lvl 4'}]"}})

Update:

After trying this I am able to print 'Org Lvl 4' for index 0 but now I need to apply this to the entire column.

import ast
print(df['org'].astype(str).map(ast.literal_eval)[0][3].get('name'))

Any ideas?

mmera
  • 327
  • 5
  • 17

2 Answers2

0

Not a solid answer, but to offer something...

Look at the answers at: How to flatten a pandas dataframe with some columns as json?

For example:

import ast

def list_of_dicts(ld):
    '''
    Create a mapping of the tuples formed after 
    converting json strings of list to a python list   
    '''
    return dict([(list(d.values())[0], list(d.values())[0]) for d in ast.literal_eval(ld)])

df = pd.DataFrame({'org [dtype: Object]': {0: "[{'name': 'Org Lvl 1'}, {'name': 'Org Lvl 2'}, {'name': 'Org Lvl 3'}, {'name': 'Org Lvl 4'}]"}})

B = pd.json_normalize(df['org [dtype: Object]'].apply(list_of_dicts).tolist()) #.add_prefix('dict_') 

print(df, '\n\n')

print(B)

>>                                  org [dtype: Object]
>> 0  [{'name': 'Org Lvl 1'}, {'name': 'Org Lvl 2'},... 


>>    Org Lvl 1  Org Lvl 2  Org Lvl 3  Org Lvl 4
>> 0  Org Lvl 1  Org Lvl 2  Org Lvl 3  Org Lvl 4
MDR
  • 2,610
  • 1
  • 8
  • 18
0

I was able to solve using lambda:

df['org'] = df.apply(lambda row: list(row.org)[-1]['name'])
mmera
  • 327
  • 5
  • 17