1

I'm trying to extract values from dictionaries contained within a list in a pandas DataFrame. The objective is to split the id key into multiple columns. Sample data looks like this:

Column_Header
[{'id': '498', 'relTypeId': '2'},{'id': '499', 'relTypeId': '3'}]
[{'id': '499', 'relTypeId': '3'},{'id': '500', 'relTypeId': '4'},{'id': '501', 'relTypeId': '5'}]

I tried the following:

list(map(lambda x: x["id"], df["Column_Header"]))

But I get the following error: "list indices must be integers or slices, not str". The desired output is:

col1|col2|col3
498 |499 |
499 |500 |501

Can someone please help?

Akki

2 Answers

1

We can explode the column first, then create the additional key with cumcount, and pivot:

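# One row per dictionary (original index repeated); .str['id'] extracts each id
s = df.Column_Header.explode().str['id']
# cumcount numbers the ids within each original row; crosstab pivots them into columns
s = pd.crosstab(index=s.index, columns=s.groupby(level=0).cumcount(), values=s, aggfunc='sum')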
Out[133]: 
col_0    0    1    2
row_0               
0      498  499  NaN
1      499  500  501
BENY
  • Thanks Yoben, I tried it but got this error: "AttributeError: 'Series' object has no attribute 'explode'". Is the explode function part of the pandas library, or am I doing something wrong? – Akki Jun 04 '20 at 14:51
  • @Akki explode is new as of pandas 0.25.0, check your pandas version :-) – BENY Jun 04 '20 at 14:59
  • Yes, I'm on a lower version, 0.20. Unfortunately I can't update pandas as I don't have admin rights! Is there any other way you can suggest? – Akki Jun 05 '20 at 00:50
  • @Akki see my self-defined function: https://stackoverflow.com/questions/53218931/how-to-unnest-explode-a-column-in-a-pandas-dataframe/53218939#53218939 – BENY Jun 05 '20 at 00:50
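
For pandas versions without Series.explode, a minimal sketch of one fallback is below. It is not the function from the linked answer: it swaps in Series.apply with a function that returns a pd.Series, which pandas expands into columns. The sample frame is rebuilt from the question, assuming the relTypeId values are strings, and the col1/col2/col3 names are taken from the desired output.

```python
import pandas as pd

# Sample data rebuilt from the question (relTypeId values assumed to be strings).
df = pd.DataFrame({
    "Column_Header": [
        [{"id": "498", "relTypeId": "2"}, {"id": "499", "relTypeId": "3"}],
        [{"id": "499", "relTypeId": "3"}, {"id": "500", "relTypeId": "4"},
         {"id": "501", "relTypeId": "5"}],
    ]
})

# Turn each list of dicts into a Series of its ids. Because the function
# returns a Series, apply() builds a DataFrame and pads shorter rows with NaN.
ids = df["Column_Header"].apply(lambda dicts: pd.Series([d["id"] for d in dicts]))

# Rename the positional columns 0, 1, 2 to col1, col2, col3.
ids.columns = ["col{}".format(i + 1) for i in range(ids.shape[1])]
print(ids)
```

This builds one Series per row, so it is likely slower than a plain list comprehension on large frames. It also assumes every cell holds a list.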
0

If performance is important, use a nested list comprehension that selects the id key from each dictionary:

df = pd.DataFrame([[y['id'] for y in x] for x in df['Column_Header']], index=df.index)
print(df)
     0    1     2
0  498  499  None
1  499  500   501

If the column may contain missing values, use:

L = [[y['id'] for y in x] if isinstance(x, list) else [None] for x in df['Column_Header']]
df = pd.DataFrame(L, index=df.index)
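
A "TypeError: 'float' object is not iterable" from the first snippet usually means at least one cell is NaN, which is a float, rather than a list. The sketch below runs the isinstance-guarded version on sample data rebuilt from the question. The NaN row is hypothetical, the relTypeId values are assumed to be strings, and the col1/col2/col3 names come from the desired output.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question, plus a hypothetical missing row.
df = pd.DataFrame({
    "Column_Header": [
        [{"id": "498", "relTypeId": "2"}, {"id": "499", "relTypeId": "3"}],
        np.nan,
        [{"id": "499", "relTypeId": "3"}, {"id": "500", "relTypeId": "4"},
         {"id": "501", "relTypeId": "5"}],
    ]
})

# Collect the ids per row; anything that is not a list (e.g. NaN) becomes [None].
rows = [[d["id"] for d in x] if isinstance(x, list) else [None]
        for x in df["Column_Header"]]

# Ragged rows are padded with missing values when the DataFrame is built.
out = pd.DataFrame(rows, index=df.index)
out.columns = ["col{}".format(i + 1) for i in range(out.shape[1])]
print(out)
```

The guard only checks for lists. Cells stored as strings that merely look like lists would also fall through to [None] and would need parsing first.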
jezrael
  • Thanks jezrael, I tried your solution but got the following error: "TypeError: 'float' object is not iterable" – Akki Jun 04 '20 at 14:49