I've been trying to extract elements from a dictionary stored in a Pandas column and put them into a couple of new columns.

I have a DataFrame consisting of 2 columns, i.e. ID and data (dictionary).

    ID           data
0   6602629924   {'@status': 'found', '@_fa': 'true', 'coredata...
1   55599317400  {'@status': 'found', '@_fa': 'true', 'coredata...
2   25652391600  {'@status': 'found', '@_fa': 'true', 'coredata...
3   11939875400  {'@status': 'found', '@_fa': 'true', 'coredata...
4   56140547500  {'@status': 'found', '@_fa': 'true', 'coredata...

If I wanted to extract an "affiliation" from a row, for example, I'd call it using this line of code:

data[1]["author-profile"]["affiliation-current"]["affiliation"]["ip-doc"]["afdispname"],

which returns 'De La Salle University'.

But when I apply the same lookup to the whole column, it raises a KeyError:

new_df["affiliation"] = new_df['data']["author-profile"]["affiliation-current"]["affiliation"]["ip-doc"]["afdispname"]
new_df


 ---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-145-8abd5f976526> in <module>
----> 1 new_df["affiliation"] = new_df['data']["author-profile"]["affiliation-current"]["affiliation"]["ip-doc"]["afdispname"]
      2 new_df

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
    866         key = com.apply_if_callable(key, self)
    867         try:
--> 868             result = self.index.get_value(self, key)
    869 
    870             if not is_scalar(result):

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
   4373         try:
   4374             return self._engine.get_value(s, k,
-> 4375                                           tz=getattr(series.dtype, 'tz', None))
   4376         except KeyError as e1:
   4377             if len(self) > 0 and (self.holds_integer() or self.is_boolean()):

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()

KeyError: 'author-profile'

What have I done wrong?

Ken.WS
    Please, post the output directly as text instead of an image. – Antoine Delia Nov 15 '22 at 07:52
  • [How to create a Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example) & [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – BeRT2me Nov 15 '22 at 07:56

1 Answer

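Use pd.json_normalize() to flatten the dictionaries in the data column into separate columns, then join them back onto the DataFrame:

# pop() removes 'data' from new_df; json_normalize() expands its nested keys into dotted column names
new_df = new_df.join(pd.json_normalize(new_df.pop('data')))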
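
For context, here is a minimal sketch of why the original line fails and what the flattened result looks like. The sample records, the placeholder name 'University A', and the helper get_affiliation are made up for illustration; the real data has many more keys, so treat this as an assumption-laden example rather than the exact output you will see.

```python
import pandas as pd

# Hypothetical records that mimic the nesting shown in the question.
records = [
    {"@status": "found",
     "author-profile": {"affiliation-current": {"affiliation": {
         "ip-doc": {"afdispname": "University A"}}}}},
    {"@status": "found",
     "author-profile": {"affiliation-current": {"affiliation": {
         "ip-doc": {"afdispname": "De La Salle University"}}}}},
]
new_df = pd.DataFrame({"ID": [6602629924, 55599317400], "data": records})

# new_df["data"] is a Series, so new_df["data"]["author-profile"] looks for a
# row label named "author-profile" (the labels here are 0 and 1), not for a
# key inside each dictionary. That is where the KeyError comes from.

# Option 1: pull a single nested value out of each row's dictionary.
def get_affiliation(d):
    # Hypothetical helper: same key path as the single-row lookup in the question.
    return d["author-profile"]["affiliation-current"]["affiliation"]["ip-doc"]["afdispname"]

affiliation = new_df["data"].apply(get_affiliation)

# Option 2: flatten every nested key into its own column.
# Nested keys are joined with "." by default.
flat = new_df.join(pd.json_normalize(new_df["data"].tolist()))
print(flat.columns.tolist())
```

After flattening, the affiliation should be available under the column name 'author-profile.affiliation-current.affiliation.ip-doc.afdispname'. Note that join() aligns on the index, so this sketch assumes new_df has the default 0..n-1 index; if it does not, resetting the index first is probably needed.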
Bushmaster
  • Wow, it works! Thank you for your ultrafast response. – Ken.WS Nov 15 '22 at 08:13
  • You're welcome. In future questions, please follow the guidance linked in the comments: [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – Bushmaster Nov 15 '22 at 08:17