Splitting a dictionary like object into new columns

Question

I have this dataframe. How I got this dataframe was due to using .to_dict(). It manages to split the initial dictionary into individual columns on key:value pairs. However some keys has a dictionary like value. For instance like the Ambience column here.

I believe the initial dictionary is something like this

attribute = {
    "AcceptsInsurance": "null",
    "AgesAllowed": "null',
    "Ambience": {
        'touristy': False,
        'hipster': False,
        'romantic': False,
        'divey': False,
        'intimate': False
      }
    } ...

To be exact,

The column value are in a dictionary like object but are not dictionary. The column are in object type.

+---------------+--------------------------------------------------------------------------------------------+                  
|    business_id|                                                                                    Ambience|  
+---------------+--------------------------------------------------------------------------------------------+
|6iYb2HFDywm3zjw|{'touristy': False, 'hipster': False, 'romantic': False, 'divey': False, 'intimate': False} | 
|drRPZA0oiIYSmqs|{'romantic': False, 'intimate': False, 'classy': False, False, 'casual': True}              |  
+---------------+-------------+------------------------------------------------------------------------------+

I would like to hence further split it and create new column for each of the element of the key:value pair. So something like this,

+-----------------+------------------+-------------------+
|      business_id| Ambience_touristy|  Ambience_romantic|
+-----------------+------------------+-------------------+
|  6iYb2HFDywm3zjw|             False|              False|
|  drRPZA0oiIYSmqs|              Null|              False|
+-----------------+------------------+-------------------+

How do I go about doing this? I am new to this and would appreciate any help.

could you share the dictionary that constructs the dataframe you have here? — , Feb 05 '22 at 06:12

mechanical_meat · Accepted Answer · 2022-02-05T06:55:13.293

re-using a function from another SO answer called unnesting() with modifications. edit: please note that the string in row 2 is not a valid Python dictionary. this solution assumes valid Python dictionaries.

def unnesting(df, explode):
    """ modified from here: https://stackoverflow.com/a/53218939/42346 """
    df1 = pd.concat( 
        [ 
            pd.DataFrame(df[x].tolist(), index=df.index).add_prefix(x + "_") 
            if x == "Ambience" 
            else pd.DataFrame(df[x], index=df.index) 
            for x in explode 
        ], 
        axis=1, 
    ) 
    return df1.join(df.drop(explode, 1), how="left")

you can then do:

import ast

df["Ambience"] = df["Ambience"].apply(ast.literal_eval) 
df_unnested = unnesting(df, ["business_id", "Ambience"])

result:

       business_id Ambience_touristy Ambience_hipster  Ambience_romantic  Ambience_divey  Ambience_intimate Ambience_classy Ambience_casual
1  6iYb2HFDywm3zjw             False            False              False           False              False             NaN             NaN
2  drRPZA0oiIYSmqs               NaN              NaN              False           False              False           False            True

thank you for replying. I am however receiving an 'malformed node or string: None' error — reign, Feb 05 '22 at 06:51
@reign: that occurs because the example dictionary in the 2nd row is not a valid Python dictionary. be sure all of your dictionaries are valid Python dictionaries. — mechanical_meat, Feb 05 '22 at 06:51

Splitting a dictionary like object into new columns

1 Answers1