-1

I have a column of dictionaries like this:

id                            element
 1  {'Colour': 'Grey', 'Brand': 'AB'}
 2   {'Colour': 'Blue', 'Brand': 'B'}
 3   {'Colour': 'Red', 'Brand': 'AH'}

And I want to create new columns from those dictionaries, like this:

id                            element  colour  brand
 1  {'Colour': 'Grey', 'Brand': 'AB'}    Grey     AB
 2   {'Colour': 'Blue', 'Brand': 'B'}    Blue      B
 3   {'Colour': 'Red', 'Brand': 'AH'}     Red     AH

I have done the following but it's not working:

def whatever(row):
    tmp_d = {}
    for d in row.values:
        for k in d.keys():
            if k in tmp_d.keys():
                tmp_d[k] += 1
            else:
                tmp_d[k] = 1
    return tmp_d
    

new_df.colour = df.groupby('element')'element'].apply(whatever).unstack().fillna(0)

Data:

data = {'id': [1, 2, 3],
 'element': ["{'Colour': 'Grey', 'Brand': 'AB'}",
  "{'Colour': 'Blue', 'Brand': 'B'}",
  "{'Colour': 'Red', 'Brand': 'AH'}"]}
Nikita Agarwal
  • 343
  • 1
  • 3
  • 13
  • A couple things: first, I'm not sure Pandas will even let you assign using dot syntax, but either way, use `new_df["colour"]` to create a the `"colour"` column of your DataFrame (I'm pretty sure you should've gotten a warning about that). Second, if you're applying a function row-wise on a DataFrame, use `axis=1` in the `.apply()` call. – ddejohn Feb 11 '22 at 06:32
  • Also, try to explain *what the problem is*, not just say "it's not working". – ddejohn Feb 11 '22 at 06:36

2 Answers2

0

Here you go:

df.join(pd.json_normalize(df.element))

Output:

   id                            element Colour Brand
0   1  {'Colour': 'Grey', 'Brand': 'AB'}   Grey    AB
1   2   {'Colour': 'Blue', 'Brand': 'B'}   Blue     B
2   3   {'Colour': 'Red', 'Brand': 'AH'}    Red    AH

EDIT: since your element column is actually comprised of strings that look like dictionaries, you'll need to convert them to real dictionaries first:

import ast

df["element"] = df["element"].apply(ast.literal_eval)

# Solution
new_df = df.join(pd.json_normalize(df.element))

Then you can use my solution.

ddejohn
  • 8,775
  • 3
  • 17
  • 30
0

First, you can convert the strings to dict objects. Then you can also use str.get method to get values under a specific key:

import ast
df['element'] = df['element'].apply(ast.literal_eval)
df = df.assign(**{key: df['element'].str.get(key) for key in ('Colour','Brand')})

Output:

   id                            element Colour Brand
0   1  {'Colour': 'Grey', 'Brand': 'AB'}   Grey    AB
1   2   {'Colour': 'Blue', 'Brand': 'B'}   Blue     B
2   3   {'Colour': 'Red', 'Brand': 'AH'}    Red    AH