Convert string containing list of dictionaries in DataFrame to list of dictionaries

Question

I've been stuck on an issue for two days now, despite a massive amount of googling. I've been downloading data from crunchbase.com and stored the raw data in a DataFrame. However, one variable is stored as a string when it should really be a list of dictionaries.

Looking at a specific element of the pandas Series yields a string:

"[{'entity_def_id': 'category', 'permalink': 'media-and-entertainment', 'uuid': '78b58810-ad58-a623-2a80-2a0e3603a544', 'value': 'Media and Entertainment'}, {'entity_def_id': 'category', 'permalink': 'tv', 'uuid': '86d91a85-ff9d-93db-4688-3b608fee756c', 'value': 'TV'}, {'entity_def_id': 'category', 'permalink': 'tv-production', 'uuid': '47592b2e-aaaa-6aa3-d0e9-82ab5e525c2d', 'value': 'TV Production'}]"

The specific column in the DataFrame

Note that some observations in the Series holding these strings are missing (if that matters).

I would like to create new columns in my DataFrame, where each column name corresponds to a key and each observation holds the corresponding value from the dict. However, I don't know how to do that, since the value is a string, which I can only index with integers rather than accessing the dictionaries directly.

I've tried to use json.loads, which gives me a TypeError: the JSON object must be str, bytes or bytearray, not Series.

I also tried ast.literal_eval(), which gives me a ValueError: malformed node or string: 0.

Grateful for any hints and apologies if my formatting/style is not good, it's my first time posting here.

ast.literal_eval() works with your example string. Could you post the input string for which this function fails? — Captain Trojan, Oct 01 '20 at 12:37
For the string posted by you, ast.literal_eval() seems to be working fine. Can you please post some code for us to get more clarity about the question? — Ganesh Tata, Oct 01 '20 at 12:39
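
Both error messages in the question name a Series, which suggests the whole column was passed to the parser rather than a single element. The following is a minimal sketch of that situation, not the asker's actual code: the shortened string and the names cell and series are made up for illustration.

```python
import ast
import json

import pandas as pd

# Shortened, hypothetical stand-in for one cell of the column.
cell = "[{'permalink': 'tv', 'value': 'TV'}]"
series = pd.Series([cell, None])

# Passing the whole Series to either parser fails, as in the question.
errors = {}
for name, parser in [("json.loads", json.loads), ("ast.literal_eval", ast.literal_eval)]:
    try:
        parser(series)
    except (TypeError, ValueError) as exc:
        errors[name] = type(exc).__name__

# json.loads also rejects the single element, because JSON requires
# double-quoted strings and this text uses Python-style single quotes.
try:
    json.loads(cell)
    json_accepts_cell = True
except json.JSONDecodeError:
    json_accepts_cell = False

# ast.literal_eval on one element (a str) returns the list of dicts.
parsed = ast.literal_eval(series.iloc[0])
```

So the parser has to be applied to each element, and ast.literal_eval is the one that understands this quoting style.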

score 0 · Answer 1 · answered Oct 01 '20 at 12:40

Just use the eval() function

import pandas as pd

s = "[{'entity_def_id': 'category', 'permalink': 'media-and-entertainment', 'uuid': '78b58810-ad58-a623-2a80-2a0e3603a544', 'value': 'Media and Entertainment'}, {'entity_def_id': 'category', 'permalink': 'tv', 'uuid': '86d91a85-ff9d-93db-4688-3b608fee756c', 'value': 'TV'}, {'entity_def_id': 'category', 'permalink': 'tv-production', 'uuid': '47592b2e-aaaa-6aa3-d0e9-82ab5e525c2d', 'value': 'TV Production'}]"

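# Parse the string into a list of dicts.
# Note: eval() executes arbitrary code, so only use it on trusted input.
records = eval(s)

# One row per dictionary, one column per key.
df = pd.DataFrame(records)
df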

Out[1]: 
  entity_def_id  ...                    value
0      category  ...  Media and Entertainment
1      category  ...                       TV
2      category  ...            TV Production

[3 rows x 4 columns]
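
The example above handles a single string. Below is a rough sketch of how the same idea might extend to a whole column that also contains missing values. It swaps eval() for ast.literal_eval(), which only accepts Python literals. The DataFrame, the column names name and categories, and the helper parse_cell are all hypothetical, and one row per dictionary is only one possible target layout.

```python
import ast

import pandas as pd

# Hypothetical data: one parseable string and one missing value.
df = pd.DataFrame({
    "name": ["a", "b"],
    "categories": [
        "[{'entity_def_id': 'category', 'permalink': 'tv', 'value': 'TV'}, "
        "{'entity_def_id': 'category', 'permalink': 'tv-production', 'value': 'TV Production'}]",
        None,
    ],
})


def parse_cell(cell):
    """Parse one cell; anything that is not a string becomes an empty list."""
    if isinstance(cell, str):
        return ast.literal_eval(cell)
    return []


# Series of lists of dicts, parsed element by element.
parsed = df["categories"].apply(parse_cell)

# One row per dictionary; empty lists turn into NaN and are dropped here.
exploded = parsed.explode().dropna()

# One column per key, keeping the original row labels.
expanded = pd.DataFrame(exploded.tolist(), index=exploded.index)

# Attach the new columns; rows without categories keep NaN in them.
result = df.drop(columns="categories").join(expanded, how="left")
```

Here result has one row per category dictionary, plus one row with NaN in the new columns for the observation that had no categories.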

Please consider https://stackoverflow.com/a/15197698/6361531 — Scott Boston, Oct 01 '20 at 12:45
