0

So I have a list where each entry looks something like this:

"{'A': 1, 'B': 2, 'C': 3}"

I am trying to get a dataframe that looks like this

    A   B   C
0   1   2   3
1   4   5   6 
2   7   8   9

But I'm having trouble converting the format into something that can be read into a DataFrame. I know that pandas should automatically convert dicts into dataframes, but since my list elements are surrounded by quotes, it's getting confused and giving me

               0
0  {'A': 1, 'B': 2, 'C': 3}
...

I've tried using json, concatenating a list of dataframes, and so on, but to no avail.

salamander
  • When converting a dictionary to a DataFrame, the keys are typically strings and the values are lists. The keys become the column headers and the list values become the rows. In your case, the values are only integers. Here's the documentation for it: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.from_dict.html – Mitchnoff Aug 08 '22 at 15:48

5 Answers

2

eval is not safe. Check this comparison.

Instead use ast.literal_eval:

Assuming this to be your list:

In [572]: l = ["{'A': 1, 'B': 2, 'C': 3}", "{'A': 4, 'B': 5, 'C': 6}"]

In [584]: import ast
In [585]: import pandas as pd
In [587]: df = pd.DataFrame([ast.literal_eval(i) for i in l])

In [588]: df
Out[588]: 
   A  B  C
0  1  2  3
1  4  5  6
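
As a rough sketch of why this is the safer choice: ast.literal_eval only accepts Python literals (dicts, lists, numbers, strings and so on), so a string that contains a function call should be rejected rather than executed. The `parse_rows` helper name and the third row are assumptions added for illustration, not part of pandas or ast.

```python
import ast

import pandas as pd


def parse_rows(rows):
    """Turn a list of dict-literal strings into a list of dicts."""
    return [ast.literal_eval(row) for row in rows]


rows = [
    "{'A': 1, 'B': 2, 'C': 3}",
    "{'A': 4, 'B': 5, 'C': 6}",
    "{'A': 7, 'B': 8, 'C': 9}",
]
df = pd.DataFrame(parse_rows(rows))

# A string holding a call instead of a literal raises; nothing is executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except (ValueError, SyntaxError):
    rejected = True
```

With eval, the same malicious string would simply run.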
Mayank Porwal
1

Use eval on each string before building the DataFrame:

pd.DataFrame([eval(s) for s in l])

Or better use ast.literal_eval as @Mayank Porwal's answer says.

Or use json.loads, but only after making sure the string is valid JSON.

SomeDude
1

I agree with SomeDude that eval will work like this:

pd.DataFrame([eval(s) for s in l])  

BUT, if any user-entered data is going into these strings, you should never use eval. Instead, you can convert the single quotes to double quotes and parse the result with json.loads from the json package. This is much safer.

json.loads(u'{"A": 1, "B": 2, "C": 3}')
Ryan Folks
  • 1
    Your string is different from the OP's, though. The OP's string is wrapped in double quotes and uses single quotes around the keys, which prevents `json.loads` from converting it into a dictionary, since `json.loads` expects the strings inside to be double-quoted – Nuri Taş Aug 08 '22 at 16:03
  • True. I think it would be worth the hassle to write a loop to convert the single quotes to double to avoid the potential badness that can happen with eval. Unless for some reason OP needs the single quotes over double quotes. – Ryan Folks Aug 08 '22 at 16:12
  • Converting single quotes to double quotes or vice versa is not as easy as it sounds. I would like to see your code that does this job. – Nuri Taş Aug 08 '22 at 16:15
  • 2
    `''.join(['\"' if i=='\'' else i for i in "{'A':1, 'B':2, 'C':3}"])` – Ryan Folks Aug 08 '22 at 16:19
  • 1
    Fair enough. Upvote. – Nuri Taş Aug 08 '22 at 16:26
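
One caveat with the blanket quote swap from the comments, sketched under the assumption that some values might contain an apostrophe: the swap then produces invalid JSON, while ast.literal_eval still parses the original string. The `tricky` row and the `to_json_quotes` helper are made up for illustration.

```python
import ast
import json

simple = "{'A': 1, 'B': 2, 'C': 3}"
# Hypothetical row whose value contains an apostrophe.
tricky = "{'name': \"O'Brien\", 'B': 2}"


def to_json_quotes(s):
    """Naively replace every single quote with a double quote."""
    return s.replace("'", '"')


# Works when the only single quotes are the ones around the keys.
simple_row = json.loads(to_json_quotes(simple))

# Breaks when a value contains an apostrophe.
try:
    json.loads(to_json_quotes(tricky))
    swap_ok = True
except json.JSONDecodeError:
    swap_ok = False

# literal_eval parses the original string without any rewriting.
tricky_row = ast.literal_eval(tricky)
```

If the data is known to contain only simple keys and numeric values, as in the question, the swap is fine; otherwise parsing the Python literal directly avoids the problem.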
0

Take your list of strings, turn it into a list of dictionaries, then construct the data frame using pd.DataFrame.from_records.

>>> l = ["{'A': 1, 'B': 2, 'C': 3}"]
>>> pd.DataFrame.from_records(eval(s) for s in l) 
   A  B  C
0  1  2  3

Check that your input data doesn't include arbitrary Python code, however, because eval executes whatever expression it is given. Using it on anything web-facing or otherwise exposed to untrusted input would be a severe security flaw.

ifly6
  • This should work, but I oversimplified my example. The value for one of the keys is an array object, i.e. `array([0.00248667])` so while it is parsing the string into a dict, then I am getting `NameError: name 'array' is not defined`. Not sure if I should make a new question or update my original. – salamander Aug 08 '22 at 16:47
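
For the `array([...])` case raised in the comment above, one possible workaround (a sketch, not a general solution) is to strip the `array(...)` wrapper with a regular expression and then parse what is left with ast.literal_eval. This assumes the arrays are printed without a `dtype=` argument and are not nested inside other calls; the `parse_row` helper, the `ARRAY_CALL` pattern and the second sample row are hypothetical.

```python
import ast
import re

import pandas as pd

# Matches array(...) where the contents contain no parentheses.
ARRAY_CALL = re.compile(r"array\(([^()]*)\)")


def parse_row(s):
    """Replace array([...]) with the bare list, then parse the literal."""
    return ast.literal_eval(ARRAY_CALL.sub(r"\1", s))


rows = [
    "{'A': 1, 'B': array([0.00248667])}",
    "{'A': 4, 'B': array([0.5, 0.25])}",  # hypothetical second row
]
records = [parse_row(s) for s in rows]
df = pd.DataFrame(records)
```

Each cell in column B then holds a plain Python list; it can be converted back with numpy afterwards if an array is needed.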
0

You can try

lst = ["{'A': 1, 'B': 2, 'C': 3}", "{'A': 1, 'B': 2, 'C': 3}"]


df = pd.DataFrame(map(eval, lst))

# or

df = pd.DataFrame(lst)[0].apply(eval).apply(pd.Series)

# or

df = pd.DataFrame(lst)[0].apply(lambda x: pd.Series(eval(x)))
print(df)

   A  B  C
0  1  2  3
1  1  2  3
Ynjxsjmh