I have a DataFrame with a column of lists that were read in as strings, and I would like to convert them back to lists.

Below is what I am currently doing, but I'm still learning and feel there must be a better (more efficient/Pythonic) way to go about this. Any help or constructive criticism would be much appreciated!

import pandas as pd
import ast

df = pd.DataFrame(data=['[-1,0]', '[1]', '[1,2]'], columns = ['example'])
type(df['example'][0])
>> str

n = df.shape[0]
temp = []
temp2 = []

for i in range(n):
    temp = (ast.literal_eval(df['example'][i]))
    temp2.append(temp)

df['new_col_lists'] = temp2
type(df['new_col_lists'][0])
>> list
Sean
  • If this is working, and you're just looking to optimize, I suggest posting instead to [CodeReview](https://codereview.stackexchange.com/) – BruceWayne Aug 25 '18 at 15:28
  • Thank you @BruceWayne, that is good to know and I will keep it in mind for the future! – Sean Aug 25 '18 at 16:24

3 Answers

Maybe you could use a map:

df['example'] = df['example'].map(ast.literal_eval)

With pandas, there is almost always a way to avoid the for loop.
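As a minimal sketch of that claim, the snippet below runs the question's loop (written as a list comprehension) next to the `.map` call and checks that they agree. The variable names `looped` and `mapped` are only illustrative. Note that `.map` still calls `ast.literal_eval` once per cell in Python, so the gain here is mostly readability rather than true vectorization.

```python
import ast

import pandas as pd

df = pd.DataFrame({"example": ["[-1,0]", "[1]", "[1,2]"]})

# Loop version from the question, condensed into a list comprehension.
looped = [ast.literal_eval(s) for s in df["example"]]

# Element-wise version: map applies ast.literal_eval to every cell.
mapped = df["example"].map(ast.literal_eval)

print(mapped.tolist())            # [[-1, 0], [1], [1, 2]]
print(mapped.tolist() == looped)  # True
```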

kevh
  • Something like this was exactly what I was looking for, thank you! There is another answer below which uses `.apply()` instead of `.map()`. Are they interchangeable in this case? Or is one better than the other? – Sean Aug 25 '18 at 15:58
  • I found this post explaining the difference between `.map()` and `.apply()` in case anyone sees this post and wonders the same thing: https://stackoverflow.com/questions/19798153/difference-between-map-applymap-and-apply-methods-in-pandas Thanks again! – Sean Aug 26 '18 at 09:31
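For the case discussed in these comments (a single column and a plain function), a quick check suggests the two calls give the same result. The sketch below also shows one practical difference: `Series.map` accepts `na_action="ignore"`, which skips missing values instead of passing them to the function. The sample data and variable names are made up for illustration.

```python
import ast

import pandas as pd

s = pd.Series(["[-1,0]", "[1]", "[1,2]"], name="example")

# On a Series with a plain function, map and apply both work element-wise.
via_map = s.map(ast.literal_eval)
via_apply = s.apply(ast.literal_eval)
print(via_map.tolist() == via_apply.tolist())  # True

# map can skip missing values, which ast.literal_eval could not parse.
with_gap = pd.Series(["[1,2]", None])
parsed = with_gap.map(ast.literal_eval, na_action="ignore")
print(parsed[0])           # [1, 2]
print(pd.isna(parsed[1]))  # True
```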

You can use .apply with ast.literal_eval.

For example:

import pandas as pd
import ast

df = pd.DataFrame(data=['[-1,0]', '[1]', '[1,2]'], columns = ['example'])
df['example'] = df['example'].apply(ast.literal_eval)
print( type(df['example'][0]) )

Output (Python 3; Python 2 prints <type 'list'>):

<class 'list'>
Rakesh

You could use apply with a lambda that strips the brackets, splits on commas, and converts each piece:

df['new_col_lists'] = df['example'].apply(lambda s: [int(v.strip()) for v in s[1:-1].split(',')])

Use float instead of int if the values are not integers.
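One caveat with the manual split is that an empty list string such as `'[]'` leaves an empty piece that `int('')` cannot convert. A sketch of the same idea wrapped in a small function, assuming flat lists of numbers only, could look like this. `parse_list` and its `cast` parameter are hypothetical names, not part of pandas.

```python
import pandas as pd


def parse_list(text, cast=int):
    """Hypothetical helper: parse a flat '[a,b,c]' string into a list.

    Assumes no nesting and no quoted strings inside the brackets.
    """
    inner = text.strip()[1:-1].strip()  # drop the surrounding brackets
    if not inner:                       # '[]' would otherwise hit cast('')
        return []
    return [cast(v.strip()) for v in inner.split(",")]


df = pd.DataFrame({"example": ["[-1,0]", "[]", "[1.5, 2]"]})
df["new_col_lists"] = df["example"].apply(lambda s: parse_list(s, cast=float))

print(df["new_col_lists"].tolist())  # [[-1.0, 0.0], [], [1.5, 2.0]]
```

For anything beyond flat numeric lists, `ast.literal_eval` from the other answers is likely the safer choice, since it handles nesting and mixed types.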

SpghttCd