Turning a string into a list of numbers in Pandas

Question

After running an SQL query I get a data frame with one column that looks like this:

d = {'values': ['2,3,2,2', '2,3,5', '2', '3,4,2,2,2']}
df = pd.DataFrame(data=d)

What I need to do is turn each of this strings into a list of numbers:

d = {'values': [[2, 3, 2, 2], [2, 3, 5], [2], [3, 4, 2, 2, 2]]}

Is there a pythonic way to do that?

expected output is new column or dictionary ? – jezrael Jul 02 '20 at 08:05 — jezrael, Jul 02 '20 at 08:05
A new column in the same data frame. – Vasily Chernukha Jul 02 '20 at 08:30 — Vasily Chernukha, Jul 02 '20 at 08:30

Umar.H · Accepted Answer · 2020-07-02T08:41:52.780

3

Here's another way using pd.eval and explode

Hat tip to Jezrael (for the eval thread) and cs95 (for the pd.Eval tip) in the comments, read :

Why using eval is bad practice

df['values_agg'] = df['values'].map(pd.eval)

print(df)

      values       values_agg
0    2,3,2,2     [2, 3, 2, 2]
1      2,3,5        [2, 3, 5]
2          2              [2]
3  3,4,2,2,2  [3, 4, 2, 2, 2]

or avoiding eval all together in a more verbose way.

df['values'].str.split(',').explode().astype(int).groupby(level=0).agg(list)

edited Jul 02 '20 at 08:41

answered Jul 02 '20 at 08:09

Umar.H

22,559
7
39
74

[Why is using 'eval' a bad practice?](https://stackoverflow.com/questions/1832940/why-is-using-eval-a-bad-practice) ? – jezrael Jul 02 '20 at 08:33
Here is better use `astype(int)` after `explode` – jezrael Jul 02 '20 at 08:33
1

Change `eval` to `pd.eval` and you are good — you won't even need the typecast – cs95 Jul 02 '20 at 08:37
@jezrael wow, that is very informative but I think it's more in-tune for applications I could change it to `str.split` then `explode.astype(int)` I guess ? – Umar.H Jul 02 '20 at 08:38
1

yop, exactly need this way. – jezrael Jul 02 '20 at 08:39
1

no I mean `df['values'].map(pd.eval)`, you don't need the other stuff. The 3rd row will remain a scalar instead of a list, big deal. – cs95 Jul 02 '20 at 08:41
@cs95 gracias mes ami, wow that's amazing didn't know that. thanks for the tip! – Umar.H Jul 02 '20 at 08:41

score 2 · Answer 2 · answered Jul 02 '20 at 07:59

2

Using pandas.Series.str.findall:

df["values"] = df["values"].str.findall("\d+")
d2 = df.to_dict("list")
print(d2)

Output:

{'values': [['2', '3', '2', '2'], 
            ['2', '3', '5'], 
            ['2'], 
            ['3', '4', '2', '2', '2']]}

answered Jul 02 '20 at 07:59

Chris

29,127
3
28
51

jezrael · Answer 3 · 2020-07-02T08:08:56.840

Because need numbers after split comvert values to integers:

df['values'] = df['values'].apply(lambda x: [int(y) for y in x.split(',')])
print (df)
            values
0     [2, 3, 2, 2]
1        [2, 3, 5]
2              [2]
3  [3, 4, 2, 2, 2]


print (df.to_dict("l"))
{'values': [[2, 3, 2, 2], [2, 3, 5], [2], [3, 4, 2, 2, 2]]}

For strings repr of numbers solution is simplier, only use Series.str.split:

df['values'] = df['values'].str.split(',')
print (df.to_dict("l"))
{'values': [['2', '3', '2', '2'], ['2', '3', '5'], ['2'], ['3', '4', '2', '2', '2']]}

score 1 · Answer 4 · answered Jul 02 '20 at 08:03

1

You could do with a list comprehension:

d = {'values': [list(map(int, x.split(','))) for x in df['values']]}

[out]

{'values': [[2, 3, 2, 2], [2, 3, 5], [2], [3, 4, 2, 2, 2]]}

answered Jul 02 '20 at 08:03

Chris Adams

18,389
4
22
39

Turning a string into a list of numbers in Pandas

4 Answers4