Split columns of lists into multiple columns in pandas

Question

A similar question, Split a Pandas column of lists into multiple columns, has already been asked, but it dealt with splitting a single column of nested lists into multiple columns.

My case is slightly different. Let's say I have a DataFrame, dframe, with multiple columns that each contain nested lists. I am looking for a way to split those nested lists into multiple columns.

dframe:

    0                           1
0   [u, 8.000000e+00, 4.47e-01] [a0, 3.384351e-03, 1.23e-03]
1   [u, 8.000000e+00, 4.47e-01] [a0, 3.384351e-03, 1.23e-03]  
2   [u, 8.000000e+00, 5.53e-01] [a0, 4.897271e-03, 1.79e-03]

I tried most of the methods suggested in the post above (Split a Pandas column of lists into multiple columns), for example:

pd.DataFrame(dframe[0].to_list(), columns=['u','val', 'err'])

They did not work for me, because these methods seem to be meant for a single column.
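For reference, the single-column call quoted above can be repeated for each column and the pieces joined side by side. This is only a sketch and not part of the original post: the frame below is an assumed reconstruction of dframe in which every cell is a Python list, and the names `pieces` and `wide` are made up for illustration.

```python
import pandas as pd

# Assumed reconstruction of dframe: each cell holds a Python list.
dframe = pd.DataFrame({
    0: [["u", 8.0, 0.447], ["u", 8.0, 0.447], ["u", 8.0, 0.553]],
    1: [["a0", 0.003384351, 0.00123],
        ["a0", 0.003384351, 0.00123],
        ["a0", 0.004897271, 0.00179]],
})

# Expand each column on its own with the single-column idiom.
pieces = [
    pd.DataFrame(dframe[col].tolist(), index=dframe.index)
    for col in dframe.columns
]

# Join the pieces side by side; ignore_index=True renumbers the columns.
wide = pd.concat(pieces, axis=1, ignore_index=True)
print(wide)
```

This keeps the original row index and works for any number of list columns, at the cost of building one intermediate frame per column.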

What I expect is something like this:

Output:

    0   1               2               3   4               5
0   u   8.000000e+00    4.47e-01        a0  3.384351e-03    1.23e-03
1   u   8.000000e+00    4.47e-01        a0  3.384351e-03    1.23e-03
2   u   8.000000e+00    5.53e-01        a0  4.897271e-03    1.79e-03

I have been struggling with this issue for a couple of days and would really appreciate any help.

Andrej Kesely · Answer 1 · 2023-03-15T21:52:01.690

1

You can try:

out = df[[0, 1]].apply(lambda x: x[0] + x[1], axis=1, result_type="expand")
print(out)

Prints:

   0    1      2   3         4        5
0  u  8.0  0.447  a0  0.003384  0.00123
1  u  8.0  0.447  a0  0.003384  0.00123
2  u  8.0  0.553  a0  0.004897  0.00179

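EDIT: a version that flattens every list in the row, so the lists do not have to be added together one by one:

out = df[[0, 1]].apply(lambda x: [v for lst in x for v in lst], axis=1, result_type="expand")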
print(out)
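As a rough illustration of how the edited version scales, the sketch below applies the same row-flattening logic to a frame with four list columns. The frame contents and the helper name `flatten_row` are assumptions made up for this example, not part of the original answer.

```python
import pandas as pd

# Hypothetical four-column frame; every cell is a Python list.
df = pd.DataFrame({
    0: [["u", 8.0, 0.447], ["u", 8.0, 0.553]],
    1: [["a0", 0.003384, 0.00123], ["a0", 0.004897, 0.00179]],
    2: [["b0", 1.0, 0.1], ["b0", 2.0, 0.2]],
    3: [["c0", 3.0, 0.3], ["c0", 4.0, 0.4]],
})

def flatten_row(row):
    """Concatenate all lists found in one row into a single flat list."""
    return [value for cell in row for value in cell]

# No column selection is needed when every column should be flattened.
out = df.apply(flatten_row, axis=1, result_type="expand")
print(out.shape)
```

Calling `apply` on the whole frame, rather than on `df[[0, 1]]`, means the code does not change when columns are added, as long as every cell is a list.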

edited Mar 15 '23 at 21:52

answered Mar 15 '23 at 21:17

1

Thanks @Andrej, it works perfectly. But if I extend `dframe` to more than two columns, let's say four, the code has to change slightly: `df.apply(lambda x: x[0] + x[1] + x[2] + x[3], axis=1, result_type="expand")`. Is there a simpler way to do it? – aVral Mar 15 '23 at 21:50
@aVral I've edited my answer and added a solution where you don't have to add the columns separately. – Andrej Kesely Mar 15 '23 at 21:52

RomanPerekhrest · Answer 2 · 2023-03-15T22:00:56.787

1

Using itertools.chain.from_iterable to flatten the inner lists of each row:

from itertools import chain

import pandas as pd

df = pd.DataFrame([chain.from_iterable(a) for a in df.values])

Or with just a nested list comprehension:

df = pd.DataFrame([[v for lst in arr for v in lst] for arr in df.values])

   0    1      2   3         4        5
0  u  8.0  0.447  a0  0.003384  0.00123
1  u  8.0  0.447  a0  0.003384  0.00123
2  u  8.0  0.553  a0  0.004897  0.00179
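A self-contained sketch of the chain-based approach follows. The sample frame, its string index, and the names `rows` and `flat` are assumptions for illustration. Each chain is materialized with `list(...)` so the rows can be inspected before the frame is built, and the original index is passed explicitly, because building a new DataFrame from plain lists would otherwise fall back to a default integer index.

```python
from itertools import chain

import pandas as pd

# Hypothetical frame with a non-default index; every cell is a Python list.
df = pd.DataFrame(
    {
        0: [["u", 8.0, 0.447], ["u", 8.0, 0.447], ["u", 8.0, 0.553]],
        1: [["a0", 0.003384, 0.00123],
            ["a0", 0.003384, 0.00123],
            ["a0", 0.004897, 0.00179]],
    },
    index=["r0", "r1", "r2"],
)

# Each row of the underlying array is a sequence of lists;
# chain.from_iterable walks through them as one flat stream.
rows = [list(chain.from_iterable(row)) for row in df.to_numpy()]

# Reuse the original index so the row labels survive the rebuild.
flat = pd.DataFrame(rows, index=df.index)
print(flat)
```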

edited Mar 15 '23 at 22:00

answered Mar 15 '23 at 21:55

Thanks @Roman, the second approach seems more intuitive. – aVral Mar 16 '23 at 07:14

rhug123 · Answer 3 · 2023-03-18T13:13:11.723

0

Try this: add the lists in each row together with sum(axis=1), then pass the result to pd.DataFrame():

pd.DataFrame(df.sum(axis=1).tolist())
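The one-liner works because `+` on two Python lists concatenates them, so summing across a row joins its lists end to end. The sketch below spells out that concatenation step by step with `functools.reduce`, without relying on how `DataFrame.sum` treats object columns. The sample frame and the names `rows` and `out` are assumptions for illustration.

```python
from functools import reduce
from operator import add

import pandas as pd

# Hypothetical frame; every cell is a Python list.
df = pd.DataFrame({
    0: [["u", 8.0, 0.447], ["u", 8.0, 0.553]],
    1: [["a0", 0.003384, 0.00123], ["a0", 0.004897, 0.00179]],
})

# For each row, fold the cells together with list concatenation (+).
# operator.add returns a new list, so the original cells are not modified.
rows = [reduce(add, row) for row in df.to_numpy().tolist()]

out = pd.DataFrame(rows)
print(out)
```

Repeated list concatenation copies the growing list at every step, so for rows with many list columns a single flattening pass may be the cheaper choice.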

edited Mar 18 '23 at 13:13

answered Mar 17 '23 at 21:56

1

Adding more description of your code would be more helpful. _'Your answer can say “don’t do that,” but it should also say “try this instead.” Any answer that fully addresses at least part of the question is helpful and can get the asker going in the right direction. State any limitations, assumptions or simplifications in your answer. Brevity is acceptable, but fuller explanations are better.'_ – imxitiz Mar 18 '23 at 06:36
