Convert list into columns by matching values

Question

I have a pandas dataframe like so:

df = pd.DataFrame({'column': [[np.nan, np.nan, np.nan], [1, np.nan, np.nan], [2, 3, np.nan], [3, 2, 1]]})

column
0   [nan, nan, nan]
1   [1, nan, nan]
2   [2, 3, nan]
3   [3, 2, 1]

Note that there is never the same value twice in a row.

I wish to transform this single column into multiple columns named with the corresponding values. So I want to order the values and put them in the right column. The ones under column_1, twos under column_2 etc.

    column_1    column_2    column_3
0   NaN NaN NaN
1   1.0 NaN NaN
2   NaN 2.0 3.0
3   1.0 2.0 3.0

How to do this? I don't really know where to start to be honest.

Maybe `df = pd.DataFrame(df['column'].to_list(), columns=['column_1','column_2', 'column_3'])`..this question is probably a duplicate, but not sure with which question — AloneTogether, Nov 17 '22 at 08:58
This does not give the desired output. Note that in the third row, the `2` and `3` need to be shifted to the right — T C Molenaar, Nov 17 '22 at 08:59
What is the logic behind the shift? You seem to have forgotten that info? — AloneTogether, Nov 17 '22 at 09:00
I want the `2` in `column_2` and the `3` in `column_3`. As there is no `1` in this row, there needs to be a `nan` there — T C Molenaar, Nov 17 '22 at 09:01
Maybe... https://stackoverflow.com/questions/35491274/split-a-pandas-column-of-lists-into-multiple-columns ? — MDR, Nov 17 '22 at 09:03
similar as above, this does not solve my question. What if in one of the rows, the values were the other way around? How can you sort them into the right column? — T C Molenaar, Nov 17 '22 at 09:07
You may want to edit the question with a few examples of different number orders and explain how they need to be handled. — MDR, Nov 17 '22 at 09:09
@Clegane yes, the values in my dataframe are values 1,...,n and I want to transform them into a dataframe with columns with suffix _1,...,_n — T C Molenaar, Nov 17 '22 at 09:17

score 1 · Accepted Answer · answered Nov 17 '22 at 10:05

Use dict comprehension to compute pandas Series of columns:

import math
df = df.apply(lambda row: pd.Series(data={f"column_{v}": v for v in row["column"] if not math.isnan(v)}, dtype="float64"), axis=1)

[Out]:
   column_1  column_2  column_3
0       NaN       NaN       NaN
1       1.0       NaN       NaN
2       NaN       2.0       3.0
3       1.0       2.0       3.0

score 0 · Answer 2 · answered Nov 17 '22 at 10:44

Using a pivot_table:

(df['column'].explode().reset_index()
 .dropna()
 .assign(col=lambda d: 'column_'+d['column'].astype(str))
 .pivot_table(index='index', columns='col', values='column',
              aggfunc='first', dropna=False)
 .reindex(df.index)
)

Output:

col  column_1  column_2  column_3
0         NaN       NaN       NaN
1         1.0       NaN       NaN
2         NaN       2.0       3.0
3         1.0       2.0       3.0

Convert list into columns by matching values

2 Answers2