
I would like to combine all the columns of a DataFrame into a single column.

Here is what the DataFrame looks like:
     0    1    2
0  123  321  231
1  232  321  231
2  432  432  432

The DataFrame is named task_ba.
I would like it to look like this:
    0
0 123
1 232
2 432
3 321
4 321
5 432
6 231
7 231
8 432

2 Answers


Easiest and fastest option, use the underlying numpy array:

df2 = pd.DataFrame(df.values.ravel(order='F'))

NB. If you prefer a Series, use pd.Series instead of pd.DataFrame.

Output:

     0
0  123
1  232
2  432
3  321
4  321
5  432
6  231
7  231
8  432
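
For reference, here is a minimal self-contained sketch of the same idea applied to the frame from the question. It assumes the frame is called `task_ba` as in the question and uses `to_numpy()`, which returns the same array as `.values`; the names `values`, `as_frame` and `as_series` are only illustrative.

```python
import pandas as pd

# Rebuild the frame from the question
task_ba = pd.DataFrame([[123, 321, 231], [232, 321, 231], [432, 432, 432]])

# order="F" reads the 2-D array column by column (column-major order)
values = task_ba.to_numpy().ravel(order="F")

as_frame = pd.DataFrame(values)  # single column labelled 0
as_series = pd.Series(values)    # same data as a Series
print(as_frame)
```

Without `order="F"` the array would be read row by row, giving 123, 321, 231, 232, ... instead of the column-wise order asked for.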
mozway
  • `pd.DataFrame(df.values.flatten())` gives a similar result, or `pd.DataFrame(df.values.flatten('F'))` gives the same result. – BeRT2me Apr 28 '22 at 22:22
  • @BeRT2me yes, the two functions do the same thing (except that `ravel` returns a view where possible), but you also need to pass `order='F'` to `flatten` to get the desired order. – mozway Apr 28 '22 at 22:25
  • I just timed it, why is `.ravel(order='F')` SO MUCH FASTER than `.flatten('F')`? – BeRT2me Apr 28 '22 at 22:34
  • @BeRT2me `flatten` always makes a copy of the array, whereas `ravel` returns a view of the original array when it can (i.e. it shares the same memory). It's better to use `ravel` as the DataFrame constructor will make a copy anyway. – mozway Apr 28 '22 at 22:37
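
The view-versus-copy point from the comments can be checked with `np.shares_memory`. This is only a sketch on plain NumPy arrays, not on `df.values`, because the memory layout of a DataFrame's underlying array depends on pandas internals; the array names are illustrative.

```python
import numpy as np

# A row-major (C-ordered) array and a column-major (Fortran-ordered) copy of it
c_arr = np.array([[123, 321, 231], [232, 321, 231], [432, 432, 432]])
f_arr = np.asfortranarray(c_arr)

# ravel returns a view when the requested order matches the memory layout
view_c = np.shares_memory(c_arr, c_arr.ravel())            # True
view_f = np.shares_memory(f_arr, f_arr.ravel(order="F"))   # True

# ravel has to copy when the order does not match the layout
copy_mismatch = np.shares_memory(c_arr, c_arr.ravel(order="F"))  # False

# flatten copies in every case
copy_flatten = np.shares_memory(f_arr, f_arr.flatten("F"))  # False

print(view_c, view_f, copy_mismatch, copy_flatten)
```

So any speed advantage of `ravel` only shows up when no copy is needed, which depends on how the array is laid out in memory.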

You can use pd.DataFrame.melt() and then drop the variable column:

>>> df
     0    1    2
0  123  321  231
1  232  321  231
2  432  432  432

>>> df.melt().drop("variable", axis=1)  # Drops the 'variable' column
   value
0    123
1    232
2    432
3    321
4    321
5    432
6    231
7    231
8    432

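Or if you want 0 as your column name, rename the column afterwards (passing value_name=0 to melt clashes with the existing column label 0, which recent pandas versions reject with a ValueError):

>>> df.melt().drop("variable", axis=1).rename(columns={"value": 0})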
     0
0  123
1  232
2  432
3  321
4  321
5  432
6  231
7  231
8  432
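
As a self-contained sketch of the melt approach, assuming the frame from the question is named `task_ba` (the name `stacked` is only illustrative):

```python
import pandas as pd

# Rebuild the frame from the question
task_ba = pd.DataFrame([[123, 321, 231], [232, 321, 231], [432, 432, 432]])

stacked = (
    task_ba.melt()                  # long format: 'variable' and 'value' columns
    .drop(columns="variable")       # keep only the stacked values
    .rename(columns={"value": 0})   # label the remaining column 0
)
print(stacked)
```

melt stacks the columns one after another, so the values come out in the same column-wise order as in the desired output.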

You can learn all this (and more!) in the official documentation.

ddejohn