
I have the following pandas df:

      original      mean
   0  0.000000  0.065500
   1  0.131000  0.135890
   2  0.140779  0.144875
   3  0.148971  0.150029
   4  0.151088  0.144309

How can I interleave the two columns, row by row, into a single column like this:

      original
   0  0.000000
   1  0.065500
   2  0.131000
   3  0.135890
   4  0.140779  
   5  0.144875
   6  0.148971  
   7  0.150029
   8  0.151088  
   9  0.144309
wi3o

2 Answers


Use the stack() method:

In [2]: df
Out[2]:
   original      mean
0  0.000000  0.065500
1  0.131000  0.135890
2  0.140779  0.144875
3  0.148971  0.150029
4  0.151088  0.144309

In [3]: df.stack()
Out[3]:
0  original    0.000000
   mean        0.065500
1  original    0.131000
   mean        0.135890
2  original    0.140779
   mean        0.144875
3  original    0.148971
   mean        0.150029
4  original    0.151088
   mean        0.144309
dtype: float64

In [4]: df.stack().reset_index(level=[0,1], drop=True)
Out[4]:
0    0.000000
1    0.065500
2    0.131000
3    0.135890
4    0.140779
5    0.144875
6    0.148971
7    0.150029
8    0.151088
9    0.144309
dtype: float64
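
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch. The frame is rebuilt from the question's sample data, and the names `stacked` and `merged` are only illustrative. The final `to_frame("original")` call is an addition that gives the result the column name asked for in the question. Note that how `stack()` treats missing values can differ between pandas versions, so check that if your data contains NaN.

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "original": [0.000000, 0.131000, 0.140779, 0.148971, 0.151088],
    "mean": [0.065500, 0.135890, 0.144875, 0.150029, 0.144309],
})

# stack() walks the frame row by row, so the two columns come out interleaved.
stacked = df.stack()

# Drop the (row label, column name) MultiIndex and wrap the values
# in a one-column DataFrame named "original".
merged = stacked.reset_index(drop=True).to_frame("original")
print(merged)
```

`reset_index(drop=True)` drops every index level, so it should behave the same as passing `level=[0,1]` explicitly.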
MaxU - stand with Ukraine

You can call reshape on the values and construct another df:

In [7]:
pd.DataFrame(data=df.values.reshape(df.shape[0]*2,-1), columns=['original'])

Out[7]:
   original
0  0.000000
1  0.065500
2  0.131000
3  0.135890
4  0.140779
5  0.144875
6  0.148971
7  0.150029
8  0.151088
9  0.144309
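
A possible variation, sketched here as an assumption rather than as part of the original answer: `ravel()` flattens the 2-D array in row-major order, which gives the same interleaving without hard-coding the factor of 2, so it should also work for frames with more than two columns. It assumes all columns share a numeric dtype; with mixed dtypes `to_numpy()` would fall back to an object array.

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "original": [0.000000, 0.131000, 0.140779, 0.148971, 0.151088],
    "mean": [0.065500, 0.135890, 0.144875, 0.150029, 0.144309],
})

# to_numpy() gives a (5, 2) array; ravel() reads it row by row,
# so each row's "original" value is followed by its "mean" value.
flat = df.to_numpy().ravel()

# Wrap the flat array in a new one-column DataFrame.
merged = pd.DataFrame({"original": flat})
print(merged)
```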

Timings

On your sample dataset:

In [8]:
%timeit df.stack().reset_index(level=[0,1], drop=True)
%timeit pd.DataFrame(data=df.values.reshape(df.shape[0]*2,-1), columns=['original'])

1000 loops, best of 3: 820 µs per loop
1000 loops, best of 3: 446 µs per loop

Reshaping the underlying NumPy array is nearly twice as fast here.
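
If you want to repeat the comparison outside IPython, the standard-library `timeit` module can stand in for `%timeit`. This is only a rough sketch: the helper names `via_stack` and `via_reshape` and the repeat count `n` are made up for illustration, and the absolute numbers will depend on your machine and pandas version, so no expected timings are shown.

```python
import timeit

import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "original": [0.000000, 0.131000, 0.140779, 0.148971, 0.151088],
    "mean": [0.065500, 0.135890, 0.144875, 0.150029, 0.144309],
})


def via_stack():
    # Approach from the first answer: stack, then drop the MultiIndex.
    return df.stack().reset_index(level=[0, 1], drop=True)


def via_reshape():
    # Approach from this answer: reshape the raw values into one column.
    return pd.DataFrame(data=df.values.reshape(df.shape[0] * 2, -1),
                        columns=["original"])


n = 200  # number of calls per measurement (arbitrary choice)
t_stack = timeit.timeit(via_stack, number=n)
t_reshape = timeit.timeit(via_reshape, number=n)

# Report the average time per call in microseconds.
print(f"stack:   {t_stack / n * 1e6:.0f} µs per call")
print(f"reshape: {t_reshape / n * 1e6:.0f} µs per call")
```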

EdChum