pandas dataframe to series

Question

I have this dataframe:

import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, 4],
                   "Y": [5, 6, 7, 8],
                   "Z": [9, 10, 11, 12]})

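And I'm looking for this output: a single column holding the values 1 through 12, with the columns stacked in order (X, then Y, then Z).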

So far, the solved problems I have found go the opposite way: from a series to a dataframe. The closest one I found is this one, which isn't really similar. I have also tried pivot_table() and reshape(), but they require an index column, whereas I just want a single column.

Any suggestions?

PS: You can assume the dataframe has 100 columns, so selecting them one by one is not an option, but they can be referenced in order (e.g. with 100 columns, X1:X100).

Divakar · Accepted Answer · 2020-06-04T20:17:13.403

Use flattening with ravel('F') -

In [14]: pd.Series(df.to_numpy(copy=False).ravel('F'))
Out[14]: 
0      1
1      2
2      3
3      4
4      5
5      6
6      7
7      8
8      9
9     10
10    11
11    12
dtype: int64

When the frame holds a single dtype in one block, as here, this series is a view into the input dataframe's data rather than a copy, which means virtually free runtime and zero memory overhead. Let's verify -

In [20]: s = pd.Series(df.to_numpy(copy=False).ravel('F'))

In [21]: np.shares_memory(s,df)
Out[21]: True
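
The view behaviour relies on the frame storing all of its columns in one block of a single dtype. As a hedged sketch (the `mixed` frame below is a hypothetical variation, not part of the original example), converting one column to float forces `to_numpy()` to build a new array, so nothing is shared with the frame's own columns:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, 4], "Y": [5, 6, 7, 8], "Z": [9, 10, 11, 12]})

# Column-major ('F') flattening reads column X first, then Y, then Z.
flat = pd.Series(df.to_numpy().ravel("F"))
print(flat.tolist())

# Hypothetical variation: one float column makes the frame mixed-dtype,
# so to_numpy() has to upcast into a freshly allocated float64 array.
mixed = df.assign(Y=df["Y"].astype(float))
arr = mixed.to_numpy()
shares = any(np.shares_memory(arr, mixed[c].to_numpy()) for c in mixed.columns)
print(arr.dtype, shares)
```

In that mixed-dtype case the flattening still gives the right values, but it pays for a copy.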

Let's confirm the timings too -

In [2]: df = pd.DataFrame(np.random.rand(100000,3), columns=['X','Y','Z'])

In [3]: %timeit pd.Series(df.to_numpy(copy=False).ravel('F'))
579 µs ± 9.09 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

score 5 · Answer 2 · answered Jun 04 '20 at 20:11

This is melt:

df.melt()[['value']]

Output:

    value
0       1
1       2
2       3
3       4
4       5
5       6
6       7
7       8
8       9
9      10
10     11
11     12
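
Note that the double brackets above select a one-column DataFrame rather than a Series. If a Series is what is needed, a minimal sketch (assuming the question's example frame) would index with a single label instead:

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, 4], "Y": [5, 6, 7, 8], "Z": [9, 10, 11, 12]})

as_frame = df.melt()[["value"]]   # list of labels: one-column DataFrame
as_series = df.melt()["value"]    # single label: Series, stacked in column order

print(type(as_frame).__name__)
print(type(as_series).__name__)
print(as_series.tolist())
```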

Quang Hoang

score 4 · Answer 3 · answered Jun 04 '20 at 20:12

One way is to reshape the data from the "wide" to the "tall" format by stacking:

df.T.stack().reset_index(drop=True)
#0      1
#1      2
#2      3
#3      4
#4      5
#5      6
#6      7
#7      8
#8      9
#9     10
#10    11
#11    12
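
The transpose matters here: stack() on the original frame walks it row by row, while stacking the transposed frame walks it column by column. A small sketch on the question's example frame follows; the unstack() line is an alternative spelling added for comparison and is not part of this answer:

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, 4], "Y": [5, 6, 7, 8], "Z": [9, 10, 11, 12]})

# Without the transpose, values come out row by row: 1, 5, 9, 2, 6, 10, ...
row_major = df.stack().reset_index(drop=True)

# With the transpose, values come out column by column: 1, 2, 3, 4, 5, ...
col_major = df.T.stack().reset_index(drop=True)

# On a frame with a plain (non-MultiIndex) index, unstack() also returns
# a Series ordered column by column.
via_unstack = df.unstack().reset_index(drop=True)

print(row_major.tolist())
print(col_major.tolist())
```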

DYZ

score 0 · Answer 4 · answered Jun 04 '20 at 20:17

As always, there are many ways to "skin a cat" in Pandas, and then performance may become the criterion. This is a meta-answer that compares the performance:

ravel by Divakar: 80 us
stack by DYZ: 640 us
melt by Quang Hoang: 2.03 ms
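
A comparison like this can be reproduced with the standard timeit module. The sketch below is only one way to set it up: the frame size (1000 rows) and the repeat count are assumptions chosen for illustration, so absolute numbers will differ from the ones listed above and will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed test frame; the size is arbitrary.
df = pd.DataFrame(np.random.rand(1000, 3), columns=["X", "Y", "Z"])

candidates = {
    "ravel": lambda: pd.Series(df.to_numpy().ravel("F")),
    "stack": lambda: df.T.stack().reset_index(drop=True),
    "melt": lambda: df.melt()["value"],
}

repeats = 20  # assumed repeat count
timings = {}
for name, func in candidates.items():
    seconds = timeit.timeit(func, number=repeats)
    timings[name] = seconds / repeats * 1e6  # microseconds per call
    print(f"{name}: {timings[name]:.0f} us per call")

# Sanity check: all three approaches should produce the same values.
results = {name: func().to_numpy() for name, func in candidates.items()}
```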

DYZ
