Convert pandas data frame to series

Question

I'm somewhat new to pandas. I have a pandas data frame that is 1 row by 23 columns.

I want to convert this into a series. I'm wondering what the most pythonic way to do this is?

I've tried pd.Series(myResults) but it complains ValueError: cannot copy sequence with size 23 to array axis with dimension 1. It's not smart enough to realize it's still a "vector" in math terms.

Alexander · Answer 1 · 2021-04-29T19:59:35.163

106

You can squeeze the single-row dataframe into a series (the inverse of to_frame). Passing axis=0 squeezes only the row dimension, so the result is a series indexed by the column labels and no transpose is needed.

import pandas as pd
df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])

>>> df.squeeze(axis=0)
a0    0
a1    1
a2    2
a3    3
a4    4
Name: 0, dtype: int64
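A minimal sketch of why the `axis` argument matters, assuming a hypothetical 1x1 frame named `one_cell` (not part of the question). Without `axis`, `squeeze` collapses every length-1 dimension and returns a scalar; with `axis=0` it only drops the row dimension.

```python
import pandas as pd

# Hypothetical 1x1 frame, used only for illustration.
one_cell = pd.DataFrame([[7]], columns=["a0"])

# No axis: every length-1 dimension is squeezed, leaving a scalar.
scalar = one_cell.squeeze()

# axis=0: only the single row is squeezed, so the result is still a Series
# indexed by the column labels.
series = one_cell.squeeze(axis=0)

print(type(scalar).__name__, type(series).__name__)
```

Specifying the axis keeps the return type predictable when the number of columns is not known in advance.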

Note: To accommodate the point raised by @IanS (even though it is not in the OP's question), test the dataframe's size first. I am assuming that df is a dataframe. The edge cases are an empty dataframe, a dataframe of shape (1, 1), and a dataframe with more than one row, in which case the user should implement their desired functionality.

if df.empty:
    # Empty dataframe, so convert to empty Series.
    result = pd.Series()
elif df.shape == (1, 1):
    # DataFrame with one value, so convert to series with appropriate index.
    result = pd.Series(df.iat[0, 0], index=df.columns)
elif len(df) == 1:
    # Convert to series per OP's question.
    result = df.T.squeeze()
else:
    # Dataframe with multiple rows.  Implement desired behavior.
    pass

This can also be simplified along the lines of the answer provided by @themachinist.

if len(df) > 1:
    # Dataframe with multiple rows.  Implement desired behavior.
    pass
else:
    result = pd.Series() if df.empty else df.iloc[0, :]
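The same checks can be wrapped in a small function. This is only a sketch: the name `frame_to_series` is hypothetical, and raising `ValueError` for multi-row input is an assumption standing in for whatever behavior you actually want.

```python
import pandas as pd


def frame_to_series(df: pd.DataFrame) -> pd.Series:
    """Return the single row of df as a Series (hypothetical helper)."""
    if len(df) > 1:
        # Assumption: multi-row input is treated as an error here.
        raise ValueError("expected at most one row, got {}".format(len(df)))
    if df.empty:
        # An explicit dtype avoids the warning that some pandas versions
        # emit for a bare pd.Series().
        return pd.Series(dtype="float64")
    # iloc[0] yields a Series even for a (1, 1) frame.
    return df.iloc[0]


df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])
print(frame_to_series(df))
```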

edited Apr 29 '21 at 19:59

answered Oct 23 '16 at 22:20

Alexander

15

Note that I ran into a small issue using `squeeze`. For a dataframe of shape `(1, 1)` it will return, not a series of length 1, but a numpy scalar. This led to a hard-to-catch bug when using `squeeze` on objects of unknown length (e.g. with `groupby`). – IanS Jan 12 '17 at 14:54
2

"Thank you! df.squeeze() worked when df.iloc[:,0] & df.ix[:,0] both produced too many indexes error" – Afflatus Feb 25 '17 at 18:06
4

And why is the inverse of `to_frame` not `to_series` or `pd.Series(df)` ...? – Eike P. Apr 11 '18 at 14:14
5

You don't need `.T` – elgehelge Oct 24 '18 at 14:22
4

@IanS pass the argument `df.squeeze(axis=0)` or `df.squeeze(axis=1)` (depending on the axis you want to conserve) to avoid that – Nicolas Fonteyne Oct 29 '20 at 12:46
I think it is wise to always specify the axis to squeeze – Levi Baguley Apr 29 '21 at 17:24

score 82 · Answer 2 · answered Oct 20 '15 at 21:21

> It's not smart enough to realize it's still a "vector" in math terms.

Say rather that it's smart enough to recognize a difference in dimensionality. :-)

I think the simplest thing you can do is select that row positionally using iloc, which gives you a Series with the columns as the new index and the values as the values:

>>> df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])
>>> df
   a0  a1  a2  a3  a4
0   0   1   2   3   4
>>> df.iloc[0]
a0    0
a1    1
a2    2
a3    3
a4    4
Name: 0, dtype: int64
>>> type(_)
<class 'pandas.core.series.Series'>
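One caveat, sketched below under the assumption of an empty frame with the same columns (the name `empty` is made up for illustration): positional selection has nothing to select when there are no rows, so `iloc[0]` raises `IndexError`.

```python
import pandas as pd

df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])
row = df.iloc[0]  # Series indexed by column labels; its name is the row label

# Hypothetical empty frame: same columns, zero rows.
empty = pd.DataFrame(columns=df.columns)
try:
    empty.iloc[0]
    raised = False
except IndexError:
    raised = True

print(row.name, raised)
```

If the frame may be empty, check `df.empty` before indexing.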

answered Oct 20 '15 at 21:21

DSM

2

Or, another way: `df.T` – ako Oct 20 '15 at 21:35
16

@ako: `df.T` doesn't produce a Series, though, just a transposed DataFrame. – DSM Oct 20 '15 at 21:38
1

@DSM. That's true, df.T.iloc[0] – Antonio Andrés Sep 14 '20 at 13:32
1

The only problem with using `df.iloc` is that if you have an empty df, this will raise an `IndexError`. To avoid that, after transposing your df, use the `df.squeeze` method. Ref. to https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.squeeze.html – Nicolas Fonteyne Oct 29 '20 at 12:42
@DSM Yes, using `df.iloc` is the best way! – Sherman Chen Apr 04 '23 at 09:39

score 38 · Answer 3 · answered Oct 20 '15 at 21:33

You can retrieve the series by slicing your dataframe with either of these two indexers:

iloc: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.iloc.html
loc: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.loc.html

import pandas as pd
import numpy as np
df = pd.DataFrame(data=np.random.randn(1,8))

series1 = df.iloc[0, :]
print(type(series1))  # <class 'pandas.core.series.Series'>
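A short sketch comparing the two indexers. The frame and the row label `"only_row"` are hypothetical, chosen so that position and label are visibly different things.

```python
import numpy as np
import pandas as pd

# Hypothetical one-row frame with a string row label.
df = pd.DataFrame(np.arange(8).reshape(1, 8), index=["only_row"])

by_position = df.iloc[0, :]       # first row, by integer position
by_label = df.loc["only_row", :]  # same row, by index label

print(by_position.equals(by_label))
```

With the default integer index the two calls look identical (`df.iloc[0, :]` and `df.loc[0, :]`), but `loc` is looking up the label 0 rather than the first position.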

score 11 · Accepted Answer · edited Jun 04 '21 at 02:33

If you have a one column dataframe df, you can convert it to a series:

df.iloc[:,0]  # pandas Series

Since you have a one row dataframe df, you can transpose it so you're in the previous case:

df.T.iloc[:,0]
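As a quick check, assuming the same five-column example frame used elsewhere in this thread, the transposed selection and direct row selection give the same Series:

```python
import pandas as pd

df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])

via_transpose = df.T.iloc[:, 0]  # transpose, then take the only column
direct = df.iloc[0]              # take the only row directly

print(via_transpose.equals(direct))
```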

edited Jun 04 '21 at 02:33

talonmies

answered Mar 24 '21 at 16:27

nicolass

score 10 · Answer 5 · edited Jun 29 '20 at 11:58

You can also use stack()

df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])

After creating df, run:

df.stack()

This returns the data as a Series, indexed by a MultiIndex of (row label, column name).
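A minimal sketch of what `stack()` returns for the one-row case, and one way to get back to a plain column-labelled index. Using `droplevel(0)` here is an addition for illustration, not part of the answer.

```python
import pandas as pd

df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])

stacked = df.stack()         # Series indexed by (row label, column label)
flat = stacked.droplevel(0)  # drop the row level, keep the column labels

print(stacked.index.nlevels, list(flat.index))
```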

edited Jun 29 '20 at 11:58

Ersoy

answered Jun 29 '20 at 11:49

Clever Omo

`stack()` is the only solution robust enough not to return a single element instead of the expected single column... – mirekphd Aug 29 '21 at 10:05

score 6 · Answer 6 · edited Feb 06 '21 at 21:41

Another way -

Suppose myResult is the DataFrame that contains your data as 1 column and 23 rows:

# label your columns by passing a list of names
myResult.columns = ['firstCol']

# fetch the column in this way, which will return you a series
myResult = myResult['firstCol']

print(type(myResult))

In a similar fashion, you can get a Series from a DataFrame with multiple columns by selecting one column by name.
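For the multi-column case, a small sketch with a hypothetical two-column frame (the name `secondCol` is made up). Single brackets return a Series, while a list of names returns a DataFrame even when it holds one column.

```python
import pandas as pd

# Hypothetical two-column frame, used only for illustration.
frame = pd.DataFrame({"firstCol": [1, 2, 3], "secondCol": [4.0, 5.0, 6.0]})

first = frame["firstCol"]          # single brackets: a Series
still_frame = frame[["firstCol"]]  # list of names: a one-column DataFrame

print(type(first).__name__, type(still_frame).__name__)
```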

edited Feb 06 '21 at 21:41

flyingdutchman

answered Jan 15 '19 at 03:51

Tauseef Malik

score 1 · Answer 7 · answered Oct 17 '19 at 05:57

data = pd.DataFrame({"a":[1,2,3,34],"b":[5,6,7,8]})
new_data = pd.melt(data)
new_data.set_index("variable", inplace=True)

This gives a dataframe indexed by the column names of data, with all of the data in the "value" column.
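The melted result is still a DataFrame. A minimal sketch of the extra step needed to reach a Series, assuming the default `melt` column names `"variable"` and `"value"`:

```python
import pandas as pd

data = pd.DataFrame({"a": [1, 2, 3, 34], "b": [5, 6, 7, 8]})

# melt() names its output columns "variable" and "value" by default.
melted = pd.melt(data).set_index("variable")
as_series = melted["value"]  # selecting one column yields a Series

print(type(as_series).__name__, len(as_series))
```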

answered Oct 17 '19 at 05:57

user12230680

6

Welcome to Stack Overflow! How does this answer the question? Your code doesn't return a Series like the question asks – Gricey Oct 17 '19 at 06:28

score 1 · Answer 8 · edited Jun 12 '22 at 14:06

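Another simple way:

df = df.iloc[3].reset_index(drop=True).squeeze()

Here iloc[3] selects the fourth row, which is already a Series. reset_index(drop=True) replaces the column labels with a default integer index, and squeeze() leaves a Series with more than one element unchanged.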

edited Jun 12 '22 at 14:06

Suraj Rao

answered Jun 12 '22 at 13:51

Oscar Rangel
