148

I'm somewhat new to pandas. I have a pandas data frame that is 1 row by 23 columns.

I want to convert this into a series. I'm wondering what the most pythonic way to do this is?

I've tried pd.Series(myResults) but it complains ValueError: cannot copy sequence with size 23 to array axis with dimension 1. It's not smart enough to realize it's still a "vector" in math terms.

raiyan22
user1357015

8 Answers

106

You can squeeze the single-row dataframe into a series (the inverse of to_frame). Passing axis=0 collapses only the row axis, so no transpose is needed.

>>> import pandas as pd
>>> df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])

>>> df.squeeze(axis=0)
a0    0
a1    1
a2    2
a3    3
a4    4
Name: 0, dtype: int64
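
As a quick check of the "inverse of to_frame" remark, here is a minimal sketch. The names row and restored are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Same single-row frame as above.
df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])

# Squeeze only the row axis: the column labels become the Series index.
row = df.squeeze(axis=0)

# Going back: to_frame() yields a single column, so transpose to get one row.
restored = row.to_frame().T

print(type(row).__name__)  # Series
print(restored.shape)      # (1, 5)
```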

Note: To accommodate the point raised by @IanS (even though it is not in the OP's question), test for the dataframe's size. I am assuming that df is a dataframe, but the edge cases are an empty dataframe, a dataframe of shape (1, 1), and a dataframe with more than one row, in which case the user should implement their desired functionality.

if df.empty:
    # Empty dataframe, so convert to empty Series.
    result = pd.Series()
elif df.shape == (1, 1):
    # DataFrame with one value, so convert to series with appropriate index.
    result = pd.Series(df.iat[0, 0], index=df.columns)
elif len(df) == 1:
    # Convert to series per OP's question.
    result = df.T.squeeze()
else:
    # Dataframe with multiple rows.  Implement desired behavior.
    pass

This can also be simplified along the lines of the answer provided by @themachinist.

if len(df) > 1:
    # Dataframe with multiple rows.  Implement desired behavior.
    pass
else:
    result = pd.Series() if df.empty else df.iloc[0, :]
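
The (1, 1) edge case raised in the comments can be reproduced with a small sketch. The helper name row_to_series is hypothetical, and raising on multi-row input is only an assumption standing in for "implement desired behavior".

```python
import pandas as pd


def row_to_series(df: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: turn a frame with at most one row into a Series."""
    if len(df) > 1:
        # Assumption: multi-row input is treated as an error here.
        raise ValueError("expected at most one row, got {}".format(len(df)))
    if df.empty:
        return pd.Series(dtype="float64")
    # axis=0 squeezes only the row axis, so a (1, 1) frame still yields a Series.
    return df.squeeze(axis=0)


single = pd.DataFrame([[7]], columns=["a0"])

scalar = single.squeeze()       # both axes squeezed: a scalar, not a Series
series = row_to_series(single)  # a Series of length 1 indexed by "a0"
```

Always passing the axis, as suggested in the comments, is what keeps the return type predictable here.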
Alexander
  • 15
    Note that I ran into a small issue using `squeeze`. For a dataframe of shape `(1, 1)` it will return, not a series of length 1, but a numpy scalar. This led to a hard-to-catch bug when using `squeeze` on objects of unknown length (e.g. with `groupby`). – IanS Jan 12 '17 at 14:54
  • 2
    "Thank you! df.squeeze() worked when df.iloc[:,0] & df.ix[:,0] both produced too many indexes error" – Afflatus Feb 25 '17 at 18:06
  • 4
    And why is the inverse of `to_frame` not `to_series` or `pd.Series(df)` ...? – Eike P. Apr 11 '18 at 14:14
  • 5
    You don't need `.T` – elgehelge Oct 24 '18 at 14:22
  • 4
    @IanS pass the argument `df.squeeze(axis=0)` or `df.squeeze(axis=1)` (depending on the axis you want to conserve) to avoid that – Nicolas Fonteyne Oct 29 '20 at 12:46
  • I think it is wise to always specify the axis to squeeze – Levi Baguley Apr 29 '21 at 17:24
82

It's not smart enough to realize it's still a "vector" in math terms.

Say rather that it's smart enough to recognize a difference in dimensionality. :-)

I think the simplest thing you can do is select that row positionally using iloc, which gives you a Series with the columns as the new index and the values as the values:

>>> df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])
>>> df
   a0  a1  a2  a3  a4
0   0   1   2   3   4
>>> df.iloc[0]
a0    0
a1    1
a2    2
a3    3
a4    4
Name: 0, dtype: int64
>>> type(_)
<class 'pandas.core.series.Series'>
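
Two details are easy to trip over with iloc, shown here as a rough sketch (the column names n and label are made up for illustration): an integer position returns a Series while a list of positions keeps a DataFrame, and a row that spans columns of different dtypes comes back as object dtype.

```python
import pandas as pd

df = pd.DataFrame({"n": [1], "label": ["x"]})

as_series = df.iloc[0]    # integer position: returns a Series
as_frame = df.iloc[[0]]   # list of positions: stays a DataFrame

# A row that spans int and string columns is held as object dtype.
print(as_series.dtype)    # object
```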
DSM
38

You can retrieve the series by slicing your dataframe with either iloc or loc:

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.iloc.html
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.loc.html

import pandas as pd
import numpy as np
df = pd.DataFrame(data=np.random.randn(1,8))

series1=df.iloc[0,:]
type(series1)
pandas.core.series.Series
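
For completeness, a minimal sketch of the loc variant next to iloc. The index label "run_1" is an assumption added for illustration, so that label-based and position-based selection are visibly different calls.

```python
import numpy as np
import pandas as pd

# Single-row frame with a hypothetical row label.
df = pd.DataFrame(data=np.random.randn(1, 8), index=["run_1"])

by_position = df.iloc[0, :]    # first row by integer position
by_label = df.loc["run_1", :]  # same row by its index label
```

Both calls should return the same Series; loc is the one to reach for when the row label is known and its position is not.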
themachinist
11

If you have a one-column dataframe df, you can convert it to a series:

df.iloc[:,0]  # pandas Series

Since you have a one-row dataframe df, you can transpose it so that you are back in the previous case:

df.T.iloc[:,0]
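
A small sketch, using the same example frame as the other answers, to compare the transpose route with selecting the row directly. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])

via_transpose = df.T.iloc[:, 0]  # transpose, then take the only column
direct = df.iloc[0]              # take the only row without transposing

same = via_transpose.equals(direct)
```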
talonmies
nicolass
10

You can also use stack()

df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])

After creating df, run:

df.stack()

This returns the data as a Series whose index is a MultiIndex of (row label, column name).
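
A minimal sketch of what stack() gives back. Dropping the row level with droplevel(0) is an optional extra step, added here as an assumption about wanting the column names alone as the index.

```python
import pandas as pd

df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])

# stack() moves the column labels into an inner index level.
stacked = df.stack()

# Optional: drop the outer (row) level to keep only the column names.
flat = stacked.droplevel(0)
```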

Ersoy
Clever Omo
  • `stack()` is the only solution robust enough not to return a single element instead of the expected single column... – mirekphd Aug 29 '21 at 10:05
6

Another way:

Suppose myResult is a DataFrame that holds your data as 1 column and 23 rows.

# label your columns by passing a list of names
myResult.columns = ['firstCol']

# fetch the column in this way, which will return you a series
myResult = myResult['firstCol']

print(type(myResult))

In the same way, you can get a Series from a DataFrame with multiple columns by selecting a single column label.
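
A short sketch of the difference between selecting with a single label and with a list of labels. The 23-row frame is a made-up stand-in for myResult, and the name still_frame is illustrative.

```python
import pandas as pd

# Hypothetical stand-in for myResult: 23 rows, 1 column.
myResult = pd.DataFrame(list(range(23)))
myResult.columns = ["firstCol"]

column = myResult["firstCol"]         # single label: a Series
still_frame = myResult[["firstCol"]]  # list of labels: still a DataFrame
```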

flyingdutchman
Tauseef Malik
1
data = pd.DataFrame({"a":[1,2,3,34],"b":[5,6,7,8]})
new_data = pd.melt(data)
new_data.set_index("variable", inplace=True)

This gives a dataframe whose index holds the original column names of data, with all of the data in the "value" column.
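
To end up with an actual Series from the melted frame, one option is to select its "value" column. This is a sketch of that extra step, not part of the original answer.

```python
import pandas as pd

data = pd.DataFrame({"a": [1, 2, 3, 34], "b": [5, 6, 7, 8]})

# melt() produces the columns "variable" and "value" by default.
melted = pd.melt(data).set_index("variable")

# Selecting the single remaining column returns a Series.
values = melted["value"]
```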

  • 6
    Welcome to Stack Overflow! How does this answer the question? Your code doesn't return a Series like the question asks – Gricey Oct 17 '19 at 06:28
1

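Another simple way:

df = df.iloc[3].reset_index(drop=True).squeeze()

Here iloc[3] already returns the row at position 3 as a Series (use iloc[0] for a single-row dataframe), and reset_index(drop=True) replaces the column labels in its index with a default integer index. squeeze() leaves a Series with more than one element unchanged.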

Suraj Rao
Oscar Rangel