How to get the first column of a pandas DataFrame as a Series?

Question

I tried:

import pandas
x = pandas.DataFrame(...)
s = x.take([0], axis=1)

But s ends up as a DataFrame, not a Series.

herrfz · Accepted Answer · 2018-10-23T08:32:05.070

154

>>> import pandas as pd
>>> df = pd.DataFrame({'x' : [1, 2, 3, 4], 'y' : [4, 5, 6, 7]})
>>> df
   x  y
0  1  4
1  2  5
2  3  6
3  4  7
>>> s = df.ix[:,0]
>>> type(s)
<class 'pandas.core.series.Series'>
>>>

===========================================================================

UPDATE

If you're reading this after June 2017: ix was deprecated in pandas 0.20.0 and removed in pandas 1.0, so don't use it. Use loc or iloc instead. See comments and other answers to this question.
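
As a minimal sketch of what the update amounts to, assuming the same `df` as in the session above (the names `by_position` and `by_label` are illustrative only): `iloc` selects by integer position, while `loc` selects by label.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [4, 5, 6, 7]})

# Position-based: the first column, whatever it is called.
by_position = df.iloc[:, 0]

# Label-based: the column whose label is 'x'.
by_label = df.loc[:, 'x']

print(type(by_position).__name__)    # Series
print(by_position.equals(by_label))  # True
```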

edited Oct 23 '18 at 08:32

answered Mar 12 '13 at 13:33

herrfz

How can I get column 'y' as a Series and column 'x' as its index? – LWZ Jan 06 '14 at 02:01
5

`df.set_index('x').y` – herrfz Jan 10 '14 at 23:44
5

Would be worth adding the .iloc alternative (as proposed by Jeff further down on this page), as it is not ambiguous in the presence of columns with numbers for names. – sapo_cosmico Apr 16 '16 at 14:04
6

The answer was given in 2013; as far as I remember, `.iloc` wasn't there yet back then. In 2016, the correct answer is Jeff's (after all he's `pandas` God, mind you ;-)). I'm not sure what's SO's policy regarding update of answers due to API change; I'm honestly surprised by the number of votes for this answer, didn't think it was that useful to people... – herrfz Apr 18 '16 at 16:06
2

Another note: `ix` was [deprecated](http://pandas.pydata.org/pandas-docs/version/0.20/whatsnew.html#whatsnew-0200-api-breaking-deprecate-ix) in version 0.20. – ayhan Jun 17 '17 at 17:54
9

`ix` should not be used anymore, use `iloc` instead: `s = df.iloc[:,0]`. See [this post](https://stackoverflow.com/a/31593712/3388962) for a comparison of `iloc` and `ix`. – normanius Oct 02 '17 at 14:54
Wouldn't an update that actually says what the update IS be better than just telling people to look elsewhere? – Diesel Mar 07 '20 at 17:44

score 151 · Answer 2 · edited Jan 22 '19 at 09:14

From v0.11+, ... use df.iloc.

In [7]: df.iloc[:,0]
Out[7]: 
0    1
1    2
2    3
3    4
Name: x, dtype: int64
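
Positional selection matters most when column labels are themselves integers. The following is only a sketch with a hypothetical frame (the labels `10` and `0` are made up for illustration): `df[0]` looks up the label `0`, whereas `df.iloc[:, 0]` always returns the column in first position.

```python
import pandas as pd

# Hypothetical frame whose integer labels do not match the column positions.
df = pd.DataFrame({10: [1, 2, 3], 0: [4, 5, 6]})

first_by_position = df.iloc[:, 0]  # the column labelled 10
labelled_zero = df[0]              # the column labelled 0, which sits in second position

print(first_by_position.tolist())  # [1, 2, 3]
print(labelled_zero.tolist())      # [4, 5, 6]
```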

edited Jan 22 '19 at 09:14

cs95

answered Mar 12 '13 at 14:49

Jeff

3

This is the most compatible version with the new releases and also with the old ones. And probably the most efficient since the dev team is officially promoting this approach. – gaborous Feb 15 '18 at 23:50

score 124 · Answer 3 · answered Mar 12 '13 at 12:42

You can get the first column as a Series with the following code:

x[x.columns[0]]

answered Mar 12 '13 at 12:42

HYRY

how can i get the last column like that? – Polly Sep 22 '16 at 15:18
The others work fine as well, but this one seems more intuitive. – elPastor Dec 09 '16 at 01:20
7

This is no good if you have multiple columns with the same name. Whether column names should be unique or not is a separate discussion. – Vishal Oct 10 '17 at 09:13
@Polly `x[x.columns[x.columns.size-1]]` – fujianjin6471 Apr 21 '20 at 08:02
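
The duplicate-name caveat raised above can be illustrated with a small sketch. The frame below is hypothetical and reuses the label `'a'` on purpose; with repeated labels, label-based lookup returns every matching column, while `iloc` still returns a single Series.

```python
import pandas as pd

# Hypothetical frame with a repeated column label.
x = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'a'])

by_label = x[x.columns[0]]  # label 'a' matches both columns
by_position = x.iloc[:, 0]  # exactly one column

print(type(by_label).__name__)     # DataFrame
print(type(by_position).__name__)  # Series
```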

score 13 · Answer 4 · edited Jun 17 '17 at 17:40

Isn't this the simplest way?

By column name:

In [20]: df = pd.DataFrame({'x' : [1, 2, 3, 4], 'y' : [4, 5, 6, 7]})
In [21]: df
Out[21]:
    x   y
0   1   4
1   2   5
2   3   6
3   4   7

In [23]: df.x
Out[23]:
0    1
1    2
2    3
3    4
Name: x, dtype: int64

In [24]: type(df.x)
Out[24]:
pandas.core.series.Series

edited Jun 17 '17 at 17:40

ImportanceOfBeingErnest

answered Dec 23 '16 at 05:30

SamJ

11

In this particular case you know the name of the first column ("x"), but what the question meant was: "How can I access the first column, REGARDLESS of its name". Also, accessing columns like this (`df.x`) is not generic: what if the column name contains spaces? What if the name of the column coincides with the name of a `DataFrame` attribute? It's more general to access columns using `__getitem__` (i.e. like so: `df["x"]`). – ponadto Mar 10 '17 at 06:58
3

Also doesn't work if the column's header has e.g. spaces in it. – Jean-François Corbett Jan 24 '18 at 12:54
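
Both objections from the comments can be reproduced in a short sketch. The column names `'unit price'` and `'count'` are hypothetical, chosen only because one contains a space and the other collides with the existing `DataFrame.count` method.

```python
import pandas as pd

# Hypothetical column names chosen to trigger both problems.
df = pd.DataFrame({'unit price': [1.5, 2.0], 'count': [3, 4]})

price = df['unit price']    # attribute access cannot express a name with a space
count_column = df['count']  # bracket access returns the column as a Series
count_attribute = df.count  # attribute access returns the DataFrame.count method

print(type(count_column).__name__)  # Series
print(callable(count_attribute))    # True
```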

score 4 · Answer 5 · answered Apr 07 '18 at 23:06

This works well when you want to load a Series from a CSV file:

import pandas as pd
x = pd.read_csv('x.csv', index_col=False, names=['x'], header=None).iloc[:, 0]
print(type(x))
print(x.head(10))


<class 'pandas.core.series.Series'>
0    110.96
1    119.40
2    135.89
3    152.32
4    192.91
5    177.20
6    181.16
7    177.30
8    200.13
9    235.41
Name: x, dtype: float64
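
To try the same pattern without a file on disk, here is a hedged sketch that feeds `read_csv` from an in-memory buffer. The `csv_text` contents are made-up stand-ins for `x.csv`, assumed to hold one value per line with no header row.

```python
import io

import pandas as pd

# In-memory stand-in for 'x.csv' (hypothetical data, one value per line, no header).
csv_text = "110.96\n119.40\n135.89\n"

x = pd.read_csv(io.StringIO(csv_text), names=['x'], header=None).iloc[:, 0]

print(type(x).__name__)  # Series
print(x.name)            # x
```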

score 4 · Answer 6 · answered Jul 07 '20 at 17:51

df[df.columns[i]]

where i is the position of the column (starting from 0).

So, i = 0 is for the first column.

You can also get the last column using i = -1.
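
A small sketch of the negative index, assuming the two-column `df` used in the other answers (the names `last_by_label` and `last_by_position` are illustrative). Note that this route goes through the column label, so it only returns a Series when that label is unique.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [4, 5, 6, 7]})

i = -1  # negative positions count from the end, so -1 is the last column
last_by_label = df[df.columns[i]]
last_by_position = df.iloc[:, i]

print(last_by_label.name)                      # y
print(last_by_label.equals(last_by_position))  # True
```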

answered Jul 07 '20 at 17:51

BlackList96
