108

I've been looking around for this but I can't seem to find it (though it must be extremely trivial).

The problem that I have is that I would like to retrieve the value of a column for the first and last entries of a data frame. But if I do:

df.ix[0]['date']

I get:

datetime.datetime(2011, 1, 10, 16, 0)

but if I do:

df[-1:]['date']

I get:

myIndex
13         2011-12-20 16:00:00
Name: mydate

with a different format. Ideally, I would like to be able to access the value of the last index of the data frame, but I can't find how.

I even tried to create a column (IndexCopy) with the values of the index and try:

df.ix[df.tail(1)['IndexCopy']]['mydate']

but this also yields a different format (since df.tail(1)['IndexCopy'] does not output a simple integer).

Any ideas?

DSM
elelias

7 Answers

176

The former answer is now superseded by .iloc:

>>> df = pd.DataFrame({"date": range(10, 64, 8)})
>>> df.index += 17
>>> df
    date
17    10
18    18
19    26
20    34
21    42
22    50
23    58
>>> df["date"].iloc[0]
10
>>> df["date"].iloc[-1]
58

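Original answer (older pandas versions only; .iget() has since been removed, so prefer .iloc as shown above). The shortest way I can think of uses .iget():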

>>> df = pd.DataFrame({"date": range(10, 64, 8)})
>>> df.index += 17
>>> df
    date
17    10
18    18
19    26
20    34
21    42
22    50
23    58
>>> df['date'].iget(0)
10
>>> df['date'].iget(-1)
58

Alternatively:

>>> df['date'][df.index[0]]
10
>>> df['date'][df.index[-1]]
58

There's also .first_valid_index() and .last_valid_index(), but depending on whether or not you want to rule out NaNs they might not be what you want.
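To make the NaN caveat concrete, here is a minimal sketch using a made-up series (the values and labels are assumptions for illustration, not the asker's data). It compares the label of the last row with the label of the last non-NaN entry:

```python
import pandas as pd

# Hypothetical series whose last entry is missing (NaN).
s = pd.Series([10.0, 18.0, float("nan")], index=[17, 18, 19])

last_label = s.index[-1]             # label of the last row, NaN or not
last_valid = s.last_valid_index()    # label of the last non-NaN entry
first_valid = s.first_valid_index()  # label of the first non-NaN entry

print(last_label, last_valid, first_valid)
```

If the last row holds NaN, the two "last" labels differ, so which one you want depends on whether missing values should count.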

Remember that df.ix[0] doesn't give you the first row, but the row whose index label is 0. For example, in the above case, df.ix[0] would produce

>>> df.ix[0]
Traceback (most recent call last):
  File "<ipython-input-489-494245247e87>", line 1, in <module>
    df.ix[0]
[...]
KeyError: 0
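
Since .ix was deprecated and later removed from pandas (it is gone as of 1.0), the same distinction can be sketched with .loc (label-based) and .iloc (position-based). This is only an illustration on the example frame above; the variable names are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"date": list(range(10, 64, 8))})
df.index += 17  # labels now run from 17 to 23, so no row is labelled 0

first_by_position = df["date"].iloc[0]   # row at position 0
last_by_position = df["date"].iloc[-1]   # row at the last position
first_by_label = df.loc[17, "date"]      # row whose label is 17

# Asking for label 0 fails, because no row carries that label.
try:
    df.loc[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False

print(first_by_position, last_by_position, first_by_label, label_zero_found)
```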
DSM
  • thanks for your answer. However, I have another data frame in which df.ix[0] seems to give the first row of the data frame, even though the first index is not 0. In particular, the result of df.index[0] is not 0, and yet df.ix[df.index[0]] and df.ix[0] do give the same result. Why is that? – elelias Apr 07 '13 at 15:00
  • I'd need to see the index, but I suspect it's because the index is non-numerical, in which case accessing by integer *can* behave like it's an index, and not a key. This is because there's no ambiguity in what you're asking for if you ask for `Something(["A", "B", "C"])[1]`, but what do you want if you have `Something([1,2,3,4])[1]`? Read the various sections [here in the docs](http://pandas.pydata.org/pandas-docs/dev/gotchas.html#integer-indexing) on some of the headaches involved. – DSM Apr 07 '13 at 15:10
  • How to use df['xxx'][df.index[0]] for a float? I have a float 56.7888 and it's converted to 56 instead of 57 – lvthillo Sep 20 '18 at 18:24
  • 3
    Calling `iget()` gives `'Series' object has no attribute 'iget'`. – Suzana Oct 21 '19 at 10:43
  • Using pandas for 4 years, this is the first time i've seen iget in any place of stackoverflow,tutorials, codes etc. – Mehmet Burak Sayıcı Jan 13 '21 at 18:37
27

Combining @comte's answer and dmdip's answer in "Get index of a row of a pandas dataframe as an integer":

df.tail(1).index.item()

gives you the value of the index.


Note that index labels are not always unique, whether the frame is multi-indexed or single-indexed, so modifying dataframes through index labels might result in unexpected behavior. The example below uses a multi-index, but the same is true for a single index.

Say we have

df = pd.DataFrame({'x':[1,1,3,3], 'y':[3,3,5,5]}, index=[11,11,12,12]).stack()

11  x    1
    y    3
    x    1
    y    3
12  x    3
    y    5              # the index is (12, 'y')
    x    3
    y    5              # the index is also (12, 'y')

df.tail(1).index.item() # gives (12, 'y')

Trying to access the last element with the index df[12, "y"] yields

(12, y)    5
(12, y)    5
dtype: int64

If you attempt to modify the dataframe based on the index (12, y), you will modify two rows rather than one. So even though we now know how to get the index value of the last row, using it to change the last row's values might not be a good idea, because several rows can share the same index. In this case, use df.iloc[-1] to access the last row instead.
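
As a rough sketch of the single-index case mentioned above (the series and its values are assumptions chosen for illustration), assigning by label touches every row that shares the label, while assigning by position touches only the last row:

```python
import pandas as pd

# Hypothetical series with duplicated index labels.
s = pd.Series([1, 3, 3, 5], index=[11, 11, 12, 12])

by_label = s.copy()
by_label.loc[12] = 0      # label-based: both rows labelled 12 change

by_position = s.copy()
by_position.iloc[-1] = 0  # position-based: only the last row changes

print(by_label.tolist())
print(by_position.tolist())
```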

Reference

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Index.item.html

Tai
8
df.tail(1).index 

seems the most readable
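
One thing to keep in mind is that this expression returns an Index object of length one rather than a bare label. A small sketch on a made-up four-row frame (an assumption for illustration) shows the difference:

```python
import pandas as pd

df = pd.DataFrame({"A": [0.0, 0.0, 0.0, 0.0]})

as_index = df.tail(1).index       # an Index object holding one label
as_scalar = df.tail(1).index[0]   # the label itself
same_label = df.index[-1]         # the same label, taken from the full index

print(as_index)
print(as_scalar, same_label)
```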

comte
  • This does not return a number but: RangeIndex(start=6, stop=7, step=1) – alexandergs Jun 07 '16 at 03:06
  • 5
    alex: from the returned `index`, the `start=6` indicates the offset of the last element. So, `df.tail(1)` gets the last element, `df["your_column"][6]` would be the last element, for `your_column`, etc (but `df.last_valid_index()` gives you just the number) – michael Nov 27 '17 at 08:04
4

It may be too late now, but I use the index attribute to retrieve the index of a DataFrame, then use [-1] to get its last value:

For example,

import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((4, 1)), columns=['A'])
print(f'df:\n{df}\n')

print(f'Index = {df.index}\n')
print(f'Last index = {df.index[-1]}')

The output is

df:
     A
0  0.0
1  0.0
2  0.0
3  0.0

Index = RangeIndex(start=0, stop=4, step=1)

Last index = 3
yoonghm
4

dataframe_object.index returns all the index labels as an Index object. It supports list-style indexing and slicing, so you can pick any single label or range of labels.

To get the last element index:

dataframe_object.index[-1]

To get the first element index:

dataframe_object.index[0]

To get the index labels of the first x elements:

dataframe_object.index[0:x]

To get the index labels of the last x elements:

dataframe_object.index[-x:]

Example

last_record_index = betweenMeals_df.index[-1]
Arpan Saini
2

You want .iloc with double brackets.

import pandas as pd
df = pd.DataFrame({"date": range(10, 64, 8), "not_date": "fools"})
df.index += 17
df.iloc[[0,-1]][['date']]

You give .iloc a list of indexes - specifically the first and last, [0, -1]. That returns a dataframe from which you ask for the 'date' column. ['date'] will give you a series (yuck), and [['date']] will give you a dataframe.
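
The difference between the two bracket styles can be checked with a short sketch along these lines, reusing the frame from this answer (the variable names are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({"date": list(range(10, 64, 8)), "not_date": "fools"})
df.index += 17

ends = df.iloc[[0, -1]]    # first and last rows, selected by position

as_series = ends["date"]   # single brackets: a Series
as_frame = ends[["date"]]  # double brackets: a one-column DataFrame

print(type(as_series).__name__, type(as_frame).__name__)
print(as_series.tolist())
```

Either form keeps the original index labels (17 and 23 here), so the rows remain identifiable after selection.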

grofte
1

Pandas supports NumPy syntax which allows:

df[len(df) - 1:].index[0]
Quantum