While exploring indexing on pandas.Series I ran into the following inconsistency. When I have a pd.Series with an integer index, I can access a value with the bracket operator and the index LABEL. When I pass an index label that does not exist, I get a KeyError. This behaviour is familiar from pandas.DataFrame, where df[x] accesses a column by its name, and one has to use .iloc to say that the value in the brackets is the index POSITION. However, this is not what happens when the index consists of strings. The bracket operator does not raise an error when the index label does not exist (here sr2[0]); instead it returns the value at that integer position, even though I did not use .iloc. For me, this raises the question of what exactly the bracket operator does in the first place.

import pandas as pd

# Create a Series with integer index
sr1 = pd.Series([10, 20, 30], index=[1, 2, 3], name='sr1')

# index label 1
sr1[1] # Output: 10

# index label 2
sr1[2] # Output: 20

# index integer position 0
sr1[0] # ERROR: KeyError, because 0 is not an index label
sr1.iloc[0] # alternatively, Output: 10


# Create a Series with non-integer (string) index
sr2 = pd.Series([10, 20, 30], index=['a', 'b', 'c'], name='sr2')

# index label 'b'
sr2['b'] # Output: 20

# index integer position 0
sr2[0] # NO ERROR, Output: 10 (the value at position 0)

Thanks a lot!

monkei

1 Answer

The series[…] notation calls __getitem__, which has a quite complex multi-purpose behavior (see the source).

It checks the type of key passed as the indexer (callable, boolean mask, MultiIndex key, …), the type of the Index, and so on, and decides how to handle the key based on all of these. In your case the index holds strings, so the integer key 0 cannot be a label and the lookup falls back to positional, list-like indexing.
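
To see that fallback in isolation, here is a minimal sketch. The helper bracket_lookup is a hypothetical name introduced only for this illustration, not part of pandas. It is written to tolerate both outcomes for sr2[0], because recent pandas 2.x releases emit a FutureWarning for the positional fallback and announce that integer keys will eventually always be treated as labels, so the result may depend on the installed version.

```python
import warnings

import pandas as pd


def bracket_lookup(series, key):
    """Hypothetical helper: return series[key], or None on KeyError."""
    with warnings.catch_warnings():
        # Recent pandas versions warn about the positional fallback.
        warnings.simplefilter("ignore", FutureWarning)
        try:
            return series[key]
        except KeyError:
            return None


sr1 = pd.Series([10, 20, 30], index=[1, 2, 3], name="sr1")
sr2 = pd.Series([10, 20, 30], index=["a", "b", "c"], name="sr2")

# Integer index: integer keys are always treated as labels.
label_hit = bracket_lookup(sr1, 1)   # label 1 exists
label_miss = bracket_lookup(sr1, 0)  # label 0 does not exist

# String index: the integer key may fall back to a position,
# or raise KeyError, depending on the pandas version.
fallback = bracket_lookup(sr2, 0)

print(label_hit, label_miss, fallback)
```

The only version-dependent value here is fallback; the two sr1 lookups behave the same way regardless of that fallback.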

If you want consistently label-based indexing, use loc instead:

sr2.loc[0]
# raises KeyError
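
As a rough side-by-side, the sketch below applies .loc and .iloc to both Series from the question. The helper has_label is a hypothetical name used only for illustration. The point is that neither accessor depends on the index dtype: .loc always interprets the key as a label, .iloc always as a position.

```python
import pandas as pd

sr1 = pd.Series([10, 20, 30], index=[1, 2, 3], name="sr1")
sr2 = pd.Series([10, 20, 30], index=["a", "b", "c"], name="sr2")


def has_label(series, label):
    """Hypothetical helper: True if series.loc[label] succeeds."""
    try:
        series.loc[label]
    except KeyError:
        return False
    return True


# .loc: the key is a label for both Series.
sr1_label_1 = sr1.loc[1]
sr2_label_b = sr2.loc["b"]

# 0 is not a label in either index, so .loc raises KeyError for both.
sr1_has_0 = has_label(sr1, 0)
sr2_has_0 = has_label(sr2, 0)

# .iloc: the key is a position for both Series.
sr1_first = sr1.iloc[0]
sr2_first = sr2.iloc[0]

print(sr1_label_1, sr2_label_b, sr1_has_0, sr2_has_0, sr1_first, sr2_first)
```

Spelling out .loc or .iloc this way avoids relying on how the bare bracket operator resolves an integer key.
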
mozway