What is the difference between using loc[x, y], loc[x][y], and loc[[x], y]? They seem quite similar at first glance.

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(6).reshape(3, 2),
                  columns=['price', 'count'],
                  index=['First', 'Second', 'Third'])
print(df)
#         price  count
# First       0      1
# Second      2      3
# Third       4      5

print(df.loc['Second', 'count'])
# 3

print(df.loc['Second']['count'])
# 3

print(df.loc[['Second'], 'count'])
# Second    3
Konstantin

1 Answer

Although the first two forms produce the same output, the second is what pandas calls chained indexing:

http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

In the second form, the first step, df.loc['Second'], returns a Series:

In[48]:
type(df.loc['Second'])

Out[48]: pandas.core.series.Series

You then index that Series by its label, which returns the scalar value:

In[47]:
df.loc['Second']

Out[47]: 
price    2
count    3
Name: Second, dtype: int32

In[49]:
df.loc['Second']['count']

Out[49]: 3

Regarding the last one, the additional brackets make the row selection return a DataFrame, which is why you see the index value rather than a bare scalar:

In[44]:
type(df.loc[['Second']])

Out[44]: pandas.core.frame.DataFrame

Passing the column label as well then selects the matching column from that DataFrame and returns it as a Series:

In[46]:
type(df.loc[['Second'],'count'])

Out[46]: pandas.core.series.Series

Which form to use depends on what you want to achieve, but avoid the second form: chained indexing may operate on a copy rather than on the original data, which can lead to unexpected behaviour when you try to assign to the column or DataFrame.
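
To illustrate the assignment point, here is a minimal sketch that reuses the df from the question. It only demonstrates the single-call form, since what the chained form does on assignment depends on the pandas version and its copy/view settings; the values 99 and -1 are arbitrary and chosen for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(6).reshape(3, 2),
                  columns=['price', 'count'],
                  index=['First', 'Second', 'Third'])

# Single-step assignment: one .loc call receives both the row and the
# column label, so the write goes directly into df.
df.loc['Second', 'count'] = 99
print(df.loc['Second', 'count'])  # 99

# Chained assignment (left commented out on purpose): df.loc['Second'] is
# evaluated first and may be a temporary copy, so the write may or may not
# reach df, and pandas may emit a warning.
# df.loc['Second']['count'] = -1
```

For reading values either form works; the single-call form is the safer choice whenever the result is the target of an assignment.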

EdChum
  • Thank you very much for the excellent explanation! In the link you provided, some operations use `loc` and some do not. Should we consider `dfmi['one']` to be any different from `dfmi.loc['one']`? – Konstantin May 24 '18 at 10:07
  • @Konstantin it depends on what you do next. If you did `dfmi['one']['Second']`, that would be chained indexing, and it may return a copy or a view. If you just want to look at the data, either approach is fine; it is assignment where problems can occur, in which case follow the recommendations in that link. – EdChum May 24 '18 at 10:10