import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(4, 2), columns=['c1', 'c2'], index=list('abcd'))
# valid slice label
df.loc['a':'d']
# invalid syntax
labels = 'a':'d'; df.loc[labels]

  File "<stdin>", line 7
    labels = 'a':'d'; df.loc[labels]
                ^
SyntaxError: invalid syntax

What is the exact type for wrapping slice labels, something like a 'predicate' or 'criterion'?

user950851
  • It is basically slice notation as one would use for a list, e.g. `my_list[5:10]`. See more here: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html – Alexander Nov 18 '19 at 03:44
  • I think if you are selecting rows by index you should use `pd.Series.iloc`, not `loc`. `loc` mainly selects column names: it can receive an integer (e.g. 5) and get the 5th column, a string as a column label, or a list of strings to select multiple columns, whereas `iloc` can take indexes in the form of a slice, e.g. `pd.Series.iloc[1:5]`, to select rows 1 through 4 of the dataframe. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iloc.html#pandas.DataFrame.iloc – Eliethesaiyan Nov 18 '19 at 03:46
  • @Eliethesaiyan Be careful, you seem to be confusing `loc` and `iloc`. To keep things short, `loc` is selection using labels or boolean arrays, and `iloc` is selection using integer position(s). Both are designed for accessing rows and columns, and with a single argument both select rows in a DataFrame, or elements in a Series. Finally, `.loc[5]` is not at all equivalent to `.iloc[5]`. – AMC Nov 18 '19 at 04:35
  • Why are cases 2 and 3 not arguments to `.loc[]`? It makes understanding the idea of “invalid syntax” more confusing, which is worsened by the lack of example error message(s). What do you mean “overload”? – AMC Nov 18 '19 at 04:38
  • BTW, case 3 just returns a dictionary with one key/value pair. – Alexander Nov 18 '19 at 04:53
  • @Alexander Thanks for your hints. I should have stated my query more clearly, as follows. – user950851 Nov 18 '19 at 05:51

1 Answer

The query is essentially about how pandas implements label slicing internally (which is still a black box to me, as I have not dug into the source code). The full, correct code is:

import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(4, 2), columns=['c1', 'c2'], index=list('abcd'))
# valid slice label
df.loc['a':'d']
# invalid syntax
# labels = 'a':'d'; df.loc[labels]
# valid syntax
labels = pd.IndexSlice['a':'d']
df.loc[labels]

There is also a good illustration of how pandas slicing works in the question "Pandas how does IndexSlice work".
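As a rough sketch of what seems to be going on: the `'a':'d'` notation is only valid inside square brackets, where Python itself turns it into a built-in `slice` object, and `pd.IndexSlice[...]` appears to simply hand that object back. The variable names below (`labels_builtin`, `labels_pandas`, `same_rows`) are my own and only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(4, 2), columns=['c1', 'c2'], index=list('abcd'))

# Outside square brackets the colon notation is a syntax error, so the
# slice has to be built explicitly with the built-in slice() ...
labels_builtin = slice('a', 'd')
# ... or captured through pd.IndexSlice, which accepts the bracket notation.
labels_pandas = pd.IndexSlice['a':'d']

# Inspect what was actually stored in each variable.
print(type(labels_pandas))
print(labels_builtin == labels_pandas)

# Either stored object can be passed to .loc later on.
same_rows = df.loc[labels_builtin].equals(df.loc['a':'d'])
print(same_rows)
```

So for a single-level index, `slice('a', 'd')` should be enough; `pd.IndexSlice` mostly pays off when several slices need to be combined, for example with a MultiIndex.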

Thanks for all the comments.

user950851