I have a question that I would like to understand properly. pandas has two indexers, loc and iloc, for selecting specific rows and columns. loc is label-based, which means we pass the name of the row or column we want to select. iloc is integer-position-based, which means we pass an integer position to select a specific row or column. Suppose I have the data frame below, whose column names are strings:

enter image description here

I can use the loc as below:

enter image description here

Then I rename the columns to numbers, so that the column names become integers, as below:

enter image description here

enter image description here

If I then pass an integer column name to loc, why does this work without errors?

enter image description here

I mean, if loc is for label-based selection, why does it work with column names that are the numbers 0, 1, 2, 3, of type integer?

  • Please don’t post images of code, data or Tracebacks. Copy and paste it as text then format it as code (select it and type `ctrl-k`). [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – wwii Feb 05 '23 at 00:13
  • Because that's what the docs say? Can a label be an integer? What if the *last* column name was 0, does it work? – wwii Feb 05 '23 at 00:15

1 Answer

You misunderstand what a label is. A label does not have to be a string; it is whatever value the index shows on the screen, and that value can be an integer. The confusion arises because we usually use names for columns and numbers for rows.

import pandas as pd

sr = pd.Series([7, 12, 49], index=[11, 12, 13])
print(sr)

# Output
11     7
12    12
13    49
dtype: int64

The first element of this series is at 11 if you use .loc (by label) or at 0 if you use .iloc (by position). Unfortunately, the automatic RangeIndex of pandas starts at 0, so with a default index .loc and .iloc look the same (but they are not!).
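The difference can be checked directly. The following is a minimal sketch that reuses the series above; the names `default_sr` and `missing_label_raised` are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Same series as above: the labels are the integers 11, 12, 13
sr = pd.Series([7, 12, 49], index=[11, 12, 13])

first_by_label = sr.loc[11]      # look up the label 11
first_by_position = sr.iloc[0]   # look up position 0

# With the default RangeIndex, labels and positions happen to coincide
default_sr = pd.Series([7, 12, 49])
same_by_label = default_sr.loc[0]
same_by_position = default_sr.iloc[0]

# The label 0 does not exist in sr, so .loc raises a KeyError
try:
    sr.loc[0]
    missing_label_raised = False
except KeyError:
    missing_label_raised = True
```

Here `sr.loc[11]` and `sr.iloc[0]` both reach the first element, while `sr.loc[0]` fails because no label equals 0.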

import numpy as np

df = pd.DataFrame(np.arange(1, 10).reshape(3, 3), index=list('XYZ'), columns=range(101, 104))
print(df)

   101  102  103
X    1    2    3
Y    4    5    6
Z    7    8    9

To get the value 5, you can use:

# Remember, this is what you see on the screen (this is the label)
>>> df.loc['Y', 102]
5

# Here you have to convert what you see into coordinates (this is the position)
>>> df.iloc[1, 1]
5
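To address the case raised in the comments, where an integer column label does not match its position, here is a small sketch. The frame `df2` and its column order are assumptions made up for illustration; they are not taken from the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: integer column labels deliberately out of positional order
df2 = pd.DataFrame(
    np.arange(1, 10).reshape(3, 3),
    index=list("XYZ"),
    columns=[2, 1, 0],
)

# .loc treats 0 as a label: the column named 0 is the last one
by_label = df2.loc["X", 0]

# .iloc treats 0 as a position: first row, first column
by_position = df2.iloc[0, 0]
```

With this layout `by_label` is 3 and `by_position` is 1, so the same integer selects different cells depending on the indexer.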
Corralien
  • Thanks a lot. This question has been on my mind for a year, since I first asked my lecturer, and I didn't want to forget it because I always try to understand even the simplest details. I was afraid to ask questions on StackOverflow and get banned or downvoted, so thank you so much again. – Mohammad Al Jadallah Feb 05 '23 at 13:33