
I'm relatively new to Python. I've been working on a project and have a DataFrame that seems to give me two different instances for the same index. To explain: when I ask for an instance this way:

df[df.name == 'Marcia']

the result shows Int64Index([92], dtype='int64')

But if I ask for an instance by index = 92 this way:

df.iloc[92]

the result is a different instance, whose 'name' is not 'Marcia'. There is only one 'Marcia' in my dataset. How can this happen?

Vulpex
  • df[df.name == 'Marcia'] should give a dataframe, if it passes, with name column. What is Int64Index([92], dtype='int64') then? – Evgeny Nov 15 '20 at 23:48
  • It gave a dataframe (the original but filtered for name = 'Marcia'), but then came the message indicating that the index of that observation was 92, as stated in the first index column. – Cecilia Gonzalez Nov 16 '20 at 00:37
  • perhaps you could show entire output, `Int64Index([92], dtype='int64')` still a bit unclear – Evgeny Nov 16 '20 at 21:21

1 Answer


df.iloc[92] selects the row at integer position 92 (counting from 0, so the 93rd row), regardless of its index label. If your dataframe has been shuffled, or some rows were deleted during data wrangling, the row with index label 92 may no longer sit at position 92.

Try using df.loc[92] instead, as it returns the row whose index label is 92.
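A minimal sketch of the difference, using made-up data (the names and the dropped row are assumptions for illustration, not your actual dataset). Dropping one row leaves the remaining index labels unchanged, so labels and positions no longer line up:

```python
import pandas as pd

# Hypothetical data: six rows with the default index labels 0..5.
df = pd.DataFrame({"name": ["Ana", "Luis", "Marcia", "Pedro", "Sofia", "Tomas"]})

# Simulate a wrangling step that removes a row; the remaining labels are 1..5.
df = df.drop(index=[0])

# The index label of the 'Marcia' row (2 in this sketch).
label = df[df.name == "Marcia"].index[0]

by_label = df.loc[label, "name"]      # label-based lookup -> 'Marcia'
by_position = df.iloc[label]["name"]  # position-based lookup -> 'Pedro'

# If a position is really needed, translate the label into a position first.
position = df.index.get_loc(label)    # 1 in this sketch
same_row = df.iloc[position]["name"]  # 'Marcia'

print(by_label, by_position, position, same_row)
```

If you would rather have labels and positions agree again, df.reset_index(drop=True) renumbers the rows from 0.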

See "How are iloc and loc different?" for more information.

Rik Kraan