I extracted a few rows from a dataframe into a new dataframe. The new dataframe keeps the old indices. However, when I specify a range on this new dataframe, I have to use it as if it had new indices starting from zero. Why does that work? Whenever I try to use the old indices, it gives an error.

germany_cases = virus_df_2[virus_df_2['location'] == 'Germany']

germany_cases = germany_cases.iloc[:190]

This is the code. The rows that I extracted from the dataframe virus_df_2 have indices between 16100 and 16590. I wanted to take the first 190 rows. In the second line of code I used iloc[:190] and it worked. However, when I tried iloc[16100:16290], it gave an error. What could be the reason?

  • Does this answer your question? [How are iloc and loc different?](https://stackoverflow.com/questions/31593201/how-are-iloc-and-loc-different) – woblob Oct 18 '20 at 08:59

2 Answers

0

pandas offers two main indexing attributes: loc and iloc.

iloc is, as you have noticed, indexing based on integer position: the first row is at position 0 no matter what its index label is, so you can reference the nth row using iloc[n].

To reference rows by their pandas index labels, use the loc attribute. Labels can be set manually and need not be integers; they can also be strings or other hashable objects (ones that define the __hash__ method).
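As a rough illustration of the difference between labels and positions, here is a minimal sketch. The dataframe, its "cases" column, and the string labels are made up for the example and are not from the question.

```python
import pandas as pd

# Hypothetical data: string labels instead of the default 0..n-1 integers.
df = pd.DataFrame({"cases": [10, 20, 30]}, index=["a", "b", "c"])

by_label = df.loc["b", "cases"]    # look up by index label
by_position = df.iloc[1, 0]        # look up by integer position (row 1, column 0)

label_slice = df.loc["a":"b"]      # label slices include both endpoints
position_slice = df.iloc[0:2]      # position slices exclude the stop, like Python lists

print(by_label, by_position)
print(label_slice)
```

Both lookups reach the same cell, and both slices select the same two rows, even though the slice bounds are written differently.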

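In your case, iloc[16100:16290] does not work because 16100 and 16290 are index labels, not positions, and your dataframe only has roughly 490 rows, so those positions lie outside it. Note that an out-of-range iloc slice normally returns an empty dataframe rather than raising, so the error may actually surface in whatever code uses the result. Try loc instead, keeping in mind that loc slices include both endpoints, so loc[16100:16290] can select up to 191 rows.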
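The situation in the question can be reproduced on a small scale. This is only a sketch: the tiny virus_df_2 below and its "new_cases" column are stand-ins I made up, and reset_index is an optional extra for when you would rather have the labels restart at zero.

```python
import pandas as pd

# Hypothetical stand-in for virus_df_2: 5 rows for another country, then 5 for Germany.
virus_df_2 = pd.DataFrame({
    "location": ["France"] * 5 + ["Germany"] * 5,
    "new_cases": range(10),
})

# Filtering keeps the original index labels, here 5..9.
germany_cases = virus_df_2[virus_df_2["location"] == "Germany"]

first_three = germany_cases.iloc[:3]    # positions 0, 1, 2 (labels 5, 6, 7)
same_rows = germany_cases.loc[5:7]      # labels 5 through 7, inclusive
out_of_range = germany_cases.iloc[5:8]  # positions past the last row: empty result

# Optional: discard the old labels so they run 0..4 again.
renumbered = germany_cases.reset_index(drop=True)

print(first_three)
print(out_of_range.empty)
```

After reset_index(drop=True), labels and positions coincide again, so loc and iloc slices differ only in whether the stop value is included.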

The indexing notation can be hard to grasp at first, but it is very helpful in some circumstances, for example when sorting or performing grouping operations.

Vasilis Lemonidis
0

Quick example that might help:

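import pandas as pd

df = pd.DataFrame(
    dict(
        France=[1, 2, 3],
        Germany=[4, 5, 6],
        UK=['x', 'y', 'z'],
    ))
# loc selects the "Germany" column by label; iloc[1:2] then takes the row at position 1
df.loc[:, "Germany"].iloc[1:2]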

Out:

1    5
Name: Germany, dtype: int64

Hope I could help.