I have a CSV file that I load into a pandas DataFrame, and I am practicing the loc method. The CSV file contains a list of James Bond movies, and I am passing letters to loc. I could not interpret the results shown.

import pandas as pd

bond = pd.read_csv("jamesbond.csv", index_col="Film")
bond.sort_index(inplace=True)
bond.head(3)

bond.loc["A": "I"]

The result for the above code is:

A to I

bond.loc["a": "i"]

And the result for the above code is:

a to i

What is happening here? I don't understand these results. Could someone please help me understand how pandas behaves in this case?

Following is the file:

Jamesbond File

Gopal Kisi

2 Answers

Your DataFrame uses the first column ("Film") as its index when it is imported, because of the option index_col = "Film". The index holds the name of each film as a string, and every name starts with a capital letter. bond.loc["A":"I"] returns all films whose index is greater than or equal to "A" and less than or equal to "I" (pandas label slices include the upper bound). By the rules of string comparison in Python, that covers all films beginning with "A" to "H", and would also include a film called exactly "I" if there were one. If you enter e.g. "A" <= "b" <= "I" at the Python prompt you will get False: lower-case letters are not within the range, because ord("b") > ord("I"). For the same reason, bond.loc["a":"i"] should match none of the capitalised titles.
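To make the comparison concrete, here is a minimal sketch. It assumes a small hand-built DataFrame as a stand-in for jamesbond.csv (the five titles and the "Year" column are illustrative, not the actual file contents), and only shows how label slicing on a sorted string index behaves.

```python
import pandas as pd

# Hypothetical stand-in for jamesbond.csv: a few titles used as the index.
bond = pd.DataFrame(
    {"Year": [1985, 2006, 1962, 1964, 2012]},
    index=pd.Index(
        ["A View to a Kill", "Casino Royale", "Dr. No", "Goldfinger", "Skyfall"],
        name="Film",
    ),
).sort_index()

# Label slice: keeps every title t with "A" <= t <= "I" (both ends inclusive).
upper = bond.loc["A":"I"]

# Every title starts with a capital letter, and capitals sort before
# lower-case letters, so no title falls between "a" and "i".
lower = bond.loc["a":"i"]

print(list(upper.index))   # titles beginning with A to H
print(len(lower))          # number of rows matched by the lower-case slice
print("A" <= "b" <= "I")   # plain Python string comparison
print(ord("I"), ord("b"))  # code points behind that comparison
```

"Skyfall" is excluded from the first slice because "S" sorts after "I", and the second slice is empty because "a" sorts after every capitalised title.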

If you wrote bond.index = bond.index.str.lower(), the index would become lower case and you could select films using e.g. bond.loc["a":"i"] (but bond.loc["A":"I"] would no longer return any films).
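A short sketch of the lower-casing approach, again on a hypothetical three-row stand-in rather than the real file. The extra sort_index() call is my own precaution, not part of the suggestion above: lower-casing can change the sort order, and label slices with bounds that are not in the index need a sorted index.

```python
import pandas as pd

# Hypothetical stand-in for the real data.
bond = pd.DataFrame(
    {"Year": [1962, 1964, 2012]},
    index=pd.Index(["Dr. No", "Goldfinger", "Skyfall"], name="Film"),
)

lowered = bond.copy()
lowered.index = lowered.index.str.lower()  # "Dr. No" -> "dr. no", etc.
lowered = lowered.sort_index()             # re-sort after changing the labels

by_lower = lowered.loc["a":"i"]  # now matches titles beginning with a to h
by_upper = lowered.loc["A":"I"]  # no lower-case title lies between "A" and "I"

print(list(by_lower.index))
print(len(by_upper))
```

Working on a copy keeps the original capitalised titles available for display.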

Stuart

DataFrame.loc["A":"I"] returns the rows whose index label starts with a letter in that range, at least from what I saw when I tried to reproduce it. Could you attach the data?

baggiponte