
Given a dataframe named pd like this:

country  1980  1990  2000
India     800  2000  3500
China     200  2000  1500
UK        160   150   400

How can we find the rows where the values of year 2000 are greater than 1000?

I see there are two ways:

pd.loc[pd['2000'] > 1000] 

and

pd[pd['2000'] > 1000]

Is there a difference between the two methods? They produce the same result, but I don't understand whether they actually differ.

Thanks

1 Answer


For plain selection like this, both forms return the same rows. The first is still the better habit: indexing the dataframe directly (instead of using df.loc) can return either a view or a copy of the data depending on how it's used, which matters once you chain a second indexing step and assign to the result. The difference is explained here: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html?highlight=assigning%20view#indexing-view-versus-copy
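As a quick check that the two forms agree for read-only selection, here is a minimal sketch. It rebuilds the table from the question under the name df (an assumption, chosen so the dataframe doesn't shadow the usual pandas alias pd) and assumes the year columns are strings:

```python
import pandas as pd

# Sample data from the question; "df" is used instead of "pd" so the
# pandas alias is not shadowed (naming is an assumption for this sketch).
df = pd.DataFrame({
    "country": ["India", "China", "UK"],
    "1980": [800, 200, 160],
    "1990": [2000, 2000, 150],
    "2000": [3500, 1500, 400],
})

mask = df["2000"] > 1000      # boolean Series, one entry per row

via_loc = df.loc[mask]        # explicit label-based selection
via_brackets = df[mask]       # shorthand boolean indexing

# Both selections contain the same rows and values.
same = via_loc.equals(via_brackets)
print(same)
```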

I don't think it matters unless you're assigning to the selection, but as a rule of thumb it's best to put the row and column criteria together in a single df.loc[] call, for example pd.loc[(pd['2000'] > 1000), 'columnName']
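To illustrate the assignment case, here is a small sketch that selects and then updates a column with one df.loc[] call. The dataframe name df and the choice of the '1980' column as the target are assumptions made for the example:

```python
import pandas as pd

# Same sample data as in the question, named "df" here (an assumption)
# to avoid shadowing the pandas alias.
df = pd.DataFrame({
    "country": ["India", "China", "UK"],
    "1980": [800, 200, 160],
    "1990": [2000, 2000, 150],
    "2000": [3500, 1500, 400],
})

mask = df["2000"] > 1000

# Row criterion and column label in one .loc call.
selected = df.loc[mask, "1980"]

# Assigning through a single .loc call updates df itself.
df.loc[mask, "1980"] = 0

print(df)
```

The chained form df[mask]['1980'] = 0 is the pattern the linked documentation warns about, since the assignment may land on a temporary copy.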

shortorian