Suppose I have a DataFrame df such as:
item A B
0 foo 3 4
1 bar 8 7
2 baz 1 2
I can set the index as follows:
new_df = df.set_index('item')
item A B
foo 3 4
bar 8 7
baz 1 2
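For anyone who wants to reproduce this, here is a minimal sketch of the setup, assuming pandas is imported as pd and that the frame is built from a plain dict (the construction is my assumption; only the values come from the tables above):

```python
import pandas as pd

# Same three rows as in the example above.
df = pd.DataFrame(
    {"item": ["foo", "bar", "baz"], "A": [3, 8, 1], "B": [4, 7, 2]}
)

# set_index returns a new frame; 'item' moves from the columns into the index.
new_df = df.set_index("item")

print(new_df)
print(list(new_df.columns))  # 'item' is no longer listed among the columns
```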
However, assuming 'item' will always be unique (in fact, I want it to be unique to avoid errors during analysis), what are the benefits of replacing the default index with a column of my choosing?
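On the uniqueness point: as far as I can tell, set_index does not reject duplicates by default, so the check has to be requested explicitly. A small sketch of what I mean, using a hypothetical frame dup_df with a repeated 'foo' (this data is made up for illustration and is not part of my real dataset):

```python
import pandas as pd

# Hypothetical data with a duplicated 'item' value.
dup_df = pd.DataFrame(
    {"item": ["foo", "foo", "baz"], "A": [3, 8, 1], "B": [4, 7, 2]}
)

# With the default verify_integrity=False, duplicates are accepted silently.
loose = dup_df.set_index("item")
print(loose.index.is_unique)

# With verify_integrity=True, pandas raises a ValueError on duplicate keys.
try:
    dup_df.set_index("item", verify_integrity=True)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```

So the uniqueness guarantee seems to depend on passing verify_integrity=True or checking index.is_unique afterwards.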
Other than ensuring the indexed column contains unique values (which is important in my case), I can currently only see disadvantages to setting an index. For example, I can no longer filter on this (now indexed) column using loc; this doesn't work:
filtered_df = new_df.loc[new_df['item'] == 'foo']
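A self-contained version of that attempt, assuming the column is 'item' as in the frame above and the same dict-based construction (my assumption). It shows the filter working on the original frame and failing once 'item' has become the index:

```python
import pandas as pd

df = pd.DataFrame(
    {"item": ["foo", "bar", "baz"], "A": [3, 8, 1], "B": [4, 7, 2]}
)
new_df = df.set_index("item")

# On the original frame, 'item' is a column, so the boolean mask works.
filtered_before = df.loc[df["item"] == "foo"]
print(filtered_before)

# After set_index, 'item' is no longer a column, so the lookup itself fails.
try:
    filtered_df = new_df.loc[new_df["item"] == "foo"]
    error = None
except KeyError as exc:
    error = exc

print(repr(error))
```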
I've never set an index in pandas before. Do indexes offer any benefits I'm missing, such as faster lookups or special methods?