I have a dataframe which looks like this (see table). For simplicity's sake, "aapl" is the only ticker
shown. However, the real dataframe has more tickers.
ticker | year | return |
---|---|---|
aapl | 1999 | 1 |
aapl | 2000 | 3 |
aapl | 2000 | 2 |
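To make this reproducible, the sample above can be built roughly as follows. This is a minimal sketch: the column names and values are copied from the table, and integer dtypes are an assumption.

```python
import pandas as pd

# Sample frame matching the table above (dtypes are assumed, not confirmed).
df = pd.DataFrame(
    {
        "ticker": ["aapl", "aapl", "aapl"],
        "year": [1999, 2000, 2000],
        "return": [1, 3, 2],
    }
)
print(df)
```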
What I'd like to do is first group the dataframe by ticker, then by year. Next, I'd like to remove any duplicate years. In the end the dataframe should look like this:
ticker | year | return |
---|---|---|
aapl | 1999 | 1 |
aapl | 2000 | 3 |
I have a working solution, but it's not very "Pandas-esque" and involves for loops. I'm semi-certain that if I come back to the solution in three months, it'll be completely foreign to me.
Right now, I've been working on the following, with little luck:
df = df.groupby('ticker').groupby('year').drop_duplicates(subset=['year'])
This, however, produces the following error:
AttributeError: 'DataFrameGroupBy' object has no attribute 'groupby'
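As far as I can tell, the error occurs because groupby returns a DataFrameGroupBy object rather than a dataframe, so a second .groupby cannot be chained onto it. One possible direction is to name both columns at once instead of grouping twice. The sketch below assumes the first row for each (ticker, year) pair is the one to keep, which matches the desired table; the sample frame is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# Sample frame copied from the table in the question.
df = pd.DataFrame(
    {
        "ticker": ["aapl", "aapl", "aapl"],
        "year": [1999, 2000, 2000],
        "return": [1, 3, 2],
    }
)

# Option 1: no groupby at all. Rows are duplicates when both
# ticker and year match; keep="first" retains the earliest row.
deduped = df.drop_duplicates(subset=["ticker", "year"], keep="first")

# Option 2: group on both columns in one call and take the first row
# of each group. Note that first() skips missing values per column,
# so it can differ from Option 1 when the data contains NaN.
grouped = df.groupby(["ticker", "year"], as_index=False).first()

print(deduped)
print(grouped)
```

Both options give the same rows for this sample. If a different row should survive (for example the last one, or the one with the highest return), the keep argument or a prior sort would need to change accordingly.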
Any help here would be greatly appreciated, thanks.