[I have not found exact answer on that question in those "duplicates", that jezrael suggests.]
I have quite complicated dataframe. I need to group it by 3 columns and then take the most recent (by columns 'date') value in EACH group.
place customer type date sales
LA m a 2018-12-1 1
LA m b 2018-12-1 4
LA m a 2018-12-4 2
LA m b 2018-12-4 5
LA m a 2018-12-3 3
LA m b 2018-12-3 1
LA m a 2018-12-7 4
LA m b 2018-12-7 10
LA f a 2018-12-1 1
LA f b 2018-12-1 1
LA f a 2018-12-4 15
LA f b 2018-12-4 12
LA f a 2018-12-3 31
LA f b 2018-12-3 11
LA f a 2018-12-7 14
LA f b 2018-12-7 40
NY m a 2018-12-1 3
NY m b 2018-12-1 4
NY m a 2018-12-4 1
NY m b 2018-12-4 13
NY m a 2018-12-3 14
NY m b 2018-12-3 8
NY m a 2018-12-7 11
NY m b 2018-12-7 10
I need to group df by place, customer and type, and then in each subgroup select most recent date and corresponding sales:
place customer type date sales
LA m a 2018-12-7 4
b 2018-12-7 10
f a 2018-12-7 40
b 2018-12-7 14
NY m a 2018-12-7 11
b 2018-12-7 10
I tried
gr = df.groupby([df['place'], df['customer'], df['type']]).last()
but this gives the last row, which generally speaking is not always the most recent date (dates can be shuffled)