I have a pandas DataFrame with the following layout:

  title       date  value1
0   ABC   6/2/2018    1900
1   ABC   6/1/2018    1000
2   ABC  5/29/2018     405
3   ABC  3/18/2018     300
4   ABC  3/17/2018      50
5   LMO   6/1/2018     100
6   LMO  5/30/2018      10
7   LMO  5/29/2018       1

I want to create df2, which should contain only the row with the latest date for each title. I am new to Python and pandas, so I am asking for help.

df2:

  title      date  value1
0   ABC  6/2/2018    1900
1   LMO  6/1/2018     100
    Welcome to SO. Please take a moment to read https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples to see how to make a good reproducible pandas example. Images aren't helpful because we can't copy and paste them into sample data. – ALollz Jul 13 '20 at 05:03

2 Answers

1

Try:

df_new = df[df.groupby('title')['date'].transform('max') == df['date']]

df_new:

    title   date        value1
0   ABC     2018-06-02  1900
5   LMO     2018-06-01  100
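
The output above shows dates as 2018-06-02, which suggests the date column was converted to datetime before the comparison. The question does not say how the column is stored, so the following is only a sketch under the assumption that the dates are month/day/year strings. Without the conversion, 'max' compares strings lexicographically, and a value such as '10/1/2018' would rank below '6/1/2018'.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "title": ["ABC", "ABC", "ABC", "ABC", "ABC", "LMO", "LMO", "LMO"],
    "date": ["6/2/2018", "6/1/2018", "5/29/2018", "3/18/2018", "3/17/2018",
             "6/1/2018", "5/30/2018", "5/29/2018"],
    "value1": [1900, 1000, 405, 300, 50, 100, 10, 1],
})

# Assumption: the strings are month/day/year; parse them into real datetimes.
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

# For every row, look up the latest date within that row's title group.
latest = df.groupby("title")["date"].transform("max")

# Keep only the rows whose date equals the group maximum, then renumber.
df2 = df[df["date"] == latest].reset_index(drop=True)
print(df2)
```

If several rows of one title share the latest date, this approach keeps all of them.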
Pygirl
0

First sort the data by date in descending order, then group by title and take the first row of each group.

df.sort_values('date', ascending=False).groupby('title').first()
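
As a fuller sketch of the sort-then-group idea, the snippet below rebuilds the question's data and again assumes the dates are month/day/year strings that need converting, since sorting raw strings would not order them chronologically. Passing as_index=False is an addition here, used so that title stays a regular column as in the desired df2.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "title": ["ABC", "ABC", "ABC", "ABC", "ABC", "LMO", "LMO", "LMO"],
    "date": ["6/2/2018", "6/1/2018", "5/29/2018", "3/18/2018", "3/17/2018",
             "6/1/2018", "5/30/2018", "5/29/2018"],
    "value1": [1900, 1000, 405, 300, 50, 100, 10, 1],
})

# Assumption: the strings are month/day/year; parse them so sorting is chronological.
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

df2 = (
    df.sort_values("date", ascending=False)  # newest rows first
      .groupby("title", as_index=False)      # keep 'title' as a column
      .first()                               # first non-null value per column
)
print(df2)
```

Note that first() picks the first non-null value in each column separately, so with missing values it can combine fields from different rows. When whole rows must be preserved, groupby('title').head(1) or drop_duplicates('title') on the sorted frame may be a safer choice.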
Pramote Kuacharoen