0

I have two data frames

df1:

    ID      Date        Value
0   9560    07/3/2021   25
1   9560    03/03/2021  20
2   9712    12/15/2021  15
3   9712    08/30/2021  10
4   9920    4/11/2021   5

df2:

    ID     Value
0   9560        
1   9712      
2   9920   

In df2, I want to get the latest value from "Value" column of df1 with respect to ID. This is my expected output:

    ID       Value
0   9560     25   
1   9712     15
2   9920     5

How could I achieve it?

Sriram
  • 169
  • 1
  • 9
  • 1
    Does this answer your question? [Pandas Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101) – sushanth Oct 29 '21 at 11:28
  • Try this; `pd.merge(df2, df1[['id', 'value']], on=['id'])` – Daniel Afriyie Oct 29 '21 at 11:41
  • @DanielAfriyie The snippet you mentioned merges all values where I need only the latest value so that I have unique values in ID. Thank You Sir! – Sriram Oct 29 '21 at 12:13

1 Answers1

1

Based on Daniel Afriyie's approach, I came up with this solution:

import pandas as pd


# Setup for demo
df1 = pd.DataFrame(
        columns=['ID', 'Date', 'Value'], 
        data=[
            [9560, '07/3/2021', 25], 
            [9560, '03/03/2021', 20],
            [9712, '12/15/2021', 15],
            [9712, '08/30/2021', 10],
            [9920, '4/11/2021', 5]
        ]
    )

df2 = pd.DataFrame(
        columns=['ID', 'Value'], 
        data=[[9560, None], [9712, None], [9920, None]]
    )


## Actual solution

# Casting 'Date' column to actual dates
df1['Date'] = pd.to_datetime(df1['Date'])
# Sorting by dates
df1 = df1.sort_values(by='Date', ascending=False)
# Dropping duplicates of 'ID' (since it's ordered by date, only the newest of each ID will be kept)
df1 = df1.drop_duplicates(subset=['ID'])


# Merging the values from df1 into the the df2
pf2 = pd.merge(df2[['ID']], df1[['ID', 'Value']]))

output:

     ID  Value
0  9560     25
1  9712     15
2  9920      5
white
  • 601
  • 3
  • 8