3

Similar to one of my previous questions (Merge dataframes on nearest datetime / timestamp), I would like to merge two pandas data frames on two datetime columns using nearest match:

Let A and B be two dataframes as follows:

A = pd.DataFrame({"ID":["A", "A", "C" ,"B", "B"], "init_date":["01/01/2015","07/02/2014","08/02/1999","01/01/1991","06/22/2014"], "fin_date":["04/16/1923","09/24/1945","06/24/1952","11/26/1988","10/05/1990"]})

 In [15]: A
Out[15]: 
  ID    fin_date   init_date
0  A  04/16/1923  01/01/2015
1  A  09/24/1945  07/02/2014
2  C  06/24/1952  08/02/1999
3  B  11/26/1988  01/01/1991
4  B  10/05/1990  06/22/2014


B = pd.DataFrame({"ID":["A", "A", "C" ,"B", "B"], "date":["02/15/2015","06/30/2014","07/02/1999","10/05/1990","06/24/2014"],"fin_date":["12/10/1926","01/01/1944","08/21/1955","12/12/1987","11/05/1991"], "value": ["3","5","1","7","8"] })

 In [11]: B
Out[11]: 
  ID        date    fin_date value
0  A  02/15/2015  12/10/1926     3
1  A  06/30/2014  01/01/1944     5
2  C  07/02/1999  08/21/1955     1
3  B  10/05/1990  12/12/1987     7
4  B  06/24/2014  11/05/1991     8

The resulting data frame should look like the following:

In [21]: C
Out[21]: 
  ID    fin_date   init_date value
0  A  04/16/1923  01/01/2015     3
1  A  09/24/1945  07/02/2014     5
2  C  06/24/1952  08/02/1999     1
3  B  11/26/1988  01/01/1991     7
4  B  10/05/1990  06/22/2014     8

The general problem would potentially not have a close match for neither init_date nor fin_date, however, I would also be interested in a solution when there are exact matches for init_date, for example.

Note that one difficulty is that one match might be closer to the value in init_date than on final date, while a competing match might be the opposite. In that case, I prefer the one that is closer to init_date. To my knowledge, after attempting a similar approach as the one in the link, I get that reindexing by "nearest" is not implemented for multi-indexing.

Thank you and I appreciate your help,

Community
  • 1
  • 1
dleal
  • 2,244
  • 6
  • 27
  • 49
  • Possible duplicate of [pandas merge dataframes on closest timestamp](http://stackoverflow.com/questions/38807890/pandas-merge-dataframes-on-closest-timestamp) – Merlin Aug 29 '16 at 01:06
  • 1
    So, what did you learn from the last time you posted the question? What have you tried, SO is not a coding service. – Merlin Aug 29 '16 at 01:27

1 Answers1

0
pd.merge(A,B['value'],on=['ID','fin_date'],how='left')
Abul
  • 197
  • 2
  • 4
  • 15