2

I'm trying to merge two data frames based on multiple columns

DF1:

   PERIOD MINUTES SECONDS  SHOT_DIST

0       1      11      00        8.2
1       1       4      41        3.2
2       1       1      41       26.6
3       1       0      01       24.7
4       2       3      02       25.1
5       2       0      29        6.5
6       2       0      03       24.5
7       3      11      23       24.4

and DF 2

      PERIOD  MINUTES_REMAINING  SECONDS_REMAINING  SHOT_DISTANCE
1452       1                 11                  0              3
1453       1                  4                 39              1
1454       1                  1                 39             27
1455       1                  0                  0             25
1456       2                  3                  0             25
1457       2                  0                 27              8
1458       2                  0                  1             24
1459       3                 11                 21             25

I want to merge these two dataframes using the first three columns. The problem is that the Seconds columns are not exact matches. But, they all seem to fall within a range of one another.

I cannot just sort and merge by index because the two dataframes are not of the same length (there is missing data in one).

Is there a way to merge these dataframes where the first two columns must be an exact match and the third column needs to meet some minimum threshold (e.g., diff <=2)?

Bradley
  • 2,057
  • 3
  • 14
  • 17
  • See Andy Hayden's and Luke's methods for using difflib.get_close_matches to implement fuzzy match merging with pandas at http://stackoverflow.com/questions/13636848/is-it-possible-to-do-fuzzy-match-merge-with-python-pandas –  Aug 24 '15 at 04:05
  • This sort of worked. I was hoping to avoid some extra steps, but in the end, I got the answer I needed. Ultimately, to follow this method, I had to combine the columns I wanted to merge on to create a unique string variable for each observation. Then, using the instructions in the post you linked, I was able to match a subset of the data where there was an exact 1-1 match for each record. – Bradley Aug 24 '15 at 14:09

0 Answers0