R: fuzzy merge two data frames

Question

I have two data frames:

abc <- data.frame(bin1 = c("0-25K", "25K-50K", "50K+"), group1 = c(1, 1, 2), bin2 = c("0-25", "25-50", "50+"), group2 = c(1, 2, 2))

pqr <- data.frame(bin1 = c("1_0-25K", "2_25K-50K", "3_50K+"), bin2 = c("0,25", "25,50", "50+"))

I want to merge abc and pqr to get

pqr <- data.frame(bin1 = c("1_0-25K", "2_25K-50K", "3_50K+"), group1 = c(1, 1, 2), bin2 = c("0,25", "25,50", "50+"), group2 = c(1, 2, 2))

I looked at a few older posted questions, but none of them have worked for me.

Merging two Data Frames using Fuzzy/Approximate String Matching in R

I'm not quite sure I grasp the logic of how the fuzzy matching should work. Can you be more specific about that? I want to make sure I understand what your underlying principles are so that whatever solution we help you with can be properly generalized. — Benjamin, May 31 '19 at 18:58
Also: Is the idea to use the `fuzzyjoin` package? I actually just learned about it now! One thing I love about StackOverflow. — Benjamin, May 31 '19 at 19:01
Try with `pqr %>% mutate(bin2 = str_replace(bin2, ",", "-")) %>% left_join(abc, by = 'bin2') %>% transmute(bin1 = bin1.x, bin2, group1, group2)` — akrun, May 31 '19 at 19:21
This is just dummy data. Please consider a solution that covers all the variations for fuzzy matching and merging, whether it is a "-" versus a "," or one vector having more characters than the other. – Bruce Wayne, May 31 '19 at 19:31
I am trying to create a data frame where I join on the bins, so that the same bins in both data frames get the same grouping. That's the underlying motive. – Bruce Wayne, May 31 '19 at 19:32

score 0 · Accepted Answer · answered May 31 '19 at 19:53

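This works. `stringdist_inner_join()` from the `fuzzyjoin` package joins rows whose `bin1` values are within a small string distance of each other (the default `max_dist` is 2):

library(dplyr)      # provides the %>% pipe
library(fuzzyjoin)
pqr <- pqr %>% stringdist_inner_join(abc, by = c(bin1 = "bin1"))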
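Note that the join above keeps the key columns from both data frames, suffixed `.x` and `.y`, so the result does not yet have the four-column layout asked for in the question. A minimal sketch of one way to get there, assuming `dplyr` and `fuzzyjoin` are installed; the names `joined` and `result` are only illustrative:

```r
library(dplyr)
library(fuzzyjoin)

# Sample data from the question
abc <- data.frame(
  bin1 = c("0-25K", "25K-50K", "50K+"), group1 = c(1, 1, 2),
  bin2 = c("0-25", "25-50", "50+"),     group2 = c(1, 2, 2)
)
pqr <- data.frame(
  bin1 = c("1_0-25K", "2_25K-50K", "3_50K+"),
  bin2 = c("0,25", "25,50", "50+")
)

# Fuzzy join on bin1; max_dist = 2 is spelled out here so that the
# tolerance is visible (the "1_" style prefix costs two edits).
joined <- pqr %>%
  stringdist_inner_join(abc, by = "bin1", max_dist = 2)

# Keep pqr's own bin labels (.x) and take the groups from abc.
result <- joined %>%
  transmute(bin1 = bin1.x, group1, bin2 = bin2.x, group2)
```

The `max_dist` threshold is data dependent: too small and the prefixed labels stop matching, too large and one bin may match several rows, so it is worth checking the row count of `result` against `pqr`.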

Bruce Wayne

Put it inside a for loop and it works for every column you'd like to join on. – Bruce Wayne, May 31 '19 at 19:54
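The comment does not show the loop itself. As a rough sketch of the idea, the version below swaps in base R's `adist()` (edit distance) in place of `fuzzyjoin`, so it runs without extra packages. For each key column it picks the closest label in `abc` and copies the corresponding group. The `key_to_group` mapping is a hypothetical helper, not something from the original post.

```r
# Sample data from the question
abc <- data.frame(
  bin1 = c("0-25K", "25K-50K", "50K+"), group1 = c(1, 1, 2),
  bin2 = c("0-25", "25-50", "50+"),     group2 = c(1, 2, 2)
)
pqr <- data.frame(
  bin1 = c("1_0-25K", "2_25K-50K", "3_50K+"),
  bin2 = c("0,25", "25,50", "50+")
)

# Hypothetical mapping: key column -> group column to pull from abc
key_to_group <- c(bin1 = "group1", bin2 = "group2")

result <- pqr
for (key in names(key_to_group)) {
  # Edit-distance matrix: rows are pqr labels, columns are abc labels
  d <- adist(pqr[[key]], abc[[key]])
  # Index of the closest abc label for each pqr label (first one on ties)
  nearest <- apply(d, 1, which.min)
  group_col <- key_to_group[[key]]
  result[[group_col]] <- abc[[group_col]][nearest]
}

# Reorder columns to the layout requested in the question
result <- result[, c("bin1", "group1", "bin2", "group2")]
```

Unlike a join with a `max_dist` threshold, a nearest-match lookup always assigns some group, even when the closest label is a poor match, so inspecting the distances in `d` before trusting the result is advisable.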
