I have two data frames, abc and pqr:

abc <- data.frame(bin1 = c("0-25K", "25K-50K", "50K+"), group1 = c(1, 1, 2), bin2 = c("0-25", "25-50", "50+"), group2 = c(1, 2, 2))

pqr <- data.frame(bin1 = c("1_0-25K", "2_25K-50K", "3_50K+"),bin2 = c("0,25", "25,50", "50+"))

I want to merge abc and pqr so that pqr picks up the group columns from abc, giving:

pqr <- data.frame(bin1 = c("1_0-25K", "2_25K-50K", "3_50K+"), group1 = c(1, 1, 2), bin2 = c("0,25", "25,50", "50+"), group2 = c(1, 2, 2))

I looked at a few older questions, including the one below, but none of the solutions worked for me.

Merging two Data Frames using Fuzzy/Approximate String Matching in R

Bruce Wayne
  • I'm not quite sure I grasp the logic of how the fuzzy matching should work. Can you be more specific about that? I want to make sure I understand what your underlying principles are so that whatever solution we help you with can be properly generalized. – Benjamin May 31 '19 at 18:58
  • Also: Is the idea to use the `fuzzyjoin` package? I actually just learned about it now! One thing I love about StackOverflow. – Benjamin May 31 '19 at 19:01
  • Try with `pqr %>% mutate(bin2 = str_replace(bin2, ",", "-")) %>% left_join(abc, by = 'bin2') %>% transmute(bin1 = bin1.x, bin2, group1, group2)` – akrun May 31 '19 at 19:21
  • This is just dummy data; please consider a solution that covers the general cases of fuzzy matching and merging, whether the difference is `-` versus `,` or extra characters in one vector compared with the other. – Bruce Wayne May 31 '19 at 19:31
  • I am trying to create a data frame where I join on the bins, to get the grouping for the same bins in the other data frame. That's the underlying motive. – Bruce Wayne May 31 '19 at 19:32
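
The normalization idea from the comments above can be extended to both bin columns. The following is only a sketch under assumptions: `normalize_bin`, `key1`, and `key2` are hypothetical names introduced here, and the two cleaning rules (drop a leading `<digits>_` prefix, turn `,` into `-`) cover just the differences visible in the dummy data, not fuzzy matching in general.

```r
library(dplyr)

abc <- data.frame(bin1 = c("0-25K", "25K-50K", "50K+"), group1 = c(1, 1, 2),
                  bin2 = c("0-25", "25-50", "50+"), group2 = c(1, 2, 2))
pqr <- data.frame(bin1 = c("1_0-25K", "2_25K-50K", "3_50K+"),
                  bin2 = c("0,25", "25,50", "50+"))

# Hypothetical helper: strip a leading "<digits>_" prefix and
# replace commas with hyphens so both frames use the same labels
normalize_bin <- function(x) {
  x <- sub("^[0-9]+_", "", x)
  gsub(",", "-", x, fixed = TRUE)
}

# Lookup table built from abc: normalized keys plus the group columns
abc_keys <- abc %>%
  transmute(key1 = normalize_bin(bin1), key2 = normalize_bin(bin2),
            group1, group2)

# Exact join on the normalized keys, keeping pqr's original labels
result <- pqr %>%
  mutate(key1 = normalize_bin(bin1), key2 = normalize_bin(bin2)) %>%
  left_join(abc_keys, by = c("key1", "key2")) %>%
  select(bin1, group1, bin2, group2)
```

Joining on cleaned keys is deterministic, which may be preferable when the label differences follow a known pattern; for differences that cannot be described by a rule, a string-distance join is the more general option.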

1 Answer

This works:

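library(dplyr)      # provides the %>% pipe
library(fuzzyjoin)
# Fuzzy-match on bin1; column names shared by both frames come back with .x/.y suffixes
pqr <- pqr %>% stringdist_inner_join(abc, by = c(bin1 = "bin1"))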
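
The join above returns suffixed columns (`bin1.x`, `bin1.y`, `bin2.x`, `bin2.y`) rather than the four-column layout asked for in the question. A minimal sketch of the cleanup step follows, assuming the dummy data from the question; `joined` and `result` are names introduced here for illustration, and `max_dist = 2` simply spells out fuzzyjoin's default threshold.

```r
library(dplyr)
library(fuzzyjoin)

abc <- data.frame(bin1 = c("0-25K", "25K-50K", "50K+"), group1 = c(1, 1, 2),
                  bin2 = c("0-25", "25-50", "50+"), group2 = c(1, 2, 2))
pqr <- data.frame(bin1 = c("1_0-25K", "2_25K-50K", "3_50K+"),
                  bin2 = c("0,25", "25,50", "50+"))

# Fuzzy join on bin1: rows match when the string distance is at most max_dist
joined <- stringdist_inner_join(pqr, abc, by = "bin1", max_dist = 2)

# Keep pqr's labels (.x) and the group columns from abc
result <- joined %>%
  transmute(bin1 = bin1.x, group1, bin2 = bin2.x, group2)
```

The threshold matters: each `pqr` label differs from its `abc` counterpart by a two-character prefix, so a smaller `max_dist` would drop every match, while a larger one risks pairing unrelated bins on real data.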
Bruce Wayne