
I have two data frames in R, df1 and df2, as follows:

**df1**
Cust_id   Cust_name   Cust_dob     Cust_address
1         Andrew      10/11/1990   New York
2         Dillain     01/02/1970   San Francisco
3         Alma        07/11/1985   Miami
4         Wesney      21/10/1979   New York
5         Kiko        10/12/1994   Miami

**df2**
Cust_address    Latitude    Longitude
New York        40.7128     74.0060
San Francisco   37.7749     122.4194
Miami           25.7617     80.1918
Texas           31.9686     99.9018
Dallas          32.7767     96.7970

I want to join these data frames so that each row of df1 picks up the Latitude and Longitude from the df2 row with the same Cust_address, giving the following result:

**df3**
Cust_id Cust_name   Cust_dob    Cust_address    Latitude    Longitude
1       Andrew      10/11/1990   New York       40.7128    74.0060
2       Dillain     01/02/1970   San Francisco  37.7749    122.4194
3       Alma        07/11/1985   Miami          25.7617    80.1918
4       Wesney      21/10/1979   New York       40.7128    74.0060
5       Kiko        10/12/1994   Miami          25.7617    80.1918

I have tried using joins but cannot get the result I want. I am new to R and would really appreciate some help. Thank you very much. This is what I have tried:

df3 = merge(x=df1,y=df2,by="Cust_address",all=TRUE)
Gregor Thomas
  • What do you get when you run your current code? Is it an error of some sort, or does the match not happen properly? – akshaymoorthy Aug 22 '22 at 14:59
  • @akshaymoorthy the matching is not happening as expected –  Aug 22 '22 at 15:00
  • `merge` should work just fine. My best guess is you have some issue with column classes: please make your example more reproducible by sharing your data using `dput`, that is `dput(df1[1:5, ])`, etc. `dput` will include all class and structure information and be copy/pasteable. – Gregor Thomas Aug 22 '22 at 15:04
  • @akrun thank you very much. Yes, it had spaces after the data. It's working now –  Aug 22 '22 at 15:09
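
For anyone hitting the same symptom, here is a minimal sketch of how stray whitespace in a join key can be spotted before merging. The data below is a hypothetical reconstruction: the comments only say the data "had spaces", so which values were padded is an assumption.

```r
# Hypothetical reconstruction: assume the addresses in df1 carry a trailing space.
df1 <- data.frame(
  Cust_id      = 1:3,
  Cust_address = c("New York ", "San Francisco ", "Miami "),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Cust_address = c("New York", "San Francisco", "Miami"),
  Latitude     = c(40.7128, 37.7749, 25.7617),
  stringsAsFactors = FALSE
)

# dput() prints each value in quotes, so trailing spaces become visible.
dput(df1$Cust_address)

# Keys in df1 that have no exact match in df2.
unmatched <- setdiff(df1$Cust_address, df2$Cust_address)

# TRUE for every value that changes when leading/trailing whitespace is removed.
has_padding <- df1$Cust_address != trimws(df1$Cust_address)
```

If `unmatched` is non-empty even though the values look identical when printed, `has_padding` is a quick way to check whether whitespace is the cause.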

1 Answer

We could use inner_join()

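inner_join() keeps only the rows of x that have a matching key in y. Here every customer in df1 gets its coordinates, and the df2 rows without a matching customer (Texas, Dallas) are dropped.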

library(dplyr)

df3 <- inner_join(df1, df2, by="Cust_address")

  Cust_id Cust_name   Cust_dob  Cust_address Latitude Longitude
1       1    Andrew 10/11/1990      New York  40.7128   74.0060
2       2   Dillain 01/02/1970 San Francisco  37.7749  122.4194
3       3      Alma 07/11/1985         Miami  25.7617   80.1918
4       4    Wesney 21/10/1979      New York  40.7128   74.0060
5       5      Kiko 10/12/1994         Miami  25.7617   80.1918
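
Since the underlying problem turned out to be whitespace in the data, here is a base R sketch of the whole fix: trim the key column, then join. The values come from the question, but the padded entries are an assumption for illustration, and Cust_dob is left out for brevity. Note that `all = TRUE` in the original `merge()` call requests a full outer join, which also returns the unmatched df2 rows; `all.x = TRUE` keeps only the rows of df1.

```r
# Data from the question; the trailing spaces on some addresses are assumed.
df1 <- data.frame(
  Cust_id      = 1:5,
  Cust_name    = c("Andrew", "Dillain", "Alma", "Wesney", "Kiko"),
  Cust_address = c("New York ", "San Francisco", "Miami ", "New York", "Miami"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Cust_address = c("New York", "San Francisco", "Miami", "Texas", "Dallas"),
  Latitude     = c(40.7128, 37.7749, 25.7617, 31.9686, 32.7767),
  Longitude    = c(74.0060, 122.4194, 80.1918, 99.9018, 96.7970),
  stringsAsFactors = FALSE
)

# Remove leading/trailing whitespace from the key column in both data frames.
df1$Cust_address <- trimws(df1$Cust_address)
df2$Cust_address <- trimws(df2$Cust_address)

# all = TRUE is a full outer join: Texas and Dallas come back as rows of NAs.
full <- merge(df1, df2, by = "Cust_address", all = TRUE)

# all.x = TRUE is a left join: every row of df1, matched where possible.
left <- merge(df1, df2, by = "Cust_address", all.x = TRUE)

# merge() sorts by the key and moves it to the first column, so restore
# the original row and column order.
df3 <- left[order(left$Cust_id),
            c("Cust_id", "Cust_name", "Cust_address", "Latitude", "Longitude")]
```

With dplyr, `left_join(df1, df2, by = "Cust_address")` after the same `trimws()` step should give the same rows while keeping df1's original order.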
TarJae
  • Thank you very much! It is working. My data had spaces, so I trimmed those. –  Aug 22 '22 at 15:11