0

I want to merge 2 dataframes where each have same column name Date_Time but different row counts. I want to merge these depend upon Date_Time and here I want all the rows should include if the column value(Date_Time) is equal i've to get oriented values(cmpny_name,Price) else i've to get NA. Ex Dateset is, df1:

Cmny_Name     Date_Time      Price 
A          1/1/2015 13:27    1083
B          1/1/2015 13:28    1084
C          1/1/2015 13:29    1053
D          1/1/2015 13:31    1063
E          1/1/2015 13:33    1033

df2:

Cmny_Name1     Date_Time      Price 
A          1/1/2015 13:27    1043
A          1/1/2015 13:28    1053
A          1/1/2015 13:29    1054
A          1/1/2015 13:35    1084

My Expected Ouput(df3) is

Date_Time      Cmny_Name    Price   Cmny_Name1     Price   
1/1/2015 13:27   A          1083        A          1043
1/1/2015 13:28   B          1084        A          1053
1/1/2015 13:29   C          1053        A          1054
1/1/2015 13:31   D          1063        A           NA
1/1/2015 13:33   E          1033        A           NA
1/1/2015 13:35   NA         NA          A          1084
neilfws
  • 32,751
  • 5
  • 50
  • 63

2 Answers2

0
df3 <- merge(df1, df2, by = "Date_Time", all = TRUE)

or a dplyr way:

library(dplyr)
df3 <- df1 %>% 
  full_join(df2, by = "Date_Time")
neilfws
  • 32,751
  • 5
  • 50
  • 63
  • Thank u neilfws . . Actually I tried this df3 <- merge(df1, df2, by = "Date_Time", all = TRUE) . It is working fine with small size data which i posted here. But In my case am using 4484683 rows in df1 and 92768 rows in df2 . . The problem am getting here is, NA is printed in df2 values if df1 value is printed. and NA is printed in df1 values if df2 value is appear. – Yuvan Ramez Apr 27 '17 at 05:38
  • Thank u neilfws . . Actually I tried this df3 <- merge(df1, df2, by = "Date_Time", all = TRUE) . It is working fine with small size data which i posted here. But In my case am using 4484683 rows in df1 and 92768 rows in df2 . . The problem am getting here is, NA is printed in df2 values if df1 value is printed. and NA is printed in df1 values if df2 value is appear.Date_Time Cmny_Name Price Cmny_Name1 Price 1/1/2015 13:27 A 1083 NA NA 1/1/2015 13:35 NA NA A 1084 – Yuvan Ramez Apr 27 '17 at 05:49
0

You can merge using the base function merge:

## Your data:
df1 <- structure(list(Cmny_Name = c("A", "B", "C", "D", "E"), 
                Date_Time = c("1/1/2015 13:27", "1/1/2015 13:28", 
                        "1/1/2015 13:29", "1/1/2015 13:31", 
                        "1/1/2015 13:33"), 
                Price = c(1083L, 1084L, 1053L, 1063L, 1033L)), 
        .Names = c("Cmny_Name", "Date_Time", "Price"), 
        class = "data.frame", row.names = c(NA, -5L))
df2 <- structure(list(Cmny_Name1 = c("A", "A", "A", "A"), 
                Date_Time = c("1/1/2015 13:27", "1/1/2015 13:28", 
                        "1/1/2015 13:29", "1/1/2015 13:35"), 
                Price = c(1043L, 1053L, 1054L, 1084L)), 
        .Names = c("Cmny_Name1", "Date_Time", "Price"), 
        class = "data.frame", row.names = c(NA, -4L))

## Merging the two data frames:
df3 <- merge(x = df1, y = df2, by = "Date_Time", all = TRUE)

df3
#        Date_Time Cmny_Name Price.x Cmny_Name1 Price.y
# 1 1/1/2015 13:27         A    1083          A    1043
# 2 1/1/2015 13:28         B    1084          A    1053
# 3 1/1/2015 13:29         C    1053          A    1054
# 4 1/1/2015 13:31         D    1063       <NA>      NA
# 5 1/1/2015 13:33         E    1033       <NA>      NA
# 6 1/1/2015 13:35      <NA>      NA          A    1084

Or if you want you could e.g. use the join-functions from the dplyr package:

df3 <- dplyr::full_join(df1, df2, by = "Date_Time")

df3
#   Cmny_Name      Date_Time Price.x Cmny_Name1 Price.y
# 1         A 1/1/2015 13:27    1083          A    1043
# 2         B 1/1/2015 13:28    1084          A    1053
# 3         C 1/1/2015 13:29    1053          A    1054
# 4         D 1/1/2015 13:31    1063       <NA>      NA
# 5         E 1/1/2015 13:33    1033       <NA>      NA
# 6      <NA> 1/1/2015 13:35      NA          A    1084

Note: As the resulting data frames need to have unique column names, the columns Price are renamed Pice.x and Price.y during the merge.

ikop
  • 1,760
  • 1
  • 12
  • 24
  • Thank u ikop . . Actually I tried this df3 <- merge(df1, df2, by = "Date_Time", all = TRUE) . It is working fine with small size data which i posted here. But In my case am using 4484683 rows in df1 and 92768 rows in df2 . . The problem am getting here is, NA is printed in df2 values if df1 value is printed. and NA is printed in df1 values if df2 value is appear.Date_Time Cmny_Name Price Cmny_Name1 Price 1/1/2015 13:27 A 1083 NA NA 1/1/2015 13:35 NA NA A 1084 – Yuvan Ramez Apr 27 '17 at 05:50