0

I need to amend the dataframe "DF1" by matching its 1st (and only) column with 2nd column of "DF2" and printing the matched column by preserving the order of rownames in DF1. I also need to replace the non-matching rows with 0. These are two examples of the data frames I have:

"DF1"

Ccd
Kkl
Sop
Mnn
Msg
Xxy
Zxz
Ccd
Msg

"DF2"

3   Ab
5   Abc
5   Ccd
9   Kkl
5   Msg
13  Sop
19  Klj

Code

read.table("a.txt")->DF1
read.table("b.txt")->DF2
colnames(DF1)<-c("b")
colnames(DF2)<-c("a", "b")
DF3 <- merge(DF1,DF2, by="b", all.x=TRUE) #
DF3$a[is.na(DF3$a)] <- 0 #substitute NA with 0

Output I get from above code is:

 b  a
Ccd  5
Ccd  5
Kkl  9
Mnn  0
Msg  5
Msg  5
Sop 13
Xxy  0
Zxz  0

Output I actually need is:

Ccd  5
Kkl  9
Sop  13
Mnn  0
Msg  5
Xxy  0
Zxz  0
Ccd  5
Msg  5
Vinay
  • 467
  • 2
  • 7
  • 17

1 Answers1

1

With data.table, you can do this:

library(data.table)
setDT(df2)[setDT(df1),,on="b"][is.na(a), a:=0][]

Output:

    a   b
1:  5 Ccd
2:  9 Kkl
3: 13 Sop
4:  0 Mnn
5:  5 Msg
6:  0 Xxy
7:  0 Zxz
8:  5 Ccd
9:  5 Msg

Or with dplyr:

library(dplyr)
left_join(df1,df2, by="b") %>% mutate(a=if_else(is.na(a),0,as.double(a)))

Output:

     b  a
1: Ccd  5
2: Kkl  9
3: Sop 13
4: Mnn  0
5: Msg  5
6: Xxy  0
7: Zxz  0
8: Ccd  5
9: Msg  5

Input:

df1 <- structure(list(b = c("Ccd", "Kkl", "Sop", "Mnn", "Msg", "Xxy", 
"Zxz", "Ccd", "Msg")), row.names = c(NA, -9L), class = "data.frame")

df2 <- structure(list(a = c(3L, 5L, 5L, 9L, 5L, 13L, 19L), b = c("Ab", 
"Abc", "Ccd", "Kkl", "Msg", "Sop", "Klj")), row.names = c(NA, 
-7L), class = "data.frame")
langtang
  • 22,248
  • 1
  • 12
  • 27