Hi, I am using nested for loops to find compatible blood types in two datasets. My datasets:

IDR = c(seq(1,4))
BTR = c("A","B","AB","O")
data_R = data.frame(IDR,BTR,stringsAsFactors=FALSE)

IDD = c(seq(1,8))
BTD = c(rep("A", each=2),rep("B", each=2),rep("AB", each=2),rep("O", each=2))
WD = c(rep(0.25, each=2),rep(0.125, each=2),rep(0.125, each=2),rep(0.5, each=2))
data_D = data.frame(IDD,BTD,WD,stringsAsFactors=FALSE)

# data_R

  IDR BTR
1   1   A
2   2   B
3   3  AB
4   4   O

# data_D

  IDD BTD    WD
1   1   A 0.250
2   2   A 0.250
3   3   B 0.125
4   4   B 0.125
5   5  AB 0.125
6   6  AB 0.125
7   7   O 0.500
8   8   O 0.500

What I am trying to do is verify, for each row of data_R, that data_D contains a compatible blood type. For instance, if BTR is AB, I would like to print all the values of WD in data_D (because AB is compatible with A, B, AB and O). If BTR is A, I would like to print only the values of WD in data_D that correspond to A and O. If BTR is B, I would like to print only the values of WD that correspond to B and O. Finally, if BTR is O, I would like to print only the values of WD that correspond to O.

Here is the code I wrote, but the output does not contain the results I wanted:

for (i in 1:nrow(data_R)) {
  for (j in 1:nrow(data_D)) {
    if(BTR[i] =="AB"){
      if(BTD[j]=="A" || BTD[j]=="B" || BTD[j]=="AB" || BTD[j]=="O"){
        output=as.vector(WD)
      }
    }else if(BTR[i] =="A"){
      if(BTD[j]=="A" || BTD[j]=="O"){
        output=as.vector(WD)
      }
    }else if(BTR[i] =="B"){
      if(BTD[j]=="B" || BTD[j]=="O"){
        output=as.vector(WD)
      }
      
    }else if(BTR[i] =="O"){
      if(BTD[j] =="O"){
        output=as.vector(WD)
      }
      
    }
    
  }
  
}
output

And this is the output I got: [1] 0.250 0.250 0.250 0.250 0.125 0.125 0.500 0.500

This output is wrong. I would appreciate any help fixing the problem and producing a more readable output that takes information from both datasets, like:

   BTR BTD output
1    A   A  0.250
2    A   A  0.250
3    A   O  0.500
4    A   O  0.500
5    B   B  0.125
6    B   B  0.125
7    B   O  0.500
8    B   O  0.500
9   AB   A  0.250
10  AB   A  0.250
11  AB   B  0.125
12  AB   B  0.125
13  AB  AB  0.125
14  AB  AB  0.125
15  AB   O  0.500
16  AB   O  0.500
17   O   O  0.500
18   O   O  0.500

I apologize in advance if my question is long; I just want to make sure I explained it well. Thank you in advance for your help.

Janet

1 Answer

You just need two merges and an intermediate data.frame:

compatible <- data.frame(
  BTR = c(rep("AB", 4),     rep("A", 2), rep("B", 2), "O"),
  BTD = c("AB","A","B","O", "A","O",     "B","O",     "O")
)
compatible
#   BTR BTD
# 1  AB  AB
# 2  AB   A
# 3  AB   B
# 4  AB   O
# 5   A   A
# 6   A   O
# 7   B   B
# 8   B   O
# 9   O   O
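If typing the pairs by hand feels error-prone, the same lookup table can be generated from a named list of rules. This is only a sketch: the names `rules` and `compatible2` are made up for illustration, and it relies on base R's `stack()` to turn the list into a long two-column data.frame.

```r
# Hypothetical rule list: recipient type -> donor types it can receive from
rules <- list(
  AB = c("AB", "A", "B", "O"),
  A  = c("A", "O"),
  B  = c("B", "O"),
  O  = "O"
)

# stack() returns a data.frame with columns `values` (donor) and `ind` (recipient)
compatible2 <- stack(rules)
names(compatible2) <- c("BTD", "BTR")

# `ind` comes back as a factor; convert it so it merges cleanly with character columns
compatible2$BTR <- as.character(compatible2$BTR)

# Put the columns in the same order as the hand-written table
compatible2 <- compatible2[, c("BTR", "BTD")]
compatible2
```

The result should hold the same nine recipient/donor pairs as `compatible` above, so it can be used in the merges in its place.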

The first step provides all possible donors for each recipient:

tmp <- merge(data_R, compatible, by = "BTR", all.x = TRUE, sort = FALSE)
tmp
#   BTR IDR BTD
# 1   A   1   A
# 2   A   1   O
# 3   B   2   B
# 4   B   2   O
# 5  AB   3  AB
# 6  AB   3   A
# 7  AB   3   B
# 8  AB   3   O
# 9   O   4   O

The second merge brings in the available donors:

merge(tmp, data_D, by = "BTD")
#    BTD BTR IDR IDD    WD
# 1    A   A   1   1 0.250
# 2    A   A   1   2 0.250
# 3    A  AB   3   1 0.250
# 4    A  AB   3   2 0.250
# 5   AB  AB   3   5 0.125
# 6   AB  AB   3   6 0.125
# 7    B   B   2   3 0.125
# 8    B   B   2   4 0.125
# 9    B  AB   3   3 0.125
# 10   B  AB   3   4 0.125
# 11   O   B   2   7 0.500
# 12   O   B   2   8 0.500
# 13   O   O   4   7 0.500
# 14   O   O   4   8 0.500
# 15   O   A   1   7 0.500
# 16   O   A   1   8 0.500
# 17   O  AB   3   7 0.500
# 18   O  AB   3   8 0.500

Note that the order is different but your expected output is in there.
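If the row order and column layout of the expected output matter, the merged result can be sorted and trimmed afterwards. The sketch below rebuilds the example data so it runs on its own; `res` is just an illustrative name, and ordering by `IDR` then `IDD` is an assumption that happens to reproduce the order shown in the question.

```r
# Rebuild the example data from the question
data_R <- data.frame(IDR = 1:4, BTR = c("A", "B", "AB", "O"),
                     stringsAsFactors = FALSE)
data_D <- data.frame(IDD = 1:8,
                     BTD = rep(c("A", "B", "AB", "O"), each = 2),
                     WD  = rep(c(0.25, 0.125, 0.125, 0.5), each = 2),
                     stringsAsFactors = FALSE)
compatible <- data.frame(
  BTR = c(rep("AB", 4),     rep("A", 2), rep("B", 2), "O"),
  BTD = c("AB","A","B","O", "A","O",     "B","O",     "O"),
  stringsAsFactors = FALSE
)

# Same two merges as above, chained
res <- merge(merge(data_R, compatible, by = "BTR"), data_D, by = "BTD")

# Sort by recipient ID, then donor ID, and keep only the requested columns
res <- res[order(res$IDR, res$IDD), c("BTR", "BTD", "WD")]
names(res)[3] <- "output"
rownames(res) <- NULL
res
```

Sorting on the ID columns rather than on the blood-type strings avoids alphabetical ordering, which would put AB before B.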

While this uses base R, other packages provide more control over merging. I suggest you read "How to join (merge) data frames (inner, outer, left, right)" and https://stackoverflow.com/a/6188334/3358272 to learn about joins (they are a very powerful data-manipulation mechanism!), and consider dplyr or data.table to streamline this flow:

library(dplyr)
left_join(data_R, compatible, by = "BTR") %>%
  left_join(data_D, by = "BTD")

library(data.table)
data_RDT <- as.data.table(data_R)
data_DDT <- as.data.table(data_D)
compatible <- as.data.table(compatible)
compatible[data_RDT, on = .(BTR)][data_DDT, on = .(BTD), allow.cartesian = TRUE]
r2evans
  • Thank you very much @r2evans! That was so clean and **efficient**. I really like the `left_join` function you used; it is going to help me with a lot of other dataset-matching tasks. – Janet Sep 28 '20 at 06:32
  • I find the *concept* of merge/join to be both foreign at first, and then extremely powerful when you start to understand it and are able to capitalize on it. And then it becomes natural and you wonder how you survived without it. – r2evans Sep 28 '20 at 06:35