0

I have a dataframe with many rows that have to be matched by various conditions. So far I am struggling with that and slowly losing hope that I'll ever be able to solve it. Here is an example of my dataframe:

  No Test  Reference Value
  5 OF    3            55
  7 EPM   1            33
  5 H1    -            23
  7 H3    -            22
  5 R1    -            15
  7 R3    -            28
  5 H3    -            60
  7 H1    -            33
  5 R3    -            21
  7 R1    -            18
  5 T     -            20
  7 T     -            17

My goal is to get the dataframe with "Test" (OF or EPM) values matched with other values (from different rows) in new columns (H, R, and T). Matched data have to be selected based on the condition in "No" and "Reference" columns. Values (1 or 3) of "Reference" column refer to row ("Test" column: 1 = H1 and R1, 3 = H3 and R3). Row with T in the "Test" column is matched only based on the value in "No" column. Here is an example result:

  No Test  Reference Value     H     R     T
  5 OF            3    55    60    21    20
  7 EPM           1    33    33    18    17

I appreciated any help you can provide. Thank you so much!!!

Jena Tuma
  • 13
  • 4
  • 1
    Hi Jena. If you add a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) you could make it easier for others to find and test an answer to your question. That way you can help others to help you! – dario Jul 26 '20 at 07:51

1 Answers1

0

I have recreated a scenario trying to emulate the issue you are facing to the best of my understanding of your situation. Hopefully, it should at least put you on the path to the answer you are looking for.

You can copy paste the below code in your R console to go through all the steps.

library(dplyr)

## Reproduce the dataframe shown in your example
df <- cbind.data.frame(No = c(5,7,5,7,5,7,5,7,5,7),Test = c('H1','H3','R1','R3','H3','H1','R3','R1','T','T'),Value = c(23,22,15,28,60,33,21,18,20,17))

## Create the final dataframe that will contain all the sorted values
df2 <- cbind.data.frame(No = c(5,7), Test = c('OF','EPM'), Reference = c(3,1), Value = c(55,33), H = c(0,0), R = c(0,0), T = c(0,0) )

## Filling the final table <-- ANSWER TO YOUR QUESTION

for (row in 1:nrow(df))
{
  if(df[row,"No"] == 5){
    if(grepl("3", df[row,"Test"]) |grepl("T", df[row,"Test"]) ){
      df2[1,substr(df[row,"Test"],1,1)] = df[row,"Value"]
    }
  }
  else if(df[row,"No"] == 7){
    if(grepl("1", df[row,"Test"])|grepl("T", df[row,"Test"]) ){
      df2[2,substr(df[row,"Test"],1,1)] = df[row,"Value"]
    }
  }
}

# We now have a new table fully filled according to the logic specified in your example.
babkouew
  • 65
  • 5
  • Thank you very much for your answer!!! The problem is that my original data have more than 500 "No" subjects. Each "Test" can be matched with a variable value from "Reference" column. Nevertheless, you gave me some new idea how to deal with. Once more thank you. – Jena Tuma Jul 29 '20 at 05:56