I have a dataset on which I want to run a multilevel analysis. For that reason I have two rows for every patient, and a `couple` column containing 1's and 2's (1 = patient, 2 = partner of patient).

I have variables with date of birth and age for both patient and partner. They are in different columns, so both sets of values now sit on the same row. What I want is code that does the following:

if mydata$couple == 2, then replace mydata$dateofbirthpatient with mydata$dateofbirthpartner

And that for every row. Since I have multiple variables that I want to replace, it would be lovely if I could get this in a loop and just 'add' variables that I want to replace.

What I tried so far:

 mydf_longer <- if (mydf_long$couple == 2) {
  mydf_long$pgebdat <- mydf_long$prgebdat
 } 

Of course this doesn't work, but simply stated this is what I want.

I also started with this code, following the example in "By row, replace values equal to value in specified column", but I don't know how to finish it:

mydf_longer[6:7][mydf_longer[,1:4]==mydf_longer[2,2]] <- 

Any ideas? Let me know if you need more information.

Example of data:

#     id couple groep_MNC zkhs fbeh    pgebdat    p_age pgesl   prgebdat pr_age
# 1    3      1         1    1    1 1955-12-01 42.50000     1       <NA>     NA
# 1.1  3      2         1    1    1 1955-12-01 42.50000     1       <NA>     NA
# 2    5      1         1    1    1 1943-04-09 55.16667     1 1962-04-18   36.5
# 2.1  5      2         1    1    1 1943-04-09 55.16667     1 1962-04-18   36.5
# 3    7      1         1    1    1 1958-04-10 40.25000     1       <NA>     NA
# 3.1  7      2         1    1    1 1958-04-10 40.25000     1       <NA>     NA

mydf_long <- structure(
  list(id = c(3L, 3L, 5L, 5L, 7L, 7L),
       couple = c(1L, 2L, 1L, 2L, 1L, 2L),
       groep_MNC = c(1L, 1L, 1L, 1L, 1L, 1L),
       zkhs = c(1L, 1L, 1L, 1L, 1L, 1L),
       fbeh = c(1L, 1L, 1L, 1L, 1L, 1L),
       pgebdat = structure(c(-5145, -5145, -9764, -9764, -4284, -4284), class = "Date"),
       p_age = c(42.5, 42.5, 55.16667, 55.16667, 40.25, 40.25),
       pgesl = c(1L, 1L, 1L, 1L, 1L, 1L),
       prgebdat = structure(c(NA, NA, -2815, -2815, NA, NA), class = "Date"),
       pr_age = c(NA, NA, 36.5, 36.5, NA, NA)),
  .Names = c("id", "couple", "groep_MNC", "zkhs", "fbeh", "pgebdat",
             "p_age", "pgesl", "prgebdat", "pr_age"),
  row.names = c("1", "1.1", "2", "2.1", "3", "3.1"),
  class = "data.frame"
)
rawr
Hannie
  • Try mydf_long$pgebdat <- ifelse(mydf_long$couple == 2, mydf_long$prgebdat, mydf_long$pgebdat) – Katie Sep 05 '17 at 14:36
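One caveat with the `ifelse()` suggestion, sketched on toy vectors (the values below are illustrative, not the real data): `ifelse()` does not carry over the attributes of its `yes`/`no` arguments, so `Date` columns come back as plain numbers. Converting back with `as.Date()` or using indexed assignment are two ways around that.

```r
# Toy vectors; names mirror the question, values are illustrative only.
couple   <- c(1L, 2L, 1L, 2L)
pgebdat  <- as.Date(c("1955-12-01", "1955-12-01", "1943-04-09", "1943-04-09"))
prgebdat <- as.Date(c(NA, NA, "1962-04-18", "1962-04-18"))

# ifelse() drops the Date class: the result is a bare numeric vector
# holding days since 1970-01-01.
via_ifelse <- ifelse(couple == 2, prgebdat, pgebdat)
class(via_ifelse)

# Workaround 1: restore the class afterwards.
restored <- as.Date(via_ifelse, origin = "1970-01-01")

# Workaround 2: indexed assignment, which keeps the Date class throughout.
is_partner <- couple == 2
via_index <- pgebdat
via_index[is_partner] <- prgebdat[is_partner]
class(via_index)
```

Note that partner rows with a missing partner date end up as `NA` either way, which matches the "replace with the partner's value" request.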

1 Answer

The following for loop should work if you only want to change the values based on a condition:

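# Walk over the rows; seq_len() is safe even if mydata has zero rows
for (i in seq_len(nrow(mydata))) {
  # Partner rows (couple == 2) take the partner's date of birth
  if (mydata$couple[i] == 2) {
    mydata$pgebdat[i] <- mydata$prgebdat[i]
  }
}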

Or, as suggested by @lmo, the following vectorised assignment does the same thing and is faster:

mydata$pgebdat[mydata$couple == 2] <- mydata$prgebdat[mydata$couple == 2]
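
The question also asks for a loop where more variables can simply be added. A minimal sketch of one way to do that, assuming the patient and partner columns can be listed as pairs; `swap_map` and `is_partner` are hypothetical names, and the small data frame is a stand-in for the real one:

```r
# Small stand-in for the real data, using column names from the question.
mydata <- data.frame(
  id       = c(3L, 3L, 5L, 5L),
  couple   = c(1L, 2L, 1L, 2L),
  pgebdat  = as.Date(c("1955-12-01", "1955-12-01", "1943-04-09", "1943-04-09")),
  p_age    = c(42.5, 42.5, 55.16667, 55.16667),
  prgebdat = as.Date(c(NA, NA, "1962-04-18", "1962-04-18")),
  pr_age   = c(NA, NA, 36.5, 36.5)
)

# Hypothetical lookup: names are the patient columns to overwrite,
# values are the partner columns to copy from. Add pairs as needed.
swap_map <- c(pgebdat = "prgebdat", p_age = "pr_age")

# Row positions of the partner rows; which() drops any NA in couple.
is_partner <- which(mydata$couple == 2)

for (target in names(swap_map)) {
  partner_col <- swap_map[[target]]
  # Overwrite only the partner rows; column classes (e.g. Date) are kept.
  mydata[[target]][is_partner] <- mydata[[partner_col]][is_partner]
}
```

Patient rows are left untouched, and partner rows without partner data become `NA` in the overwritten columns.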
Sagar
  • It seems `mydata$pgebdat[mydata$couple == 2] <- mydata$prgebdat[mydata$couple == 2]` would produce equivalent results and be much faster. If the line gets long, you could split it in two: first `repVec <- mydata$couple == 2`, then use `repVec` in the line above. – lmo Sep 05 '17 at 15:10
  • @lmo - Agreed. I added your solution above. – Sagar Sep 05 '17 at 15:17
  • Thanks! It is working. If I want to change a range of adjacent columns, this does not work: `mydf_long[125:178][i] <- mydf_long[418:471][i]`. I'm probably indexing wrong. I also tried `mydf_long[,125:178][i] <- mydf_long[,418:471][i]`, but that does not work either. Does anyone know how to index this correctly? – Hannie Sep 05 '17 at 15:32
  • @HannekeLettinga - As far as I know, if you are specifying a range of columns you don't have to call that in a `for` loop with an iterator `i`. A simple `mydf_long[,125:178] <- mydf_long[,418:471]` should do it, as long as the number of columns in the given range is correct. – Sagar Sep 05 '17 at 15:37
  • @Sagar, Yes but then it replaces the complete range of columns - but I only want R to replace the range of columns for the partner... So when couple == 2, replace row 125:178 for the values that are in that row on 418:471... – Hannie Sep 05 '17 at 15:44
  • @HannekeLettinga - I think you mean you want to replace columns `125:178` with `418:471` values only where `mydf_long$couple == 2`. If that's the case, try this: `mydf_long[,125:178][mydf_long[, 2] == 2,] <- mydf_long[,418:471][mydf_long[, 2] == 2,]` – Sagar Sep 05 '17 at 17:52
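
For reference, the block replacement from the last comment can also be written with a single `[rows, cols]` subscript, which should give the same result. This is only a sketch on a toy data frame: `df`, `patient_cols`, `partner_cols` and the column names are made up for illustration, standing in for columns `125:178` and `418:471` of the real data.

```r
# Toy data frame; columns 3:4 play the role of the patient block and
# columns 5:6 the partner block.
df <- data.frame(
  id        = c(1L, 1L, 2L, 2L),
  couple    = c(1L, 2L, 1L, 2L),
  a_patient = c(10, 10, 20, 20),
  b_patient = c(1.5, 1.5, 2.5, 2.5),
  a_partner = c(11, 11, 21, 21),
  b_partner = c(1.6, 1.6, 2.6, 2.6)
)

patient_cols <- 3:4
partner_cols <- 5:6

# Both ranges must have the same width, and the columns must line up in order.
stopifnot(length(patient_cols) == length(partner_cols))

# Select the partner rows once, then copy the whole block for those rows only.
rows <- which(df$couple == 2)
df[rows, patient_cols] <- df[rows, partner_cols]
```

Values are copied by position, not by name, so the first patient column receives the first partner column, and so on.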