I have a data frame called M, where each row only has two possible letters:

M <- structure(list(id1 = c("AA", "AB", "AA", "AC"), id2 = c("AA", 
"AB", "AA", "CC"), id3 = c("AA", "AA", "AB", "AC"), id4 = c("AA", 
"AB", "AB", "AA"), id5 = c("AA", "BB", "AA", "CC"), id6 = c("AA", 
"AB", "BB", "CC"), id7 = c("AA", "AB", "BB", "CC"), id8 = c("AA", 
"AB", "BB", "AC"), id9 = c("AA", "AB", "AB", "AA")), .Names = c("id1", 
"id2", "id3", "id4", "id5", "id6", "id7", "id8", "id9"), class = "data.frame", row.names = c(NA, 
-4L))
M
#   id1 id2 id3 id4 id5 id6 id7 id8 id9
# 1  AA  AA  AA  AA  AA  AA  AA  AA  AA
# 2  AB  AB  AA  AB  BB  AB  AB  AB  AB
# 3  AA  AA  AB  AB  AA  BB  BB  BB  AB
# 4  AC  CC  AC  AA  CC  CC  CC  AC  AA

I need to replace the letters so that, for each row, the first one is assigned 0 and the second 1.
So for row 2, A=0, B=1; for row 4, A=0, C=1.

I've been told to do it with the for loop below, but it doesn't seem to work: I only get results back for one row. Can anyone tell me what I'm doing wrong?

This is my code:

for (i in 1:500)
{
   results= M[i,]
   hold=unique(unlist(strsplit(unique(results),"")))
   hold=hold[is.na(hold)==F]
   sort(hold)
   results=gsub(hold[1],"0",results)
   results=gsub(hold[2],"1",results)
}
Cath
E_Schyler
  • You overwrite results each time. – Eli Korvigo Apr 27 '16 at 11:39
  • who told you to do it this way ? – Cath Apr 27 '16 at 11:40
  • thanks, so how would I go about not doing that – E_Schyler Apr 27 '16 at 11:40
  • Cath, a teacher at uni – E_Schyler Apr 27 '16 at 11:40
  • For starters, accumulate `results` somewhere outside the loop. FYI, `apply` is your friend. – Eli Korvigo Apr 27 '16 at 11:41
  • Welcome to Stack Overflow! Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. – zx8754 Apr 27 '16 at 11:41
  • please give us a sample of data and what you want to achieve. You can keep your loop for "what I've already tried" ;-) – Cath Apr 27 '16 at 11:41
  • I wasn't really sure how to put all that code into apply, I couldn't find any similar examples which was why I used a loop. – E_Schyler Apr 27 '16 at 11:44
  • @ Cath, sorry I tried to just put up a sample but it didn't format, once I figure it out I'll try to put one up edit - Basically my sample data would look like this https://drive.google.com/file/d/0B5H0kYEI0J8qcGIyMjZGaERlUEU/view each row only has two possible letters, and the first one is assigned 0 and the second 1. so for row 1, A=0, B=1, for row 5 C=0, D=1 – E_Schyler Apr 27 '16 at 11:52
  • @E_Schyler Start with a simple loop that, say prints out the current iteration: `for(i in 1:500) {print(i)}`. Then gradually add complexity. Some of the stuff you are doing is pretty complicated. – lmo Apr 27 '16 at 11:53

1 Answer

You can either define results prior to your loop and modify your loop to make it write in the right row of results at each turn:

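results <- as.matrix(M)   # character matrix, filled in row by row
for (i in seq_len(nrow(M))) {
   # distinct letters found in row i, in alphabetical order
   hold <- unique(unlist(strsplit(unique(results[i, ]), "")))
   hold <- sort(hold[!is.na(hold)])
   results[i, ] <- gsub(hold[1], "0", results[i, ])
   # a row may hold a single letter (row 1 of the example);
   # gsub() with an NA pattern would turn the whole row into NA
   if (length(hold) > 1) results[i, ] <- gsub(hold[2], "1", results[i, ])
}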

Or use a slightly different approach with apply and only sub/gsub (I added the condition on length(u_lett) because the first row of the example data only has 1 letter):

results <- t(apply(M, 1, 
                  function(x) {
                     u_lett <- sort(unique(c(sub("([A-Z])[A-Z]", "\\1", x), sub("[A-Z]([A-Z])", "\\1", x))))
                     x <- gsub(u_lett[1], "0", x)
                     if (length(u_lett)>1) x <- gsub(u_lett[2], "1", x)
                     x
                  }))
results
#     id1  id2  id3  id4  id5  id6  id7  id8  id9 
#[1,] "00" "00" "00" "00" "00" "00" "00" "00" "00"
#[2,] "01" "01" "00" "01" "11" "01" "01" "01" "01"
#[3,] "00" "00" "01" "01" "00" "11" "11" "11" "01"
#[4,] "01" "11" "01" "00" "11" "11" "11" "01" "00"

Or you can mix both to get a loop/sub-gsub or an apply/strsplit solution...
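
As a rough sketch of what the apply/strsplit mix could look like (this is not part of the original answer; the helper name `recode_row` is made up for illustration, and the data frame is just the question's example rebuilt with `data.frame()`):

```r
# Example data from the question (same values, built more compactly)
M <- data.frame(
  id1 = c("AA", "AB", "AA", "AC"), id2 = c("AA", "AB", "AA", "CC"),
  id3 = c("AA", "AA", "AB", "AC"), id4 = c("AA", "AB", "AB", "AA"),
  id5 = c("AA", "BB", "AA", "CC"), id6 = c("AA", "AB", "BB", "CC"),
  id7 = c("AA", "AB", "BB", "CC"), id8 = c("AA", "AB", "BB", "AC"),
  id9 = c("AA", "AB", "AB", "AA"),
  stringsAsFactors = FALSE
)

# recode_row is a hypothetical helper: it maps the alphabetically first
# letter found in a row to "0" and the second one (if present) to "1"
recode_row <- function(x) {
  # split every cell into single characters; sort() also drops any NA
  letters_in_row <- sort(unique(unlist(strsplit(x, ""))))
  x <- gsub(letters_in_row[1], "0", x, fixed = TRUE)
  if (length(letters_in_row) > 1) {
    x <- gsub(letters_in_row[2], "1", x, fixed = TRUE)
  }
  x
}

# apply() returns one column per row of M, hence the transpose
results <- t(apply(M, 1, recode_row))
results
```

On the example data this should give the same 0/1 matrix as the sub/gsub version. `fixed = TRUE` is only a precaution so that the letters are matched literally rather than as regular expressions.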

Cath
  • I know this says to avoid comments like thanks, but honestly thank you so much, that completely solved my issue. I was literally crying from stress tonight and now I can go to bed a bit less stressed. – E_Schyler Apr 27 '16 at 13:50
  • @E_Schyler you're welcome, I'm glad I contributed to a better sleep ;-). Don't hesitate to read tutorials and some Q&A on SO to develop your R skills ;-) – Cath Apr 27 '16 at 13:52
  • @E_Schyler If the answer worked for you, it would be appreciated if you accept the answer. This will give future readers a clue about the value of the solution. See also this help page: [What should I do when someone answers my question?](http://stackoverflow.com/help/someone-answers) – Jaap Apr 27 '16 at 13:58