0

I have a matrix of ones and zeros, the rownames of which are IDs of different individuals. I'm interested in the by-row order of appearance of a value of 1:

    mat <- structure(c(0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 
0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1), .Dim = c(5L, 
11L), .Dimnames = list(c("2099", "2101", "2102", "2103", "2114"
), c("MAW", "MAE", "SacMouth", "Georgiana", "Bottom_routeB", 
"Top_routeB", "RM61", "RM80_5", "RM71", "RM88", "Release")))


      MAW MAE SacMouth Georgiana Bottom_routeB Top_routeB RM61 RM80_5 RM71 RM88 Release
2099   0   0        0         0             0          0    0      0    0    0       1
2101   0   0        0         0             1          1    0      0    0    1       1
2102   1   1        1         0             0          0    0      0    0    0       1
2103   0   1        0         0             1          1    0      0    1    0       1
2114   0   0        0         0             1          1    0      1    1    1       1

For each individual, I need to extract the column name where a value of 1 appears for the first time, from left to right. For example, the output from the above example should be:

ID     colname
2099   Release
2101   Bottom_routeB
2102   MAW
2103   MAE
2114   Bottom_routeB

My first attempt was a for-loop with cumsum, then using tidyr to gather a data frame by ID, colnames, and cumsum value, then filtering out the rows where value == 1, but that didn't work on rows where a 1 is followed by one or more zeros (as in ID #2103 above), because it returns multiple rows:

df <- as.data.frame(matrix(rep(NA, length(nrow(mat)*length(ncol(mat)))),
nrow = nrow(mat), ncol = (ncol(mat))))

colnames(df) <- colnames(mat)

# fill in df dataframe:
for (i in 1:nrow(mat))
{
  df[ i, ] = cumsum(mat[i, 1:ncol(mat)]) 
}

df$TagID <- rownames(mat)

library(tidyverse)
df <- gather(df, key = Station, value = value, -TagID) 

df2 <- df %>% 
  group_by(TagID) %>% 
  mutate(laststation = Station[value == 1])  %>% 
  filter(!duplicated(TagID)) 

Error: incompatible size (3), expecting 11 (the group size) or 1 

Any help would be much appreciated. I did find this related post, and it seems like solution 3 might apply to my situation, but I couldn't adapt it successfully.

Community
  • 1
  • 1
Von
  • 156
  • 1
  • 11
  • 1
    `colnames(mat)[max.col(mat, ties.method = "first")]` – Ronak Shah Mar 23 '17 at 16:47
  • Thank you - I had never seen the `max.col` function before. If this is a duplicate question I can delete it. – Von Mar 23 '17 at 17:02
  • Thanks. For completion's sake, the final df can be gotten by: `df2 <- data.frame(cbind(ID = rownames(mat), laststation = colnames(mat)[max.col(mat, ties.method = "first")]))` – Von Mar 23 '17 at 17:13

0 Answers0