I have a matrix of ones and zeros, the rownames of which are IDs of different individuals. I'm interested in the by-row order of appearance of a value of 1:
mat <- structure(c(0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1), .Dim = c(5L,
11L), .Dimnames = list(c("2099", "2101", "2102", "2103", "2114"
), c("MAW", "MAE", "SacMouth", "Georgiana", "Bottom_routeB",
"Top_routeB", "RM61", "RM80_5", "RM71", "RM88", "Release")))
MAW MAE SacMouth Georgiana Bottom_routeB Top_routeB RM61 RM80_5 RM71 RM88 Release
2099 0 0 0 0 0 0 0 0 0 0 1
2101 0 0 0 0 1 1 0 0 0 1 1
2102 1 1 1 0 0 0 0 0 0 0 1
2103 0 1 0 0 1 1 0 0 1 0 1
2114 0 0 0 0 1 1 0 1 1 1 1
For each individual, I need to extract the column name where a value of 1 appears for the first time, from left to right. For example, the output from the above example should be:
ID colname
2099 Release
2101 Bottom_routeB
2102 MAW
2103 MAE
2114 Bottom_routeB
My first attempt was a for-loop with cumsum
, then using tidyr
to gather a data frame by ID, colnames, and cumsum
value, then filtering out the rows where value == 1
, but that didn't work on rows where a 1 is followed by one or more zeros (as in ID #2103
above), because it returns multiple rows:
df <- as.data.frame(matrix(rep(NA, length(nrow(mat)*length(ncol(mat)))),
nrow = nrow(mat), ncol = (ncol(mat))))
colnames(df) <- colnames(mat)
# fill in df dataframe:
for (i in 1:nrow(mat))
{
df[ i, ] = cumsum(mat[i, 1:ncol(mat)])
}
df$TagID <- rownames(mat)
library(tidyverse)
df <- gather(df, key = Station, value = value, -TagID)
df2 <- df %>%
group_by(TagID) %>%
mutate(laststation = Station[value == 1]) %>%
filter(!duplicated(TagID))
Error: incompatible size (3), expecting 11 (the group size) or 1
Any help would be much appreciated. I did find this related post, and it seems like solution 3 might apply to my situation, but I couldn't adapt it successfully.