I am playing around with binary data.
I have test data in columns in the following manner:
A B C D E F G H I J K L M N
-----------------------------------------------------
1 1 1 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 1 1 1 0 1 1 0 0 1 0
1 indicates that the system was on and 0 indicates that the system was off.
I have a way to summarize the runs between the on/off transitions of these systems.
For example:
For the first row, it stops working after I.
For the second row, it works from E to G, then works again in I-J and in M, but is off during the others.
I get my result in the following form (table1):
row-number   value   grp_num   num   Range
----------   -----   -------   ---   -----
1            1       1         9     A-I
1            0       2         5     J-N
2            0       1         4     A-D
2            1       2         3     E-G
2            0       3         1     H-H
2            1       4         2     I-J
2            0       5         2     K-L
2            1       6         1     M-M
2            0       7         1     N-N
The code I used is this:
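library(dplyr)
library(tidyr)    # gather()
library(tibble)   # rownames_to_column()

table1 <- test[,-c(1)] %>%                # drop the first column
  rownames_to_column() %>%
  gather(col,val,-rowname) %>%            # long format: one row per cell
  group_by(rowname) %>%
  # start a new group whenever the value differs from the previous one
  mutate(grp_num = cumsum(val != lag(val, default = -99))) %>%
  group_by(rowname,val,grp_num) %>%
  dplyr::summarise(num = n(),
                   range = paste0(first(col), "-", last(col)))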
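To make the grouping step easier to follow, here is a minimal base-R sketch of what `cumsum(val != lag(val, default = -99))` does for the second row of the example. The vector `val` is typed in by hand from the data above, and `prev` is only a stand-in for `dplyr::lag()`, so treat it as an illustration rather than part of the pipeline.

```r
# Second row of the example data (columns A to N), typed in by hand
val <- c(0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0)

# Base-R stand-in for dplyr::lag(val, default = -99): shift right by one
prev <- c(-99, head(val, -1))

# TRUE wherever the value differs from the previous one, i.e. a new run starts
starts_run <- val != prev

# The running count of run starts gives every run its own id
grp_num <- cumsum(starts_run)
print(grp_num)

# Cross-check: rle() reports the length of each run directly
run_lengths <- rle(val)$lengths
print(run_lengths)
```

The seven run ids and their lengths line up with the seven rows that table1 shows for row 2.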
My question: if my data has blank entries, how can I exclude them from being part of a group?
A B C D E F G H I J K L M N
-----------------------------------------------------
  1 1 1 1 1 1 1 1 0 0 0 0 0
        1 1 1 0 1 1 0 0 1 0
(A is blank in the first row; A to D are blank in the second row.)
The expected result is very similar, but excludes the blank values:
row-number   value   grp_num   num   Range
----------   -----   -------   ---   -----
1            1       1         8     B-I
1            0       2         5     J-N
2            1       1         3     E-G
2            0       2         1     H-H
2            1       3         2     I-J
2            0       4         2     K-L
2            1       5         1     M-M
2            0       6         1     N-N
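One direction that might work, sketched under assumptions: if the blanks are read in as NA, they could be dropped with `filter(!is.na(val))` before the runs are numbered. The data frame `test2` below is a hypothetical reconstruction of the second data set (it has no leading column to drop, unlike `test`), and `.groups = "drop"` assumes dplyr 1.0 or later.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical reconstruction of the data with blanks; blanks are assumed to be NA
test2 <- as.data.frame(rbind(
  c(NA, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0),
  c(NA, NA, NA, NA, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0)
))
names(test2) <- LETTERS[1:14]

table2 <- test2 %>%
  rownames_to_column() %>%
  gather(col, val, -rowname) %>%
  filter(!is.na(val)) %>%                 # drop blanks before numbering the runs
  group_by(rowname) %>%
  mutate(grp_num = cumsum(val != lag(val, default = -99))) %>%
  group_by(rowname, val, grp_num) %>%
  summarise(num = n(),
            range = paste0(first(col), "-", last(col)),
            .groups = "drop") %>%
  arrange(rowname, grp_num)               # order the runs as they occur in each row

print(table2)
```

This only covers blanks at the start of a row, as in the example. If a blank sat between two runs with the same value, dropping it would merge those runs into one, so that case would need separate handling.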