I have a data frame with lots of columns. For example:
sample treatment col5 col6 col7
1      a         3    0    5
2      a         1    0    3
3      a         0    0    2
4      b         0    1    1
I want to select the sample and treatment columns, plus all columns that meet both of the following conditions:
- Their value in the row where treatment == 'b' is 0.
- Their value in at least one row where treatment == 'a' is not 0.
The expected result should look like this:
sample treatment col5
1      a         3
2      a         1
3      a         0
4      b         0
Example dataframe:
structure(list(sample = 1:4,
               treatment = structure(c(1L, 1L, 1L, 2L), .Label = c("a", "b"), class = "factor"),
               col5 = c(3, 1, 0, 0),
               col6 = c(0, 0, 0, 1),
               col7 = c(5, 3, 2, 1)),
          class = "data.frame", row.names = c(NA, -4L))
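
For reference, here is a rough base R sketch of the two conditions, in case it helps make the rule concrete. The names df, id_cols, value_cols, zero_in_b and nonzero_in_a are my own placeholders, not anything from an existing package. It also assumes that if there were several rows with treatment == 'b', the value would need to be 0 in all of them; the example only has one such row, so this is a guess at the general rule.

```r
# Example data from above, assigned to the assumed name `df`.
df <- data.frame(
  sample = 1:4,
  treatment = factor(c("a", "a", "a", "b")),
  col5 = c(3, 1, 0, 0),
  col6 = c(0, 0, 0, 1),
  col7 = c(5, 3, 2, 1)
)

# Columns that are always kept, and the candidate columns to test.
id_cols <- c("sample", "treatment")
value_cols <- setdiff(names(df), id_cols)

is_a <- df$treatment == "a"
is_b <- df$treatment == "b"

# Condition 1: the value is 0 in every row where treatment == "b"
# (assumption: "every" generalizes the single "b" row in the example).
zero_in_b <- vapply(df[value_cols], function(x) all(x[is_b] == 0), logical(1))

# Condition 2: the value is non-zero in at least one row where treatment == "a".
nonzero_in_a <- vapply(df[value_cols], function(x) any(x[is_a] != 0), logical(1))

# Keep the id columns plus every candidate column that passes both tests.
result <- df[, c(id_cols, value_cols[zero_in_b & nonzero_in_a]), drop = FALSE]
print(result)
```

On the example data only col5 passes both tests: col6 and col7 are non-zero in the 'b' row, so they are dropped.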