R - How to extract a subset of a data frame containing many observations of two factors, each with multiple levels based on conditions?

Question

I have a data frame in R with two factors: the first, Alphabet, has 3 levels [A, B, C], and the second, L, has 3 levels [1, 2, 3]. For example:

Alphabet <- c("A", "A", "A", "B", "B", "B", "C", "C")
L <- c(1, 2, 1, 1, 3, 1, 3, 3)
df <- data.frame(Alphabet, L)

I want to subset the data frame as follows: a row whose L value is 3 should be dropped, but only if another observation with the same Alphabet level has an L value of 1 or 2.

So in the above example, row 5 will be dropped because B is also associated with a 1 in rows 4 and 6. Rows 7 and 8 will not be dropped because C is not associated with either 1 or 2.
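To make the expected result concrete, here is a minimal base R sketch of the rule described above. It is only an illustration of the desired output, not one of the posted answers; the names `has_low` and `expected` are hypothetical.

```r
Alphabet <- c("A", "A", "A", "B", "B", "B", "C", "C")
L <- c(1, 2, 1, 1, 3, 1, 3, 3)
df <- data.frame(Alphabet, L)

# For every row, flag whether its Alphabet level contains a 1 or a 2.
# ave() applies any() within each level and recycles the result to each row.
has_low <- as.logical(ave(df$L %in% c(1, 2), df$Alphabet, FUN = any))

# Drop a row only when it holds a 3 AND its level also has a 1 or a 2.
expected <- df[!(df$L == 3 & has_low), ]
print(expected)
```

With the example data this should keep rows 1, 2, 3, 4, 6, 7 and 8, dropping only row 5.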

If I understand correctly, in your example above the subset will consist only of rows 7 and 8, as they are the only rows for which the level of the first factor is associated with only one level of the second factor? — Constantinos, May 29 '17 at 19:11
**From review queue:** Welcome to StackOverflow - please read [How to make a great R reproducible example?](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) and edit your question afterwards. — help-info.de, May 29 '17 at 19:14
The subset will include all rows except row 5. The idea is to keep all rows with a 1 or 2, and to drop rows containing 3 if and only if a 1 or 2 is also associated with that level of Alphabet. — ZMP, May 29 '17 at 19:15

score 1 · Accepted Answer · answered May 29 '17 at 19:15

Here is a dplyr solution:

library(dplyr)
# Per Alphabet group: drop rows with L == 3 when the group also contains a 1 or 2.
group_by(df, Alphabet) %>% filter(!(L == 3 & any(L %in% c(1, 2))))
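To see why the filter condition works, it may help to split it into its parts. The sketch below assumes the example data from the question; the helper columns `is_three`, `group_has_1_or_2` and `keep` are hypothetical names introduced only for illustration.

```r
library(dplyr)

df <- data.frame(
  Alphabet = c("A", "A", "A", "B", "B", "B", "C", "C"),
  L = c(1, 2, 1, 1, 3, 1, 3, 3)
)

flags <- df %>%
  group_by(Alphabet) %>%
  mutate(
    is_three = L == 3,                        # row-level test
    group_has_1_or_2 = any(L %in% c(1, 2)),   # one value per group, recycled to its rows
    keep = !(is_three & group_has_1_or_2)     # same condition as in the filter() call
  ) %>%
  ungroup()

# Keep the flagged rows and drop the helper columns again.
result <- flags %>% filter(keep) %>% select(Alphabet, L)
print(result)
```

Because `any()` is evaluated within each group, group C (which has only 3s) gets `group_has_1_or_2 = FALSE`, so its rows survive. The final `ungroup()` is a design choice here so that later operations are not silently applied per group.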

scoa


score 0 · Answer 2 · answered May 30 '17 at 03:49

Or we can use data.table:

library(data.table)
# .I gives the row indices to keep within each Alphabet group; $V1 extracts them for subsetting.
setDT(df)[df[, .I[!(L == 3 & any(L %in% 1:2))], Alphabet]$V1]
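The one-liner packs several steps together, so here is a step-by-step sketch of the same idea, assuming the example data from the question. The names `dt`, `idx`, `keep_rows` and `result` are hypothetical, and the table is built with `data.table()` rather than `setDT()` so that no existing object is modified by reference.

```r
library(data.table)

dt <- data.table(
  Alphabet = c("A", "A", "A", "B", "B", "B", "C", "C"),
  L = c(1, 2, 1, 1, 3, 1, 3, 3)
)

# Step 1: within each group, .I holds the row numbers in the full table.
# Subsetting .I with the condition leaves the indices of the rows to keep.
idx <- dt[, .I[!(L == 3 & any(L %in% 1:2))], by = Alphabet]

# Step 2: the unnamed result of the j expression lands in a column called V1.
keep_rows <- idx$V1

# Step 3: use those row numbers to subset the original table.
result <- dt[keep_rows]
print(result)
```

For the example data, `keep_rows` should contain every row number except 5.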

akrun
