Group by 1 continuous and multiple logical values

Question

I have data as follows:

eg_data <- data.frame(
id = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4),
date = c("11/1", "11/1", "11/2", "11/1", "11/5", "11/5", "11/4", "11/5", "11/4", "11/2", "11/4", "11/3", "11/3", "11/2", "11/3", "11/2", "11/1", "11/1", "11/2", "11/3"),
sales = c(2,3,2,3,4,5,4,5,6,2,3,4,7,6,5,4,6,4,3,5),
dupes = c(F,T,F,T,F,F,F,T,T,F,F,F,T,F,T,F,F,T,T,F),
dupes2 = c(F,F,F,T,F,F,F,T,F,F,F,F,F,F,F,F,F,F,F,F))

dupes flags duplicates by date; dupes2 flags duplicates by date + sales.

I need to flag any instances where dupes = TRUE and dupes2 = FALSE. I need this done at the ID level, i.e. if this condition exists even once for id=1, every row where id=1 should be flagged as a result.

I have tried something like:

eg_data <- eg_data %>% group_by(id, dupes=TRUE, dupes2=FALSE) %>% mutate(flag=1)

This obviously doesn't work, but that's the idea. For all IDs that have any row where dupes = T and dupes2 = F, flag every row of that id with 1.

The end result would be the data above with a column called flag that = 1, b/c for every id 1-4, there is at least one row where dupes = T and dupes2 = F. I need to add a column to the dataset, not filter it to a list that prints, not create a separate dataset.

I have looked at

dplyr group_by logical values

and

Grouping functions (tapply, by, aggregate) and the *apply family

but neither did it for me.

Any help is appreciated.

`eg_data%>%group_by('id')%>%mutate(flag=any(dupes&!dupes2)) ` — BENY, Dec 28 '18 at 19:08
@W-B I'm pretty sure you got it, why do you have id in single quotes? — Adam_S, Dec 28 '18 at 19:44
@Adam_S Yep, I forgot I was posting an R solution (I thought I was writing a pandas solution). We should remove the quotes. — BENY, Dec 28 '18 at 19:46
Please post it as the answer b/c when I removed the single quotes, it did exactly what I needed. I applied it to the actual dataset I have, and it worked just fine. Thank you! — Adam_S, Dec 28 '18 at 19:48

score 1 · Accepted Answer · answered Dec 28 '18 at 21:14

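As requested by the OP, here is my comment written up as an answer. Group by id and use any(), which returns TRUE for the whole group if at least one of its rows has dupes = TRUE and dupes2 = FALSE: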

library(dplyr)
eg_data <- eg_data %>% group_by(id) %>% mutate(flag = any(dupes & !dupes2))
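
If you want the flag as 1/0 rather than TRUE/FALSE, something like the sketch below should work. It reuses only the id, dupes and dupes2 columns from the question. The name eg_small and the extra id 5 are made up for illustration (id 5 is not in the original data) so that there is one group that does not get flagged. The as.integer() and ungroup() calls are optional additions, not part of the one-liner above.

```r
library(dplyr)

# id/dupes/dupes2 from the question, plus a hypothetical id 5 (an assumption,
# not in the original data) with no row where dupes = TRUE and dupes2 = FALSE.
eg_small <- data.frame(
  id     = rep(1:5, each = 5),
  dupes  = c(F,T,F,T,F,  F,F,T,T,F,  F,F,T,F,T,  F,F,T,T,F,  F,T,F,F,F),
  dupes2 = c(F,F,F,T,F,  F,F,T,F,F,  F,F,F,F,F,  F,F,F,F,F,  F,T,F,F,F)
)

flagged <- eg_small %>%
  group_by(id) %>%
  # any() collapses the per-row test to one value per id;
  # mutate() then recycles that value to every row of the group.
  mutate(flag = as.integer(any(dupes & !dupes2))) %>%
  ungroup()  # drop the grouping so later steps are not grouped by accident

# One row per id with its flag
distinct(flagged, id, flag)
```

With this input, ids 1 to 4 get flag = 1 on all of their rows and the made-up id 5 gets flag = 0, and the number of rows is unchanged, since mutate() adds a column instead of filtering.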

BENY
