I have a large dataset of nearly a million observations. Each observation records one action by a person, and each person has a unique reference number that identifies them. Using a particular characteristic, I have flagged certain rows. What I want to do now is remove every row belonging to any reference number that has been flagged even once. The flag is binary.

I am an amateur in R.

How should I proceed?

  • Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. – zx8754 Jun 16 '16 at 09:24
  • Something like this: `df1[ !df1$id %in% unique(df1[df1$flag == "flagged", "id"]), ]` – zx8754 Jun 16 '16 at 09:45

2 Answers

You can do

DT[, if (all(!flagged)) .SD, by=id]
# or
DT[, .SD[all(!flagged)], by=id]

to keep only ids that have no flags.
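
As a minimal sketch of the first form on made-up data (the table `DT`, its column names `id`, `flagged` and `data`, and the values are all assumptions for illustration, and the `data.table` package must be installed):

```r
library(data.table)

# Hypothetical data: ids 1 and 4 each have at least one flagged row.
DT <- data.table(
  id      = c(1, 1, 2, 3, 3, 4),
  flagged = c(FALSE, TRUE, FALSE, FALSE, FALSE, TRUE),
  data    = 1:6
)

# For each id, return its rows (.SD) only when none of them is flagged;
# groups with any flagged row return NULL and drop out of the result.
clean <- DT[, if (all(!flagged)) .SD, by = id]
print(clean)
```

Here both rows of id 1 are dropped even though only one of them is flagged, while ids 2 and 3 are kept in full.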

Frank
  • I'm hoping the syntax becomes more natural with a "having" parameter eventually. There's currently an open (probably low-priority) FR https://github.com/Rdatatable/data.table/issues/788 – Frank Jun 16 '16 at 10:08

You can use subset on your data frame. I took the liberty of generating a test data frame for your case.

# Generate test data for demo purposes.
dataframe <- data.frame(id = 1:5, flag = c(1, 0, 1, 0, 0), data = rep(999, 5))

# Subset the data frame according to the flag (this keeps only the flagged rows).
selecteddata <- subset(dataframe, as.logical(flag))

The original data frame:

> dataframe
  id flag data
1  1    1  999
2  2    0  999
3  3    1  999
4  4    0  999
5  5    0  999

The result:

> selecteddata
  id flag data
1  1    1  999
3  3    1  999
Anton
  • Yes, I understand the subsetting part, but the ids are the same in some cases, and I want all observations with the same id removed even if only one of them has been flagged. – user3709655 Jun 16 '16 at 09:45
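
One way to handle the repeated-id case from this comment in base R is to collect the flagged ids first and then drop every row that carries one of them. This is only a sketch on made-up data: the repeated ids, the values, and the helper name `flagged_ids` are assumptions, not part of the original answer.

```r
# Hypothetical test data in which some ids occur more than once.
dataframe <- data.frame(
  id   = c(1, 1, 2, 3, 3, 4),
  flag = c(0, 1, 0, 0, 0, 1),
  data = rep(999, 6)
)

# Ids that have been flagged at least once.
flagged_ids <- unique(dataframe$id[dataframe$flag == 1])

# Keep only rows whose id never appears among the flagged ids.
selecteddata <- subset(dataframe, !(id %in% flagged_ids))
print(selecteddata)
```

Both rows of id 1 are removed although only the second one is flagged, and id 4 is removed as well, leaving the rows for ids 2 and 3.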