R - Delete rows based on duplicate and values in another column

Question

I have a data.frame in R that looks like the following:

> inputtable <- data.frame(TN = c("T","N","T","N","N","T","T","N"),
+                     Value = c(1,1,2,2,2,3,3,5))
> inputtable
  TN Value
1  T     1
2  N     1
3  T     2
4  N     2
5  N     2
6  T     3
7  T     3
8  N     5

I want to remove rows whose value is duplicated in the Value column, but ONLY if one row has "T" and the other has "N" in the TN column.

I played around with duplicated, but this doesn't work the way I've coded it:

TNoverlaps.duprem <- TNoverlaps[ !(duplicated(TNoverlaps$Barcode) & ("T" %in% TNoverlaps$TN & "N" %in% TNoverlaps$TN)), ]

and

TNoverlaps.duprem <- TNoverlaps[ duplicated(TNoverlaps$Barcode) & !duplicated(TNoverlaps$Barcode, TNoverlaps$TN), ]

If there are more than two rows, as in rows 3-5 above, I want to remove all of those, because at least one is "T" and one is "N" in the TN column.

Here's the output I want:

> outputtable
  TN Value
6  T     3
7  T     3
8  N     5

I found plenty of questions about duplicated rows, and removing rows based on multiple columns. But I didn't see one that did something like this.
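
For reference, here is a short sketch of why the first `duplicated` attempt removes the wrong rows. It assumes the attempt is transposed onto the example data, with `inputtable` standing in for `TNoverlaps` and `Value` for `Barcode`; the names `dup`, `both_anywhere`, and `first_attempt` are illustrative only.

```r
inputtable <- data.frame(TN = c("T","N","T","N","N","T","T","N"),
                         Value = c(1,1,2,2,2,3,3,5))

# duplicated() flags only the second and later occurrences of a value.
dup <- duplicated(inputtable$Value)

# %in% here tests the whole TN column at once, so each side is a single
# TRUE that is recycled across all rows instead of varying by group.
both_anywhere <- "T" %in% inputtable$TN & "N" %in% inputtable$TN

# The filter therefore reduces to !dup: it keeps the first row of every
# Value and drops the later ones, regardless of what TN contains.
first_attempt <- inputtable[!(dup & both_anywhere), ]
first_attempt
```

The T/N check has to be evaluated within each Value group rather than once for the whole column.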

Does the order matter? Or are you just looking to remove rows whose `Value` has more than one unique `TN`? In other words, would this work? `with(inputtable, ave(as.integer(TN), Value, FUN = function(x) length(unique(x)))) < 2` Or `table(unique(inputtable)$Value)[as.character(inputtable$Value)] < 2`? – David Arenburg, Mar 23 '16 at 21:57
Your `with` statement works exactly how I need it to. Can you please create an answer and explain how you are applying the `ave` command to this? Thank you! I implemented this with: `outputtable <- inputtable[with(inputtable, ave(as.integer(TN), Value, FUN = function(x) length(unique(x)))) < 2,]` – Gaius Augustus, Mar 23 '16 at 22:05
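
Since the thread never spells out how the `ave` call works, here is a minimal sketch on the question's `inputtable`. `ave(x, g, FUN = f)` splits `x` by the grouping vector `g`, applies `f` to each piece, and returns a vector as long as `x`, with each group's result repeated on that group's rows. One assumption to note: `as.integer(TN)` relies on `TN` being a factor, which was the `data.frame()` default before R 4.0.0. The sketch converts with `factor()` first so it also runs when `TN` is a character vector. The names `tn_codes` and `n_types` are illustrative.

```r
inputtable <- data.frame(TN = c("T","N","T","N","N","T","T","N"),
                         Value = c(1,1,2,2,2,3,3,5))

# Turn TN into integer codes; factor() makes this work whether TN is
# stored as a factor or as a character vector.
tn_codes <- as.integer(factor(inputtable$TN))

# For every row, count the distinct TN codes within its Value group.
# ave() returns one result per row, not one per group.
n_types <- ave(tn_codes, inputtable$Value,
               FUN = function(x) length(unique(x)))

# Keep only rows whose Value group contains a single TN type.
outputtable <- inputtable[n_types < 2, ]
outputtable
```

Because the result has one entry per row, `n_types < 2` can be used directly as a row index, which is what the one-liner in the comment does.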

DatamineR · Accepted Answer · 2016-03-23T22:23:39.497

You could try grouping by Value and keeping only the groups that contain a single distinct TN:

library(dplyr)

inputtable %>% group_by(Value) %>% filter(!(n_distinct(TN) >= 2))
Source: local data frame [3 x 2]
Groups: Value [2]

      TN Value
  (fctr) (dbl)
1      T     3
2      T     3
3      N     5
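
To make the logic of that pipeline more visible, the following sketch splits it into two steps. It assumes a reasonably recent dplyr is installed; `n_tn`, `annotated`, and `result` are hypothetical names, and `n_tn == 1` is meant to be equivalent to the negated `>= 2` condition above.

```r
library(dplyr)

inputtable <- data.frame(TN = c("T","N","T","N","N","T","T","N"),
                         Value = c(1,1,2,2,2,3,3,5))

# Step 1: annotate each row with the number of distinct TN values
# found in its Value group (n_tn is a helper column for illustration).
annotated <- inputtable %>%
  group_by(Value) %>%
  mutate(n_tn = n_distinct(TN)) %>%
  ungroup()

# Step 2: keep groups with a single TN type, then drop the helper column.
result <- annotated %>%
  filter(n_tn == 1) %>%
  select(-n_tn)
result
```

Groups 1 and 2 contain both "T" and "N", so only the rows with Value 3 and 5 remain, matching the output requested in the question.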

edited Mar 23 '16 at 22:23

answered Mar 23 '16 at 22:07

What is the `n() >=2` for? And this is a code-only answer. – David Arenburg Mar 23 '16 at 22:13
@DavidArenburg It is unnecessary and can be removed :-) – DatamineR Mar 23 '16 at 22:22
