I have a dataset that looks like this:

name  position type
A       12      S
B       13      T
C       12      S
D       12      T
E       11      S
F       10      S

I would like to remove rows with duplicated position and type.

I tried to use the duplicated function to find duplicate rows, but I do not know how to remove all rows with duplicate values.

dup = db[duplicated(db[2:3]),]

I want to remove every row that shares its position and type with another row, regardless of name. My desired output is:

name  position type
B       13      T
D       12      T
E       11      S
F       10      S
Sam Firke
BlueSky
  • Can you post some code? – Sandeep Nayak Feb 19 '16 at 14:35
  • Similar question: http://stackoverflow.com/q/7854433/1191259 – Frank Feb 19 '16 at 15:40
  • @akrun The question Frank links to is a helpful related post but I don't think this should be closed as it's not an exact duplicate. That one returns an index of duplicates, this one removes them. That one has examples with vectors, this asks about a data.frame. And this asks about just a subset of duplicated variables. – Sam Firke Feb 19 '16 at 18:00
  • @SamFirke Okay, I reopened it. – akrun Feb 20 '16 at 03:04

2 Answers

duplicated() returns TRUE only for the second and later occurrences of a value, so the first occurrence of each duplicated pair is not flagged. To flag every row that has a duplicate, also apply duplicated() in reverse (fromLast = TRUE, i.e. from the last value to the first), combine the two results with OR (|), negate, and subset the dataset.

db[!(duplicated(db[2:3])|duplicated(db[2:3], fromLast=TRUE)),]
#   name position type
# 2    B       13    T
# 4    D       12    T
# 5    E       11    S
# 6    F       10    S
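
To see why both directions are needed, here is a minimal sketch that rebuilds the question's data and prints the two intermediate logical vectors. The helper names key, dup_forward, dup_backward and result are illustrative only and do not appear in the original post.

```r
# Rebuild the example data from the question (the name db follows the question)
db <- data.frame(
  name     = c("A", "B", "C", "D", "E", "F"),
  position = c(12, 13, 12, 12, 11, 10),
  type     = c("S", "T", "S", "T", "S", "S")
)

key <- db[2:3]                                    # the position and type columns

dup_forward  <- duplicated(key)                   # flags row 3 (second 12/S) only
dup_backward <- duplicated(key, fromLast = TRUE)  # flags row 1 (first 12/S) only

print(dup_forward)
print(dup_backward)

# A row is dropped if it is flagged in either direction
result <- db[!(dup_forward | dup_backward), ]
print(result)
```

Rows A and C share position 12 and type S, so each direction catches one of them and the OR catches both.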
akrun

The dplyr package does this with intuitive, readable code.

Here's a toy example, taking rows from mtcars where there are no duplicated values of cyl and gear:

library(dplyr)
mtcars %>%
  group_by(cyl, gear) %>%
  filter(n() == 1) %>%
  ungroup()

Source: local data frame [2 x 11]

    mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
  (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl)
1  21.5     4 120.1    97  3.70 2.465 20.01     1     0     3     1
2  19.7     6 145.0   175  3.62 2.770 15.50     0     1     5     6

Those two combinations of cyl and gear are the only ones that occur exactly once, which you can confirm with:

mtcars %>%
  count(cyl, gear)
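
The same pattern should carry over to the data in the question. The following is a sketch, assuming the data frame is named db as in the question and that dplyr is installed; result is an illustrative name.

```r
library(dplyr)

# Rebuild the example data from the question
db <- data.frame(
  name     = c("A", "B", "C", "D", "E", "F"),
  position = c(12, 13, 12, 12, 11, 10),
  type     = c("S", "T", "S", "T", "S", "S")
)

# Keep only rows whose (position, type) combination occurs exactly once
result <- db %>%
  group_by(position, type) %>%
  filter(n() == 1) %>%
  ungroup()

print(result)
```

This keeps rows B, D, E and F, matching the desired output in the question.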
Sam Firke