Filter duplicated rows in R data.frame

Question

I have a data.frame as shown below.

> df2 <- data.frame("StudentId" = c(1,1,1,2,2,3,3), "Subject" = c("Maths", "Maths", "English","Maths", "English", "Science", "Science"), "Score" = c(100,90,80,70, 60,20,10))
> df2
  StudentId Subject Score
1         1   Maths   100
2         1   Maths    90
3         1 English    80
4         2   Maths    70
5         2 English    60
6         3 Science    20
7         3 Science    10

A few StudentIds have duplicated values in the Subject column (for example, ID 1 has 2 entries for "Maths"). I need to keep only the first of each set of duplicated rows. The expected data.frame is:

  StudentId Subject Score
1         1   Maths   100
3         1 English    80
4         2   Maths    70
5         2 English    60
6         3 Science    20

I am not able to do this. Any ideas?

Also [this](http://stackoverflow.com/questions/13967063/remove-duplicate-rows-in-r) and [this](http://stackoverflow.com/questions/13279582/select-only-the-first-rows-for-each-unique-value-of-a-column-in-r) — David Arenburg, Feb 08 '16 at 17:19

akrun · Accepted Answer · 2016-02-08T17:06:39.963

We can use unique from data.table with the by argument, after converting df2 to a 'data.table' by reference with setDT(df2):

library(data.table)
unique(setDT(df2), by = c("StudentId", "Subject"))
#   StudentId Subject Score
#1:         1   Maths   100
#2:         1 English    80
#3:         2   Maths    70
#4:         2 English    60
#5:         3 Science    20

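Or distinct from dplyr

library(dplyr)
# .keep_all = TRUE keeps the other columns (Score); since dplyr 0.5.0,
# distinct() otherwise returns only the columns it was given
distinct(df2, StudentId, Subject, .keep_all = TRUE)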
#     StudentId Subject Score
#       (dbl)  (fctr) (dbl)
#1         1   Maths   100
#2         1 English    80
#3         2   Maths    70
#4         2 English    60
#5         3 Science    20

Or duplicated from base R

df2[!duplicated(df2[1:2]),]
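
To make the base R approach easier to follow, here is a minimal sketch that selects the key columns by name instead of by position and shows the intermediate logical vector. The names `dup` and `result` are introduced here for illustration only; the data is the `df2` from the question.

```r
# Rebuild the example data from the question
df2 <- data.frame(
  StudentId = c(1, 1, 1, 2, 2, 3, 3),
  Subject   = c("Maths", "Maths", "English", "Maths", "English",
                "Science", "Science"),
  Score     = c(100, 90, 80, 70, 60, 20, 10)
)

# duplicated() returns TRUE for every row whose (StudentId, Subject)
# combination has already appeared in an earlier row
dup <- duplicated(df2[, c("StudentId", "Subject")])

# Negate the vector to keep only the first occurrence of each combination
result <- df2[!dup, ]
result
```

Selecting the columns by name means the filter still works if the column order changes. The original row names (1, 3, 4, 5, 6) are preserved, which matches the expected output in the question.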

(EDIT: Based on suggestions by @David Arenburg)

I think just `unique(setDT(df2), by = c("StudentId", "Subject"))`? or `distinct(df2, StudentId, Subject)`? — David Arenburg, Feb 08 '16 at 17:05
