Delete rows that are repeated for only some of the columns of the dataframe

Question

Consider the following sample data frame:

df <- data.frame(a=c(rep(1,4),4,7,8), b=c(rep(4,4),6,8,3), 
                 c=c(rep("hey",4),"hi","hello","salam"), 
                 d=c("q","r","g","y","d","e","y"), e=c(2,6,43,56,6,23,4))

I want to remove the rows that repeat an earlier row in columns a, b and c, keeping only the first occurrence of each combination. The desired output would be:

  a b     c d  e
1 1 4   hey q  2
5 4 6    hi d  6
6 7 8 hello e 23
7 8 3 salam y  4

You can also do `df %>% group_by(a, b, c) %>% filter(row_number() == 1)` — Ronak Shah, Jan 28 '21 at 07:12

score 1 · Accepted Answer · answered Jan 28 '21 at 06:55

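I think you forgot the first row in your expected output. You can use `duplicated()` on the three key columns and keep the rows it does not flag: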

df[!duplicated(df[,c("a","b","c")]),]

  a b     c d  e
1 1 4   hey q  2
5 4 6    hi d  6
6 7 8 hello e 23
7 8 3 salam y  4
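
To see why this works, it can help to look at the logical vector that `duplicated()` returns. The following is a minimal sketch using the data from the question; the names `key_cols`, `dup`, `first_kept` and `last_kept` are introduced here for illustration only.

```r
df <- data.frame(a = c(rep(1, 4), 4, 7, 8), b = c(rep(4, 4), 6, 8, 3),
                 c = c(rep("hey", 4), "hi", "hello", "salam"),
                 d = c("q", "r", "g", "y", "d", "e", "y"),
                 e = c(2, 6, 43, 56, 6, 23, 4))

# Columns that define a "repeat" (illustrative name)
key_cols <- c("a", "b", "c")

# TRUE for every row whose key columns match an earlier row
dup <- duplicated(df[, key_cols])
dup
# [1] FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE

# Keep only the first occurrence of each (a, b, c) combination
first_kept <- df[!dup, ]

# Keep the last occurrence instead by scanning from the end
last_kept <- df[!duplicated(df[, key_cols], fromLast = TRUE), ]
```

With `fromLast = TRUE`, row 4 is kept in place of row 1, so which of the repeated rows survives is a choice you make explicitly.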

user2974951


Yes you are right! I will update! – Maral Dorri Jan 28 '21 at 06:58

score 1 · Answer 2 · answered Jan 28 '21 at 07:04

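A dplyr solution is to call `distinct()` on the key columns, with `.keep_all = TRUE` so the remaining columns are retained: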

library(dplyr)
df %>% distinct(a, b, c, .keep_all = TRUE)
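
As a quick check, the sketch below compares this result with the base R `duplicated()` approach on the data from the question. It assumes dplyr is installed; `res_dplyr` and `res_base` are illustrative names, not part of either answer.

```r
library(dplyr)

df <- data.frame(a = c(rep(1, 4), 4, 7, 8), b = c(rep(4, 4), 6, 8, 3),
                 c = c(rep("hey", 4), "hi", "hello", "salam"),
                 d = c("q", "r", "g", "y", "d", "e", "y"),
                 e = c(2, 6, 43, 56, 6, 23, 4))

# dplyr: first row of each distinct (a, b, c) combination, all columns kept
res_dplyr <- df %>% distinct(a, b, c, .keep_all = TRUE)

# Base R equivalent for comparison
res_base <- df[!duplicated(df[, c("a", "b", "c")]), ]

# The two results hold the same values; only the row names differ,
# because distinct() renumbers rows while base subsetting keeps 1, 5, 6, 7
same_values <- isTRUE(all.equal(res_dplyr, res_base, check.attributes = FALSE))
```

Without `.keep_all = TRUE`, `distinct(a, b, c)` would return only columns a, b and c.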

nyk

