How can I remove multiple repeated rows in R?

Question

I have a data frame with 100 rows. Here is a small sample of the data:

df <- read.table(text = " Id   Colour  Class   val Group
'P' 'NA'    'NA'    'NA'    '1'
'Q' 'NA'    'NA'    'NA'    '2'
'12'    'Red'   'A' '12'    '3'
'P' 'NA'    'NA'    'NA'    '1'
'Q' 'NA'    'NA'    'NA'    '2'
'Z' 'Yellow'    'M' '9' '20'
'P' 'Blue'  'M' '30'    '50'
", header = TRUE)

As you can see, the first P and Q rows are repeated exactly in rows 4 and 5. I want to remove those repeats (but keep the last P row, whose other columns differ) to get this outcome:

      Id Colour Class val Group
    1  P   <NA>  <NA>  NA     1
    2  Q   <NA>  <NA>  NA     2
    3 12    Red     A  12     3
    6  Z Yellow     M   9    20
    7  P   Blue     M  30    50

The following code gives me that outcome. However, it does not really help, because the Id names are sometimes different and it is tedious to look up by hand which rows to remove. Can we do better?

df[-c(4,5), ]

pwilcox · Accepted Answer · 2021-02-17T19:23:18.587

You can use unique(), which is part of base R:

unique(df)

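unique() drops a row only when it matches an earlier row in every column. This will reduce the two 'Q' rows to one, and the three 'P' rows to two (the last 'P' row differs in its other columns, so it is kept), as you show you want in your output.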
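As a minimal sketch of what happens under the hood, the snippet below rebuilds the sample data from the question and compares unique() with base R's duplicated(), which flags every row that repeats an earlier one. The variable names res, res2 and dup_rows are only illustrative and do not come from the question.

```r
# Rebuild the sample data from the question (unquoted here for brevity)
df <- read.table(text = "Id Colour Class val Group
P NA NA NA 1
Q NA NA NA 2
12 Red A 12 3
P NA NA NA 1
Q NA NA NA 2
Z Yellow M 9 20
P Blue M 30 50", header = TRUE)

# unique() keeps the first occurrence of each fully identical row
res <- unique(df)

# duplicated() returns TRUE for every row that repeats an earlier row,
# so negating it selects the same rows as unique()
res2 <- df[!duplicated(df), ]

# Positions of the repeated rows, i.e. the ones removed by hand in the question
dup_rows <- which(duplicated(df))

print(res)
print(dup_rows)
```

Here dup_rows comes out as 4 and 5, the same rows dropped manually with df[-c(4,5), ], so nothing has to be looked up by hand. If only some columns should decide what counts as a repeat, duplicated() can be applied to just those columns, for example df[!duplicated(df[c("Id", "Group")]), ].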

edited Feb 17 '21 at 19:23

answered Feb 17 '21 at 19:13

pwilcox

Thanks, I have updated the question – user330 Feb 17 '21 at 19:17
Updated the answer in response, but the core recommendation is the same. – pwilcox Feb 17 '21 at 19:23
