Modify DataFrame, remove double Data with for each, R

Question

I'm trying to modify a data frame because it contains duplicate values.

Data frame:
Id Name Account
1    X    1
1    Y    2
1    Z    3
2    J    1
2    T    4
3    O    2

When there are multiple rows with the same Id, I want to keep only the last row. The desired output would be:

Id Name Account
1    Z    3
2    T    4
3    O    2

This is my current code:

for (i in 1:(nrow(mylist) - 1)) {
  if (mylist$Id[i] == mylist$Id[i + 1]) {
    mylist <- mylist[-i, ]
  }
}

My problem is that when a row is removed, all following rows move up by one index, so the loop skips a row in the next iteration.
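One possible way around the index shifting, sketched under the assumption that rows with the same Id are adjacent (as in the example data): record which rows to keep in a logical vector and subset once after the loop, so no row moves while the loop is running. Because `1:(nrow(mylist) - 1)` is evaluated only once, the original loop can also run past the end of the shortened data frame. The names `keep` and `result` are made up for illustration.

```r
# Example data from the question
mylist <- data.frame(
  Id = c(1, 1, 1, 2, 2, 3),
  Name = c("X", "Y", "Z", "J", "T", "O"),
  Account = c(1, 2, 3, 1, 4, 2)
)

# Start by keeping every row, then flag rows that are followed by the same Id
keep <- rep(TRUE, nrow(mylist))
for (i in seq_len(nrow(mylist) - 1)) {
  if (mylist$Id[i] == mylist$Id[i + 1]) {
    keep[i] <- FALSE  # a later row with the same Id exists, so drop this one
  }
}

# Subset once, after the loop, so indices never shift during iteration
result <- mylist[keep, ]
print(result)
```

This keeps the structure of the original loop; the answers show shorter ways to get the same rows.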

score 1 · Accepted Answer · answered Jul 03 '20 at 16:31

You can do this easily with the dplyr package:

library(dplyr)

mylist %>%
  group_by(Id) %>%
  slice(n()) %>%
  ungroup()

First, group the data by the Id column with group_by(). Then slice(n()) keeps only the last row of each group, because n() returns the number of rows in the current group.
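To try the pipeline above on the data from the question, here is a self-contained sketch. It assumes dplyr is installed; the name `deduped` is only an illustration. Note that the pipe returns a new object and does not change `mylist` in place, so the result has to be assigned to something.

```r
library(dplyr)

# Example data from the question
mylist <- data.frame(
  Id = c(1, 1, 1, 2, 2, 3),
  Name = c("X", "Y", "Z", "J", "T", "O"),
  Account = c(1, 2, 3, 1, 4, 2)
)

# Keep the last row of each Id group and store the result
deduped <- mylist %>%
  group_by(Id) %>%
  slice(n()) %>%   # n() is the size of the current group, i.e. its last row
  ungroup()

print(deduped)
```

Here `mylist` still has all six rows afterwards; only `deduped` holds the reduced data.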

answered Jul 03 '20 at 16:31

Cettt


Thanks :) I added it to my code as mylist <- (then your part). When I run those lines with Ctrl + Enter, it works and the list gets shorter. When I run the whole program, somehow the slice isn't applied and the data is as big as before. There is no error :/ – Raphael Jul 03 '20 at 18:02

Daniel O · Answer 2 · 2020-07-03T18:14:42.533

One option in base R is:

mylist[cumsum(sapply(split(mylist,mylist$Id),nrow)),]

  Id Name Account
3  1    Z       3
5  2    T       4
6  3    O       2
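As far as I can tell, the cumsum() approach above relies on the rows being sorted by Id, because split() returns the groups in sorted order. A different base R sketch that does not depend on row order uses duplicated() with fromLast = TRUE, which flags every row whose Id appears again further down. This swaps in another technique, it is not the answerer's method; the name `last_rows` is made up for illustration.

```r
# Example data from the question
mylist <- data.frame(
  Id = c(1, 1, 1, 2, 2, 3),
  Name = c("X", "Y", "Z", "J", "T", "O"),
  Account = c(1, 2, 3, 1, 4, 2)
)

# duplicated(..., fromLast = TRUE) is TRUE for every row except the
# last occurrence of each Id, so negating it keeps the last rows
last_rows <- mylist[!duplicated(mylist$Id, fromLast = TRUE), ]
print(last_rows)
```

The original row names (3, 5, 6) are kept, which makes it easy to see which rows survived.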

edited Jul 03 '20 at 18:14

answered Jul 03 '20 at 16:37

Daniel O
