Deleting duplicate rows based on logical operation in R

Question

I have data like this:

ID                 SHape Length
180139746001000    2
180139746001000    1

For each duplicated ID, I want to delete the row with the smaller shape length and keep the one with the larger. Can anyone help me with this?

Are you saying that you want to keep the row with the maximum shape length, by ID? – neilfws, Sep 26 '19 at 23:09

score 1 · Answer 1 · answered Sep 26 '19 at 23:29

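With

library(data.table)
df <- data.table(matrix(c(102:106, 106:104, 1:3, 1:3, 5:6), nrow = 8))
setnames(df, c("ID", "Shape Length"))

just use duplicated after sorting:

setkeyv(df, "Shape Length")           # sort ascending by shape length
df[!duplicated(ID, fromLast = TRUE)]  # keep the last (largest) row per ID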
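The same sort-then-`duplicated` idea can also be sketched in base R on the two rows from the question. This is only an illustration: the column name `Shape_Length` (without the space) and storing `ID` as a character string are assumptions, not something the question specifies.

```r
# Sample rows from the question; the column name Shape_Length and the
# character ID are assumptions made for this sketch
df <- data.frame(
  ID = c("180139746001000", "180139746001000"),
  Shape_Length = c(2, 1),
  stringsAsFactors = FALSE
)

# Sort so that, within each ID, the longest shape comes first
sorted <- df[order(df$ID, -df$Shape_Length), ]

# duplicated() flags every repeat after the first occurrence, so negating
# it keeps the first (longest) row per ID
result <- sorted[!duplicated(sorted$ID), ]
print(result)
```

Sorting in descending order and keeping the first occurrence is equivalent to sorting ascending and using `fromLast = TRUE`.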

hedgedandlevered

score 0 · Answer 2 · answered Sep 26 '19 at 23:10

You can select the row with the highest shape length for each ID with:

library(dplyr)

df %>%
  group_by(ID) %>%
  arrange(desc(SHape.Length)) %>%  # largest shape length first
  slice(1) %>%                     # keep the first row of each ID group
  ungroup()

Sonali J

I would recommend avoiding spaces in column names, as they are considered bad coding practice. You can fix that with a simple `names(df) <- stringr::str_replace_all(names(df), "\\s", "_")` – Sonali J Sep 26 '19 at 23:12
Thank you so much for this. I was in a hurry, hence I wrote the names like that. In the actual database, they are named without spaces. – Parag Gupta Sep 27 '19 at 00:11
