
I have data like this:

ID                 SHape Length
180139746001000    2
180139746001000    1

For each duplicated ID, I want to delete the row with the smaller shape length and keep the one with the larger. Can anyone help me with this?

Parag Gupta

2 Answers


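With this example data:

library(data.table)

df <- data.table(matrix(c(102:106, 106:104, 1:3, 1:3, 5:6), nrow = 8))
setnames(df, c("ID", "Shape Length"))

sort by shape length, then use duplicated() to keep only the last (largest) row for each ID:

setkeyv(df, "Shape Length")            # sort ascending by shape length
df[!duplicated(ID, fromLast = TRUE)]   # keep the last row per ID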
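The same sort-then-deduplicate idea can also be sketched in base R, for readers who do not have data.table installed. This is only an illustration: the column name `Shape_Length` and the second ID are assumptions added so the result has more than one group; they are not from the question.

```r
# Sample data modelled on the question. The third row (a second ID) is
# hypothetical and only there to show that other IDs are left alone.
df <- data.frame(
  ID = c(180139746001000, 180139746001000, 180139746001001),
  Shape_Length = c(2, 1, 7)
)

# Sort by ID, with the largest Shape_Length first within each ID.
sorted <- df[order(df$ID, -df$Shape_Length), ]

# duplicated() flags every repeat of an ID after its first occurrence,
# so negating it keeps only the row with the largest Shape_Length.
result <- sorted[!duplicated(sorted$ID), ]
rownames(result) <- NULL
print(result)
```

Sorting in descending order lets the default `duplicated()` be used; sorting in ascending order and passing `fromLast = TRUE` would be equivalent.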
hedgedandlevered

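You can select the row with the highest shape length for each ID by sorting in descending order and taking the first row of each group:

library(dplyr)

df %>%
  group_by(ID) %>%
  arrange(desc(SHape.Length)) %>%  # largest shape length first
  slice(1) %>%                     # keep the first row per ID
  ungroup()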
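With dplyr 1.0.0 or later, `slice_max()` can pick the row with the largest value in each group directly, without a separate sorting step. A minimal sketch, assuming a column named `Shape_Length` (no space) and a made-up second ID added purely for illustration:

```r
library(dplyr)

# Sample data modelled on the question; the third row is hypothetical.
df <- tibble(
  ID = c(180139746001000, 180139746001000, 180139746001001),
  Shape_Length = c(2, 1, 7)
)

result <- df %>%
  group_by(ID) %>%
  # with_ties = FALSE guarantees exactly one row per ID, even when
  # two rows share the same maximum Shape_Length.
  slice_max(Shape_Length, n = 1, with_ties = FALSE) %>%
  ungroup()

print(result)
```

Leaving `with_ties` at its default of `TRUE` would keep every row tied for the maximum, which may leave duplicates behind.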
Sonali J
  • I would recommend naming your columns without spaces, since spaces in column names are bad coding practice. You can fix that with a simple `names(df) <- stringr::str_replace_all(names(df), "\\s", "_")` – Sonali J Sep 26 '19 at 23:12
  • Thank you so much for this. I was in a hurry, so I wrote the names like that. In the actual database, they are named without spaces. – Parag Gupta Sep 27 '19 at 00:11