I have a data frame that looks like the first one below. There are duplicate values in col1 but not in col2. I want to keep only the first row for each value of col1 and remove the rest, so that the result looks like the second data frame below.

Input:

col1 col2
x 1
x 2
x 3
y 1
y 2
y 3

Desired output:

col1 col2
x 1
y 1

I tried this but it didn't work:

df %>% group_by(col1) %>% filter(duplicated(col1) | n()!=1)

blub_9

1 Answer

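We just need distinct(). With .keep_all = TRUE it keeps the first row for each value of col1 and retains the other columns.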

library(dplyr)
distinct(df, col1, .keep_all = TRUE)
  col1 col2
1    x    1
2    y    1

Or, if we want to use duplicated(), negate it (!) so that only the first row for each value of col1 is kept.

df %>%
    filter(!duplicated(col1))
  col1 col2
1    x    1
2    y    1
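
For comparison, the same negated duplicated() test can be applied with base R indexing, without dplyr. This is only a minimal sketch; the name first_rows is an illustrative choice, and df is rebuilt here so the block runs on its own.

```r
# Same values as in this answer's "data" section
df <- data.frame(col1 = c("x", "x", "x", "y", "y", "y"),
                 col2 = c(1L, 2L, 3L, 1L, 2L, 3L))

# duplicated() is FALSE for the first occurrence of each value,
# so negating it selects exactly one row per col1 value
first_rows <- df[!duplicated(df$col1), ]
first_rows
```

The result keeps the original row names (1 and 4 here); rownames(first_rows) <- NULL resets them if that matters.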

data

df <- structure(list(col1 = c("x", "x", "x", "y", "y", "y"), col2 = c(1L, 
2L, 3L, 1L, 2L, 3L)), class = "data.frame", row.names = c(NA, 
-6L))
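
As a side note on why the attempt in the question keeps every row: the condition is TRUE whenever a row is a repeat or its group has more than one row, and every group here has three rows. The sketch below shows that, along with a group-wise alternative using slice(1). The names attempt and by_group are illustrative, and df is rebuilt so the block is self-contained.

```r
library(dplyr)

df <- data.frame(col1 = c("x", "x", "x", "y", "y", "y"),
                 col2 = c(1L, 2L, 3L, 1L, 2L, 3L))

# The question's filter: n() != 1 is TRUE for every row because each
# group has 3 rows, so no row is dropped
attempt <- df %>%
  group_by(col1) %>%
  filter(duplicated(col1) | n() != 1) %>%
  ungroup()
nrow(attempt)  # 6, same as the input

# Group-wise alternative: keep the first row of each col1 group
by_group <- df %>%
  group_by(col1) %>%
  slice(1) %>%
  ungroup()
by_group
```

For this data, slice(1) returns the same two rows as distinct() and filter(!duplicated(col1)); which one to use is mostly a matter of taste.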
akrun