Find All Unique rows based on single column and exclude all duplicate rows

Question

I have two requirements:

1. Find all duplicate values in a single column.
2. Find all unique rows (the opposite of the first requirement). This should not include even a single row from any set of duplicated rows.

I have been learning for the last 2 weeks by watching YouTube videos and referring to Stack Overflow and other websites, so I don't know much yet. Please do recommend any material or courses.

I found the answer to my first question here (Find duplicated elements with dplyr):

# All duplicated elements
mtcars %>%
  filter(carb %in% unique(.[["carb"]][duplicated(.[["carb"]])]))

So I want the opposite of this.

Thanks

P.S. I have a non-technical background. I went through a couple of questions and answers here, so I might have already come across the answer, or something that needed only a few tweaks, and completely overlooked it.

Do you need `mtcars %>% filter(!(carb %in% unique(.[["carb"]][duplicated(.[["carb"]])])))`? — akrun, Jul 16 '19 at 13:05
This is quite unclear. For one thing, "find all duplicate values in single column and return the all rows [2 or more]" isn't a grammatical sentence. For another thing, you haven't provided any idea about what your data looks like. — John Coleman, Jul 16 '19 at 13:05

score 2 · Accepted Answer · answered Jul 16 '19 at 13:19

As you probably realised, unique and duplicated don’t quite do what you need, because they essentially cause the retention of all distinct values, and just collapse “multiple copies” of such values.

For your first question, you can group_by the column that you’re interested in, and then retain just those groups (via filter) which have more than one row:

mtcars %>%
    group_by(mpg) %>%
    filter(length(mpg) > 1) %>%
    ungroup()

This example selects all rows for which the mpg value is duplicated. This works because, when applied to groups, dplyr operations such as filter work on each group individually. This means that length(mpg) in the above code will return the length of the mpg column vector of each group, separately.
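To make the per-group behaviour easier to see, here is a minimal sketch on a made-up data frame. The names df, key, value and dups are hypothetical and only serve the illustration. It also uses dplyr's n(), which returns the number of rows in the current group and can stand in for length(mpg) in the code above:

```r
library(dplyr)

# Hypothetical toy data: "b" appears twice, "a" and "c" appear once each
df <- data.frame(key = c("a", "b", "b", "c"), value = 1:4)

dups <- df %>%
    group_by(key) %>%
    filter(n() > 1) %>%   # n() is evaluated once per group
    ungroup()

# Only the two rows with key "b" remain, in their original order
print(dups)
```

The condition is evaluated once per group, so a group is either kept as a whole or dropped as a whole.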

To invert the logic, it’s enough to invert the filtering condition:

mtcars %>%
    group_by(mpg) %>%
    filter(length(mpg) == 1) %>%
    ungroup()
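If you would rather stay closer to the duplicated-based code from the question, the same rows can probably be obtained without grouping. The sketch below is one possible alternative, not the only way to do it. It relies on the fromLast argument of base R's duplicated: scanning from both ends marks every member of a duplicated set, including the first occurrence. The variable names are made up for the illustration:

```r
library(dplyr)

# Rows whose mpg value occurs exactly once, via grouping (as above)
unique_grouped <- mtcars %>%
    group_by(mpg) %>%
    filter(n() == 1) %>%
    ungroup()

# Base R alternative: duplicated() alone skips the first occurrence,
# so combine a forward scan and a backward scan to flag all copies
is_dup <- duplicated(mtcars$mpg) | duplicated(mtcars$mpg, fromLast = TRUE)
unique_base <- mtcars[!is_dup, ]

# Both approaches should keep the same mpg values in the same order
identical(unique_grouped$mpg, unique_base$mpg)
```

One difference to be aware of: the base R version keeps the row names of mtcars, whereas the grouped version returns a tibble, which drops them.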

Yes, perfect. Both of your answers work for me. Thank you. — 1S1a4m9, Jul 17 '19 at 09:04
