R - Extracting duplicates to a dataframe

Question

I need help with R. Similar to the question filtering-a-dataframe-showing-only-duplicates, I wish to extract duplicates from a data frame with over 2,000 entries.

The first 15 rows of the data look like this:

run	id	Diff
1	20	0
1	4	1024
1	4	1
1	4	1
1	4	65
1	4	1
1	4	1
1	11	475
1	11	1
1	11	1
2	25	0
2	18	0
2	18	1
2	18	1
2	18	1

I wish to extract only the duplicates, i.e.

run	id	Diff
1	4	1024
1	4	1
1	4	1
1	4	65
1	4	1
1	4	1
1	11	475
1	11	1
1	11	1
2	18	0
2	18	1
2	18	1
2	18	1

Using the command

    mydata_extract %>% group_by(id) %>% filter(n() > 1)

does not extract the duplicates; in fact, I get the complete data set returned. Is there something about "filter(n() > 1)" that I need to change? I'm a beginner with R. Sorry my data table is not formatting correctly; it looks okay in preview!

I will also want to group my data first by "run".

Is this simply `mydata_extract[duplicated(mydata_extract[1:2]), ]`? — Rui Barradas, May 02 '22 at 11:22
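A note on the base R suggestion in the comment above: `duplicated()` marks only the second and later occurrences of each key, so on its own it would drop the first row of every duplicated group. The sketch below rebuilds the 15 rows from the question as a data frame (the names `keys`, `later_only` and `all_members` are made up for illustration) and combines a forward and a backward pass with `fromLast = TRUE` to keep every member of a duplicated (run, id) group.

```r
# The 15 rows shown in the question, rebuilt as a plain data frame
mydata_extract <- data.frame(
  run  = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  id   = c(20, 4, 4, 4, 4, 4, 4, 11, 11, 11, 25, 18, 18, 18, 18),
  Diff = c(0, 1024, 1, 1, 65, 1, 1, 475, 1, 1, 0, 0, 1, 1, 1)
)

# Columns that define a "duplicate" (illustrative name)
keys <- mydata_extract[c("run", "id")]

# duplicated() alone flags only the repeats, not the first occurrence
later_only <- mydata_extract[duplicated(keys), ]

# Combining a forward and a backward pass keeps every row of a repeated key
all_members <- mydata_extract[
  duplicated(keys) | duplicated(keys, fromLast = TRUE),
]

nrow(later_only)   # 10
nrow(all_members)  # 13
```

On these rows the second form returns the 13 rows listed as the desired output in the question.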

score 1 · Accepted Answer · answered May 02 '22 at 11:20

Maybe put both run and id in group_by()?

    library(dplyr)

    # Reproducible example data
    df <- tibble::tribble(
      ~run, ~id, ~Diff,
      1, 20, 0,
      1, 4, 1024,
      1, 4, 1,
      1, 4, 1,
      1, 4, 65,
      1, 4, 1,
      1, 4, 1,
      1, 11, 4,
      1, 11, 1,
      1, 11, 1,
      2, 25, 0,
      2, 18, 0,
      2, 18, 1,
      2, 18, 1,
      2, 18, 1
    )

    # Keep only rows whose (run, id) combination occurs more than once
    df %>%
      group_by(run, id) %>%
      filter(n() > 1)



# A tibble: 13 x 3
# Groups:   run, id [3]
     run    id  Diff
   <dbl> <dbl> <dbl>
 1     1     4  1024
 2     1     4     1
 3     1     4     1
 4     1     4    65
 5     1     4     1
 6     1     4     1
 7     1    11     4
 8     1    11     1
 9     1    11     1
10     2    18     0
11     2    18     1
12     2    18     1
13     2    18     1

You can add a mutate() to see how n() works (it counts the number of rows in each group), e.g.

    df %>%
      group_by(run, id) %>%
      mutate(n = n())
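As for why grouping by id alone returned the complete data set in the question: one plausible explanation, which is an assumption because the full 2,000 rows are not shown, is that ids recur across runs, so every id ends up with more than one row. A minimal sketch with hypothetical data (`toy`, `by_id` and `by_run_id` are made-up names) shows the difference:

```r
library(dplyr)

# Hypothetical data: id 20 occurs once in run 1 and once in run 2
toy <- tibble::tribble(
  ~run, ~id, ~Diff,
  1, 20, 0,
  1, 4, 1024,
  1, 4, 1,
  2, 20, 0,
  2, 18, 0,
  2, 18, 1
)

# Grouping by id only: every id has two rows overall, so nothing is dropped
by_id <- toy %>%
  group_by(id) %>%
  filter(n() > 1) %>%
  ungroup()

# Grouping by run and id: the two single (run, id = 20) rows are dropped
by_run_id <- toy %>%
  group_by(run, id) %>%
  filter(n() > 1) %>%
  ungroup()

nrow(by_id)      # 6
nrow(by_run_id)  # 4
```

If the real data behave like this, the filter(n() > 1) step itself is fine; only the grouping needs to change.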

Many thanks, Julian, works perfectly and very efficiently. — Afada, May 02 '22 at 15:55
