Remove partially identical rows in dplyr

Question

A simple problem I can't seem to figure out!

> ID = c(1, 1, 2)
> var = c("A", NA, NA)
> d <- data.frame(ID, var)
> d
  ID  var
1  1    A
2  1 <NA>
3  2 <NA>

What I want to do is remove the 2nd row using dplyr, because another row with the same ID has a value for var. If an ID appears only once, I don't want to remove its row, even when var is NA.

One of these? https://stackoverflow.com/questions/13279582/select-the-first-row-by-group — william3031, Jun 07 '19 at 06:36
Possible duplicate of [Select the first row by group](https://stackoverflow.com/questions/13279582/select-the-first-row-by-group) — william3031, Jun 07 '19 at 06:37

score 0 · Answer 1 · answered Jun 07 '19 at 08:58

How about grouping by ID, sorting so that NA values come last, and keeping the first row of each group? That way an NA row is never selected if there is a value in that group.

library(dplyr)

ID = c(1, 1, 2)
var = c("A", NA, NA)
d <- data.frame(ID, var)

d %>% 
  arrange(var) %>%    # arrange() always sorts NA to the end
  group_by(ID) %>% 
  slice(1) %>%        # keep the first row of each ID
  ungroup()

Results in:

# A tibble: 2 x 2
     ID var  
  <dbl> <chr>
1     1 A    
2     2 NA
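
Note that slice(1) keeps exactly one row per ID, so an ID with several non-NA rows would lose all but one of them. If that is not what you want, here is a minimal sketch of an alternative, assuming the rule is "drop an NA row only when the same ID also has a non-NA var". It uses filter() on the grouped data; the name `result` is just for illustration.

```r
library(dplyr)

ID  <- c(1, 1, 2)
var <- c("A", NA, NA)
d   <- data.frame(ID, var, stringsAsFactors = FALSE)

result <- d %>%
  group_by(ID) %>%
  # keep rows that have a value, or every row when the whole group is NA
  filter(!is.na(var) | all(is.na(var))) %>%
  ungroup()

result
```

On the example data this keeps the "A" row for ID 1 and the single NA row for ID 2. Unlike the slice(1) approach, an ID whose rows are all NA keeps every one of them, so add distinct() afterwards if those should be collapsed as well.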
