I want to check whether the current row's value in name also appears in another row of the same column.

I tried this, which did not work as expected:

> tibble(name = c("tommy", "tommy", "bobby")) |> 
+     mutate(row = row_number()) |> 
+     mutate(match = name %in% name[row != row_number()])
#  A tibble: 3 × 3
#  name    row match
#  <chr> <int> <lgl>
#1 tommy     1 FALSE
#2 tommy     2 FALSE
#3 bobby     3 FALSE

The row with name == "bobby" should not have a match, since no other row in the column holds the same value. Specifically, this is the result I was looking for:

> tibble(name = c("tommy", "tommy", "bobby")) |> 
+     mutate(row = row_number()) |> 
+     mutate(match = c(TRUE, TRUE, FALSE))
#  A tibble: 3 × 3
#  name    row match
#  <chr> <int> <lgl>
#1 tommy     1 TRUE 
#2 tommy     2 TRUE 
#3 bobby     3 FALSE
sparklink
  • So you're essentially looking if the value in `name` is duplicated, right? Like `dat %>% mutate(match = duplicated(name) | duplicated(name, fromLast=TRUE))` ? Possibly - https://stackoverflow.com/questions/7854433/finding-all-duplicate-rows-including-elements-with-smaller-subscripts as a duplicate *question* – thelatemail Aug 01 '23 at 23:18

1 Answer

tibble(name = c("tommy", "tommy", "bobby")) |> 
     mutate(match = n() > 1, .by = name)
# # A tibble: 3 × 2
#   name  match
#   <chr> <lgl>
# 1 tommy TRUE 
# 2 tommy TRUE 
# 3 bobby FALSE
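
A note on why the original attempt returns FALSE everywhere, plus two alternatives. Because row was itself created with row_number(), the test row != row_number() is FALSE on every row, so name[...] is empty and %in% has nothing to match against. The sketch below checks that, then shows a group_by() version (the .by argument needs dplyr 1.1.0 or later) and the duplicated() approach from the comment. The names dat, attempt, res_grouped and res_dup are only illustrative.

```r
library(dplyr)

# Illustrative data, same as in the question
dat <- tibble(name = c("tommy", "tommy", "bobby"))

# `row` equals row_number(), so this comparison is never TRUE
attempt <- dat |>
  mutate(row = row_number(),
         keep = row != row_number())
any(attempt$keep)

# Grouped version for dplyr releases without the .by argument
res_grouped <- dat |>
  group_by(name) |>
  mutate(match = n() > 1) |>
  ungroup()

# Ungrouped alternative: flag a value if it is duplicated in either direction
res_dup <- dat |>
  mutate(match = duplicated(name) | duplicated(name, fromLast = TRUE))
```

Both alternatives should give the same match column as the .by version above; the grouped form is probably easier to extend if the check later needs to depend on more than one column.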
Gregor Thomas