How to identify whether a value exists elsewhere in the current column of values

Question

I want to check whether the current row's value in `name` also appears in another row of the same column.

I tried this which did not work as expected:

> tibble(name = c("tommy", "tommy", "bobby")) |> 
+     mutate(row = row_number()) |> 
+     mutate(match = name %in% name[row != row_number()])
#  A tibble: 3 × 3
#  name    row match
#  <chr> <int> <lgl>
#1 tommy     1 FALSE
#2 tommy     2 FALSE
#3 bobby     3 FALSE

The row with name == "bobby" should not have a match, since no other row in the column has the same value. Specifically, the result I was looking for is this:

> tibble(name = c("tommy", "tommy", "bobby")) |> 
+     mutate(row = row_number()) %>%
+     mutate(match = c(TRUE, TRUE, FALSE))
#  A tibble: 3 × 3
#  name    row match
#  <chr> <int> <lgl>
#1 tommy     1 TRUE 
#2 tommy     2 TRUE 
#3 bobby     3 FALSE

So you're essentially looking if the value in `name` is duplicated, right? Like `dat %>% mutate(match = duplicated(name) | duplicated(name, fromLast=TRUE))` ? Possibly - https://stackoverflow.com/questions/7854433/finding-all-duplicate-rows-including-elements-with-smaller-subscripts as a duplicate *question* — thelatemail, Aug 01 '23 at 23:18

score 4 · Answer 1 · answered Aug 01 '23 at 23:04

tibble(name = c("tommy", "tommy", "bobby")) |> 
     mutate(match = n() > 1, .by = name)
# # A tibble: 3 × 2
#   name  match
#   <chr> <lgl>
# 1 tommy TRUE 
# 2 tommy TRUE 
# 3 bobby FALSE
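
As a rough sketch of why the attempt in the question returns all FALSE: inside `mutate()`, `row` and `row_number()` are the same vector, so `row != row_number()` is FALSE for every row and the subset of `name` is empty. The base R snippet below reproduces that and shows two ways to get the intended result. The variable names (`others`, `match_attempt`, `match_by_row`, `match_by_count`) are made up for illustration, and `seq_along(name)` stands in for `row_number()`.

```r
name <- c("tommy", "tommy", "bobby")
row  <- seq_along(name)                 # stands in for row_number()

# The comparison is vectorised: every element of `row` equals its own
# position, so nothing is selected and `others` is character(0).
others <- name[row != seq_along(name)]
match_attempt <- name %in% others       # FALSE FALSE FALSE

# What was intended: for each row i, look only at the *other* rows.
match_by_row <- vapply(
  seq_along(name),
  function(i) name[i] %in% name[-i],
  logical(1)
)                                       # TRUE TRUE FALSE

# Same idea as `n() > 1, .by = name`: count rows per value of `name`.
match_by_count <- ave(seq_along(name), name, FUN = length) > 1
                                        # TRUE TRUE FALSE
```

The per-row version is closest to what the question describes, but the grouped count does the same job without a loop.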

Gregor Thomas


That works perfectly. Thanks for that! – sparklink Aug 01 '23 at 23:51
