We can use regular expressions (regex) to check whether the values in a column conform to your specification. The pattern I use below, ^I-[0-9]{9}$, can be explained as follows:
- ^I- checks that the value starts with "I-"; the circumflex (^) character anchors the match to the beginning of the string.
- After the prefix we expect a digit, [0-9], and we repeat that check exactly 9 times with {9}.
- To make sure that nothing else follows the 9 digits, we add the end-of-string anchor $.
df1 <- data.frame(column1 = c("I-123456789", "P-888888888", "Q"))
# Tidyverse
library(tidyverse)
df1 |>
  mutate(check = str_detect(column1, "^I-[0-9]{9}$"))
#>       column1 check
#> 1 I-123456789  TRUE
#> 2 P-888888888 FALSE
#> 3           Q FALSE
# Base R
df1$check <- grepl("^I-[0-9]{9}$", df1$column1)
df1
#>       column1 check
#> 1 I-123456789  TRUE
#> 2 P-888888888 FALSE
#> 3           Q FALSE
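To see why both anchors matter, here is a small sketch in base R. The values in ids are hypothetical edge cases I made up for illustration (they are not from the data above); the point is only that dropping either anchor loosens the check.

```r
# Hypothetical edge cases (assumed for illustration, not from the original data)
ids <- c("I-123456789",   # valid
         "XI-123456789",  # extra character before the prefix
         "I-1234567890",  # 10 digits instead of 9
         "I-12345678")    # only 8 digits

anchored <- grepl("^I-[0-9]{9}$", ids)  # both anchors: whole value must match
no_start <- grepl("I-[0-9]{9}$", ids)   # no ^: anything may precede "I-"
no_end   <- grepl("^I-[0-9]{9}", ids)   # no $: anything may follow the 9 digits

anchored
#> [1]  TRUE FALSE FALSE FALSE
no_start
#> [1]  TRUE  TRUE FALSE FALSE
no_end
#> [1]  TRUE FALSE  TRUE FALSE
```

Only the fully anchored pattern rejects both the value with a leading "X" and the value with a tenth digit.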