I am trying to identify the values in a data frame column that contain the same digit repeated in succession. For instance:

> df
   ColA
1 66046
2 73947
3 67456
4 67217
5 66861
6 67658

I want to return 66046 and 66861, since 6 appears twice in succession in each. I have tried the following:

df %>% filter(str_detect(as.String(df[1]), "[66]"))  #with and without the squared brackets.
df[unlist(gregexpr("[6]{2}[[:digit:]]", df[1])), ][1]

Needless to say, neither of these works. Any help is appreciated.

Thanks

Sanjeev

3 Answers

We can specify the repetition count with a quantifier: `6{2,}` matches two or more consecutive 6s.

library(dplyr)
library(stringr)
df %>%
   filter(str_detect(ColA, "6{2,}"))

Output:

#   ColA
#1 66046
#5 66861

Data:

df <- structure(list(ColA = c(66046L, 73947L, 67456L, 67217L, 66861L, 
67658L)), class = "data.frame", row.names = c("1", "2", "3", 
"4", "5", "6"))
akrun
  • Thanks a ton.. That worked.. Is there a way I could retrieve values where any of the numbers repeat in succession.. Not just 6.. I tried [[:digits:]] in place of 6 but that returns everything. – Sanjeev Feb 11 '21 at 22:22
  • ColA 1 66006 2 67003 3 67226 4 73933 – Sanjeev Feb 11 '21 at 22:33

Use

library(dplyr)
library(stringr)
df %>%
   filter(str_detect(ColA, "(\\d)\\1"))

See proof

NODE    EXPLANATION
(       group and capture to \1:
  \d      digits (0-9)
)       end of \1
\1      what was matched by capture \1
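
To see why the backreference matters, here is a small sketch comparing it with the `[[:digit:]]{2,}` idea from the comment above. The vector `x` is an assumed sample that mixes values from the question and the comment, and base R `grepl` is used so that no packages are required.

```r
# Illustrative values only (assumed sample, not the asker's full data)
x <- c(66046L, 73947L, 67003L, 67217L, 73933L, 67658L)

# Any two consecutive digits, equal or not: every multi-digit value matches
any_two <- grepl("[[:digit:]]{2,}", x)

# A digit immediately followed by the same digit: only true repeats match
same_twice <- grepl("(\\d)\\1", x)

x[any_two]     # all six values
x[same_twice]  # 66046 67003 73933
```

The quantifier only counts how many digits occur in a row, whereas `\\1` requires the second digit to equal the one captured by the group.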
Ryszard Czech

Late to the party but here's a solution in base R:

df[which(grepl("(\\d)\\1", df$ColA)),]
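
If it is also useful to know which digit run triggered the match, one possible extension in base R is sketched below. The names `x`, `m`, `hits`, and `runs` are hypothetical, the sample values are assumed, and `regexpr` reports only the first run in each value.

```r
# Illustrative values only (assumed sample), converted to character up front
x <- as.character(c(66046L, 73947L, 67003L, 67217L, 73933L, 67658L))

# Locate the first run of a digit repeated at least twice; -1 means no match
m <- regexpr("(\\d)\\1+", x)

hits <- x[m != -1]        # values that contain a repeated digit
runs <- regmatches(x, m)  # the repeated run found in each of those values

data.frame(value = hits, run = runs)
```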
Chris Ruehlemann