0

I want to find IDs in a file with a logical test. An ID contains only digits, letters, and dashes, and it must contain at least one digit to be considered. I could combine two grepl calls with a Boolean AND, but I want to do this with a single regex. I think (*SKIP)(*FAIL) could work, but I don't know how. In the following, I want elements 1, 2, 5, and 6 to be considered IDs.

g <- c(
    "868776767-ddd-dFFF-999999",
    "8888888",
    "bbbbbbfdfdgtfref-dsfcsdbcgwecbgfecshdcs-cdhscgbfsd",
    "bigbird",
    "2",
    "3-4",
    "swe%h"
)

## This works (I want this result with one regex)
grepl("^[A-Za-z0-9-]+$", g) & grepl("[0-9]", g)

## I suspect using this could work with a single regex call.
grepl("(*SKIP)(*FAIL)", g, perl = TRUE)
Tyler Rinker
  • Check this article http://stackoverflow.com/questions/469913/regular-expressions-is-there-an-and-operator – Mariano Sep 16 '15 at 17:32

2 Answers

2

No need for anything complicated:

grepl("^[a-fA-F-]*[0-9][[:xdigit:]-]*$", g)

or

grepl("^[a-fA-F-]*+[[:xdigit:]-]+$", g, perl = TRUE)

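where [:xdigit:] is the POSIX character class equivalent to [a-fA-F0-9]. The second version uses a possessive quantifier: [a-fA-F-]*+ consumes every leading letter and hyphen and never gives any back, so the first character matched by [[:xdigit:]-]+ can only be a digit. Note that both patterns accept only the hex letters a-f. If IDs may contain any letter, use [A-Za-z-] in place of [a-fA-F-] and [A-Za-z0-9-] in place of [[:xdigit:]-].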
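
As a quick check, here is a minimal sketch that runs both patterns on the sample vector from the question and then contrasts the possessive `*+` with a plain greedy `*`. The variable names (`explicit`, `possessive`, `greedy`, `letters_only`) are only illustrative.

```r
g <- c(
    "868776767-ddd-dFFF-999999", "8888888",
    "bbbbbbfdfdgtfref-dsfcsdbcgwecbgfecshdcs-cdhscgbfsd",
    "bigbird", "2", "3-4", "swe%h"
)

# Version 1: the required digit is spelled out in the pattern.
explicit <- grepl("^[a-fA-F-]*[0-9][[:xdigit:]-]*$", g)

# Version 2: the possessive quantifier forces the next character to be a digit.
possessive <- grepl("^[a-fA-F-]*+[[:xdigit:]-]+$", g, perl = TRUE)

identical(explicit, possessive)

# Letter-only input: a greedy * can backtrack and hand a letter to the
# second class, so the digit requirement is lost; the possessive *+ cannot.
letters_only <- c("abc", "a-b")
greedy <- grepl("^[a-fA-F-]*[[:xdigit:]-]+$", letters_only, perl = TRUE)
strict <- grepl("^[a-fA-F-]*+[[:xdigit:]-]+$", letters_only, perl = TRUE)
```

Here `greedy` accepts both letter-only strings while `strict` rejects them, which is why the `+` after `*` matters.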

If you want to ensure that there are no leading, trailing, or consecutive hyphens:

grepl("^(?:[a-fA-F]+-)*[a-fA-F]*[0-9][[:xdigit:]]*(?:-[[:xdigit:]]+)*$", g, perl = TRUE)
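
To see what the hyphen rules mean in practice, here is a minimal sketch using a lookahead-based variant of the same idea. This is an alternative formulation, not the pattern above: `is_strict_id` is a hypothetical helper name, and the test strings are made up for illustration.

```r
# Hypothetical helper: the lookahead requires at least one digit, and the
# rest of the pattern allows only hex-digit runs separated by single hyphens.
is_strict_id <- function(x) {
    grepl("^(?=[^0-9]*[0-9])[[:xdigit:]]+(?:-[[:xdigit:]]+)*$", x, perl = TRUE)
}

# Valid IDs first, then a leading, a trailing, and a doubled hyphen,
# and finally a string with no digit at all.
cases  <- c("3-4", "abc-1", "-3", "3-", "3--4", "abc")
strict <- is_strict_id(cases)
data.frame(cases, strict)
```

Only the first two cases pass. Note that a letter-only segment followed by a hyphen and a digit ("abc-1") should be accepted by any pattern of this kind.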
Casimir et Hippolyte
0

You can use the following:

^(?=.*\d)[a-zA-Z0-9-]*$

Explanation:

  • ^ : start of the string
  • (?=.*\d) : look ahead for at least one digit anywhere in the string
  • [a-zA-Z0-9-]* : match zero or more letters, digits, or hyphens (the lookahead already guarantees at least one character)
  • $ : end of the string

Output:

grepl("^(?=.*\\d)[a-zA-Z0-9-]*$", g, perl=TRUE)
## [1]  TRUE  TRUE FALSE FALSE  TRUE  TRUE FALSE
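
A small sketch that wraps the pattern in a helper and confirms it agrees with an anchored two-`grepl()` check, on the sample data plus a few edge cases. The names `is_id` and `is_id_two_step` and the extra test strings are assumptions added for illustration.

```r
# Single-regex version: lookahead for a digit, then only allowed characters.
is_id <- function(x) {
    grepl("^(?=.*\\d)[a-zA-Z0-9-]*$", x, perl = TRUE)
}

# Reference version: two separate tests combined with a Boolean AND.
is_id_two_step <- function(x) {
    grepl("^[A-Za-z0-9-]+$", x) & grepl("[0-9]", x)
}

g <- c(
    "868776767-ddd-dFFF-999999", "8888888",
    "bbbbbbfdfdgtfref-dsfcsdbcgwecbgfecshdcs-cdhscgbfsd",
    "bigbird", "2", "3-4", "swe%h"
)
edge <- c("", "3%4", "-", "a1")

one_step <- is_id(c(g, edge))
two_step <- is_id_two_step(c(g, edge))
identical(one_step, two_step)
```

The anchors matter here: without `^` and `$`, a string such as "3%4" would pass the two-step check even though it contains a disallowed character.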
Tyler Rinker
karthik manchala