
I tried the following, but it fails when a string contains any other character, such as a space. In the example below, the string "subway 10" does contain digits, yet it is reported as FALSE because of the space.

My strings may contain any other characters, but if a string contains at least one digit, I would like to get the index of that string in the vector.

> mywords<- c("harry","met","sally","subway 10","1800Movies","12345")
> numbers <- grepl("^[[:digit:]]+$", mywords) 
> letters <- grepl("^[[:alpha:]]+$", mywords) 
> both <- grepl("^[[:digit:][:alpha:]]+$", mywords) 
> 
> mywords[xor((letters | numbers), both)] # letters & numbers mixed 
[1] "1800Movies"
zx8754
saltandwater
I might be missing something, but why don't you use `"[[:digit:]]+"`? – Roland Oct 28 '15 at 14:04

The problem is your use of the anchors `^` & `$`; e.g. `"^[[:digit:]]+$"` is checking if a string contains *only* numbers. – nrussell Oct 28 '15 at 14:04

1 Answer


Using `\\d` works for me:

grepl("\\d", mywords)
[1] FALSE FALSE FALSE  TRUE  TRUE  TRUE

So does `[[:digit:]]`:

grepl("[[:digit:]]", mywords)
[1] FALSE FALSE FALSE  TRUE  TRUE  TRUE

As @nrussell mentioned, your pattern tests whether a string contains only digits from its beginning (`^`) to its end (`$`).
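
Since the question asks for indices rather than a logical vector, here is a minimal sketch contrasting the anchored and unanchored patterns on the `mywords` vector from the question. The names `only_digits`, `has_digit`, and `idx` are introduced here purely for illustration.

```r
mywords <- c("harry", "met", "sally", "subway 10", "1800Movies", "12345")

# Anchored: TRUE only when the whole string consists of digits
only_digits <- grepl("^[[:digit:]]+$", mywords)

# Unanchored: TRUE when at least one digit appears anywhere in the string
has_digit <- grepl("[[:digit:]]", mywords)

# Indices of the strings that contain at least one digit;
# grep("[[:digit:]]", mywords) returns the same indices directly
idx <- which(has_digit)

mywords[idx]
```

With this input, only "12345" satisfies the anchored pattern, while `idx` picks out positions 4, 5, and 6.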

You could also check whether the strings contain anything other than letters, using `^` inside the brackets to negate the character class, but then "anything other" is not limited to digits; it also matches spaces and punctuation:

grepl("[^a-zA-Z]", mywords)
[1] FALSE FALSE FALSE  TRUE  TRUE  TRUE
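
The result above happens to match the digit test only because every non-letter string in `mywords` also contains a digit. A small sketch with a hypothetical vector (`extra` is not part of the original question) shows where the two patterns can disagree:

```r
# Hypothetical input: one string has a space but no digit
extra <- c("hello world", "abc", "a1")

# Negated letter class: TRUE for any non-letter character, including the space
non_letter <- grepl("[^a-zA-Z]", extra)

# Digit class: TRUE only when a digit is present
has_digit <- grepl("[[:digit:]]", extra)

# Strings flagged by the negated class that contain no digit at all
extra[non_letter & !has_digit]
```

Here "hello world" is flagged by the negated class even though it has no digit, so `[[:digit:]]` or `\\d` is probably the safer choice when the goal is specifically to find digits.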
Cath