51

In R, is there a better or simpler way to find the location of the last dot in a string than the following?

x <- "hello.world.123.456"
g <- gregexpr(".", x, fixed=TRUE)
loc <- g[[1]]
loc[length(loc)]  # returns 16

This finds all the dots in the string and then returns the last one, but it seems rather clumsy. I tried using regular expressions, but didn't get very far.

Roman Luštrik
Hong Ooi

4 Answers

80

Does this work for you?

x <- "hello.world.123.456"
g <- regexpr("\\.[^\\.]*$", x)
g
  • \. matches a dot
  • [^\.] matches any character except a dot
  • * specifies that the previous expression (any character except a dot) may occur zero or more times
  • $ marks the end of the string.

Taking everything together: find a dot that is followed only by non-dot characters until the string ends. In an R string literal the backslash must itself be escaped, hence \\ in the expression above. See regex101.com to experiment with regex.
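As a minimal sketch of how this pattern behaves on more than one string, the example below applies it to a small vector. The input values and the names last_dot and suffix are illustrative assumptions, not part of the original answer. It also shows the -1 that regexpr reports when a string contains no dot.

```r
# Illustrative input (not from the original post): one string has no dot at all.
x <- c("hello.world.123.456", "no_dots_here", "a.b")

# regexpr() returns the start of the first match. Because the pattern is
# anchored at the end of the string, that match begins at the last dot.
# as.integer() drops the match.length and other attributes.
last_dot <- as.integer(regexpr("\\.[^\\.]*$", x))

# regexpr() reports -1 when nothing matches, so guard before using the position.
suffix <- ifelse(last_dot > 0, substring(x, last_dot + 1), NA_character_)

print(last_dot)
print(suffix)
```

Checking for a positive position before calling substring avoids silently treating the whole string as the part after the dot.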

Flo
Vincent
  • A '.' matches any character; to match a literal '.' you need to escape it with a '\', and unfortunately you need to escape this '\' with another '\'. So finally your expression looks like '\\.' – CousinCocaine Apr 16 '14 at 09:28
  • @Vincent, is there a document or a package where all the symbols and text patterns are explained in detail? – R18 Apr 18 '23 at 10:10
31

How about a minor syntax improvement?

This will work for your literal example where the input vector is of length 1. Use escapes to get a literal "." search, and reverse the result to get the last index as the "first":

 rev(gregexpr("\\.", x)[[1]])[1]

A properly vectorized version (in case x has more than one element):

 sapply(gregexpr("\\.", x), function(loc) rev(loc)[1])

A tidier option is to use tail instead:

sapply(gregexpr("\\.", x), tail, 1)
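To illustrate the vectorized form, here is a small sketch on a made-up input vector (the values and the names last_regex and last_fixed are assumptions for illustration only). It compares the escaped-regex call with the fixed = TRUE call from the question, and includes a string with no dot, for which gregexpr reports -1.

```r
# Illustrative input (not from the original post).
x <- c("hello.world.123.456", "a.b", "no_dots_here")

# Regex version: the dot is escaped so that it is matched literally.
last_regex <- sapply(gregexpr("\\.", x), tail, 1)

# Fixed-string version, as in the question: no escaping needed.
last_fixed <- sapply(gregexpr(".", x, fixed = TRUE), tail, 1)

print(last_regex)
print(all(last_regex == last_fixed))
```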
mdsumner
7

Someone posted the following answer which I really liked, but I notice that they've deleted it:

regexpr("\\.[^\\.]*$", x)

I like it because it directly produces the desired location, without having to search through the results. The regexp is also fairly clean, which is a bit of an exception where regexps are concerned :)

Hong Ooi
  • Yeah, that was me. I thought the previous solution worked so I deleted it. Maybe I shouldn't have :) – Vincent Mar 07 '11 at 02:11
2

The stringi package has a slick stri_locate_last function that accepts both literal strings and regular expressions.

To just find a dot, no regex is required, and it is as easy as

stringi::stri_locate_last_fixed(x, ".")[,1]

If you need to use this function with a regex, to find the location of the last regex match in the string, you should replace _fixed with _regex:

stringi::stri_locate_last_regex(x, "\\.")[,1]

Note that . is a special regex metacharacter and must be escaped when used in a regex to match a literal dot character.

A complete example:

x <- "hello.world.123.456"
stringi::stri_locate_last_fixed(x, ".")[,1]
stringi::stri_locate_last_regex(x, "\\.")[,1]
Wiktor Stribiżew