21

I have a list of strings in R which looks like:

WDN.TO
WDR.N
WDS.AX
WEC.AX
WEC.N
WED.TO

I want to extract the suffix of each string, starting from the "." character. The result should look like:

.TO
.N
.AX
.AX
.N
.TO

Anyone have any ideas?

Jim G.
user802231

3 Answers

22

Joshua's solution works fine. I'd use sub instead of gsub though: gsub replaces every occurrence of a pattern in a string, while sub replaces only the first one. The pattern can be simplified a bit too:

> x <- c("WDN.TO","WDR.N","WDS.AX","WEC.AX","WEC.N","WED.TO")
> sub("^[^.]*", "", x)
[1] ".TO" ".N"  ".AX" ".AX" ".N"  ".TO"

...But if the strings are as regular as in the question, then simply stripping the first 3 characters should be enough:

> x <- c("WDN.TO","WDR.N","WDS.AX","WEC.AX","WEC.N","WED.TO")
> substring(x, 4)
[1] ".TO" ".N"  ".AX" ".AX" ".N"  ".TO"
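If the prefix is not always three characters long, the start position can be looked up instead of hard-coded. A minimal sketch, assuming every string contains at least one "."; the ticker "AB.TO" is made up for illustration:

```r
# "AB.TO" is a hypothetical ticker with a two-character prefix
x <- c("WDN.TO", "WDR.N", "AB.TO")

# position of the first literal "." in each string
dot_pos <- as.integer(regexpr(".", x, fixed = TRUE))

# keep everything from that position to the end of the string
suffix <- substring(x, dot_pos)
suffix
```

Here fixed = TRUE makes regexpr treat "." as a plain character rather than the regex wildcard.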
Tommy
  • Could you possibly explain real quick how that pattern gets detected? I can't read `sub("^[^.]*", "", x)`. That's a placeholder, then a filter with a placeholder, but then what does the star do, and why the empty `""` before the `x`? I can adapt the code so it works, but I don't understand how it works... – Jakob Jun 01 '16 at 09:11
  • It's a regular expression pattern. The first `^` matches the beginning of the string, but the next one, in square brackets, negates, so `[^.]` matches any character EXCEPT ".". Finally, the star means "match that any number of times", so the pattern matches everything from the start up until (but not including) the first dot. The second argument then replaces that match with an empty string. – Tommy Jul 06 '16 at 22:58
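One way to see what the pattern captures is to pull out the matched part before deleting it. This is only an illustrative sketch; the variable names are made up, and regmatches/regexpr are base R functions:

```r
x <- c("WDN.TO", "WDR.N")
pattern <- "^[^.]*"   # start of string, then any run of non-"." characters

# the piece of each string that the pattern matches
matched <- regmatches(x, regexpr(pattern, x))
matched

# sub() replaces exactly that piece with "", leaving the rest
result <- sub(pattern, "", x)
result
```

The matched pieces are the prefixes before the first dot, so what remains after the substitution is the suffix including the dot.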
14

Using gsub:

x <- c("WDN.TO","WDS.N")
# replace everything from the start of the string to the "." with "."
gsub("^.*\\.",".",x)
# [1] ".TO" ".N" 
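Note that `.*` is greedy, so this pattern keeps only what follows the last "." in a string, whereas a pattern such as "^[^.]*" keeps everything from the first ".". For the data in the question the two agree. A small sketch of the difference, using a made-up ticker with two dots:

```r
# "BRK.A.N" is a hypothetical input containing two dots
x <- c("WDN.TO", "BRK.A.N")

# greedy match: everything up to the LAST "." is replaced by "."
after_last_dot <- gsub("^.*\\.", ".", x)
after_last_dot

# negated class: only the part before the FIRST "." is removed
from_first_dot <- sub("^[^.]*", "", x)
from_first_dot
```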

Using strsplit:

# strsplit returns a list; use sapply to get the 2nd obs of each list element
y <- sapply(strsplit(x,"\\."), `[`, 2)
# since we split on ".", we need to put it back
paste(".",y,sep="")
# [1] ".TO" ".N"
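One caveat with the strsplit approach: a string without any "." has no second piece, so the extraction yields NA, and paste would then turn it into the text ".NA". A possible guard, sketched with a made-up input "NODOT" and assuming an empty string is the desired result for such cases:

```r
# "NODOT" is a hypothetical input that contains no "."
x <- c("WDN.TO", "NODOT")

# split on a literal "." and take the 2nd piece (NA when there is none)
parts <- strsplit(x, ".", fixed = TRUE)
y <- vapply(parts, function(p) p[2], character(1))

# only prepend "." where a second piece actually exists
suffix <- ifelse(is.na(y), "", paste0(".", y))
suffix
```

vapply is used here in place of sapply only because it checks that each result is a single character value.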
Joshua Ulrich
0

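strsplit can also do it, but two details matter: the "." has to be matched literally (with fixed = TRUE or "\\."), and the result is a list, so indexing it like a matrix with [,2] fails with an "incorrect number of dimensions" error. Take the second piece of each list element instead:

x <- c("WDN.TO","WDR.N","WDS.AX","WEC.AX","WEC.N","WED.TO")
# split on a literal "." and extract the 2nd piece of each element
y <- sapply(strsplit(x, ".", fixed = TRUE), `[`, 2)
y
# [1] "TO" "N"  "AX" "AX" "N"  "TO"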
Aayush Agrawal