2

I have been trying to extract the portion of a string that follows the first ^ sign. For example, given the string abc^28092015^def^1234, I need to extract 28092015, which sits between the first two ^ signs.

So I need to extract the 8 characters that follow the first ^ sign. My plan was to find the position of the first ^ sign and then pass it as an argument to the substr function.

I tried to use this:

x <- "abc^28092015^def^1234"
rev(gregexpr("\\^", x)[[1]])[1]

I based this on the answer discussed here.

But it continues to return the last position. Can anyone please help me out?

Nirvik Banerjee
  • 1
    You don't need regex (even though it can be done with regex too). Just [split](https://stat.ethz.ch/R-manual/R-devel/library/base/html/strsplit.html) the string by `^` and get the second element. – ndnenkov Sep 28 '15 at 12:41

5 Answers

4

I would use sub.

x <- "abc^28092015^def^1234"
sub("^.*?\\^(.*?)\\^.*", "\\1", x)
# [1] "28092015"

Since ^ is a special character in regex (it anchors the match to the start of the string), you need to escape it in order to match literal ^ symbols.
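The same escaping applies if you would rather stay with the position-plus-substr approach from the question. A minimal sketch, assuming the wanted field is always 8 characters long as stated in the question (the names pos and all_pos are only illustrative):

```r
x <- "abc^28092015^def^1234"

# An unescaped "^" is an anchor: it matches the empty string at position 1.
regexpr("^", x)[1]
# [1] 1

# Escaped (or with fixed = TRUE) it matches the first literal caret.
pos <- regexpr("\\^", x)[1]
pos
# [1] 4

# gregexpr() returns every caret position; rev() reverses that vector,
# which is why rev(...)[1] in the question gives the last position.
all_pos <- gregexpr("\\^", x)[[1]]
rev(all_pos)[1]
# [1] 17

# Take the 8 characters that follow the first caret.
substr(x, pos + 1, pos + 8)
# [1] "28092015"
```

Dropping rev(), that is gregexpr("\\^", x)[[1]][1], would give the first position directly.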

or

Split on ^ and take the second element.

strsplit(x, "^", fixed = TRUE)[[1]][2]
# [1] "28092015"

or

You may also use gsub.

gsub("^.*?\\^|\\^.*", "", x, perl = TRUE)
# [1] "28092015"
Avinash Raj
3

Here's one option with base R:

x <- "abc^28092015^def^1234"
m <- regexpr("(?<=\\^)(.+?)(?=\\^)", x, perl = TRUE)
regmatches(x, m)
#[1] "28092015"
nrussell
2

Another option is stri_extract_first_regex from library(stringi):

library(stringi)
stri_extract_first_regex(str1, '(?<=\\^)\\d+(?=\\^)')
#[1] "28092015"

If any characters (not only digits) can appear between the two ^ signs:

stri_extract(str1, regex='(?<=\\^)[^^]+')
#[1] "28092015"

data

str1 <- 'abc^28092015^def^1234'
akrun
1
x <- 'abc^28092015^def^1234'
library(qdapRegex)
unlist(rm_between(x, '^', '^', extract=TRUE))[1]
# [1] "28092015"
Ronak Shah
1

It would be better if you split it using ^. But if you still want the pattern, you can try this.

^\S+\^(\d+)(?=\^)

Then match group 1.

OUTPUT

28092015
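In R, one way to apply this pattern and pull out group 1 might look like the sketch below. It assumes an R version whose regexec() accepts perl = TRUE (needed for the lookahead); the names pattern, parts and group1 are only illustrative, and the backslashes are doubled because the pattern is written as an R string.

```r
x <- "abc^28092015^def^1234"

# Same pattern as above, with backslashes doubled for an R string literal.
pattern <- "^\\S+\\^(\\d+)(?=\\^)"

# regexec() records the full match and each capture group;
# regmatches() then returns them as c(full_match, group1, ...).
m <- regexec(pattern, x, perl = TRUE)
parts <- regmatches(x, m)[[1]]

group1 <- parts[2]
group1
# [1] "28092015"
```

Note that \S+ is greedy, so on an input with several digit fields each followed by ^, this pattern would pick the last such field rather than the first.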

See DEMO

james jelo4kul