Return the first occurrence of a character in a string

Question

I have been trying to extract the portion of a string that follows the first occurrence of a ^ sign. For example, the string looks like abc^28092015^def^1234, and I need to extract 28092015, which sits between the first two ^ signs.

So, I need to extract the 8 characters that follow the first ^ sign. I have been trying to find the position of the first ^ sign and then use it as an argument to the substr function.

I tried to use this:

x <- "abc^28092015^def^1234"
rev(gregexpr("\\^", x)[[1]])[1]

I based this on the answer discussed here.

But it keeps returning the last position. Can anyone please help me out?

You don't need regex (even though it can be done with regex too). Just [split](https://stat.ethz.ch/R-manual/R-devel/library/base/html/strsplit.html) the string by `^` and get the second element. — ndnenkov, Sep 28 '15 at 12:41

Avinash Raj · Answer 1 · 2015-09-28T13:04:01.830

4

I would use sub.

x <- "abc^28092015^def^1234"
sub("^.*?\\^(.*?)\\^.*", "\\1", x)
# [1] "28092015"

Since ^ is a special character in regex, you need to escape it in order to match a literal ^ symbol.
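As a small sketch of the escaping point, applied to the position-based approach the question describes: regexpr reports only the first match, so no rev() is needed. The names pos and pos_fixed are just for illustration, and fixed = TRUE is shown as an alternative to escaping.

```r
x <- "abc^28092015^def^1234"

# Position of the first literal ^, using an escaped regex
pos <- as.integer(regexpr("\\^", x))

# Same search without regex semantics, so no escaping is required
pos_fixed <- as.integer(regexpr("^", x, fixed = TRUE))

# Take the 8 characters that follow the first ^
result <- substr(x, pos + 1, pos + 8)
result
```

The original attempt used gregexpr, which returns every match position; reversing that vector and taking the first element therefore yields the last ^ rather than the first.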

or

Split on ^ and take the second element.

strsplit(x, "^", fixed = TRUE)[[1]][2]
# [1] "28092015"

or

You may also use gsub.

gsub("^.*?\\^|\\^.*", "", x, perl = TRUE)
# [1] "28092015"

edited Sep 28 '15 at 13:04

answered Sep 28 '15 at 12:45


score 3 · Answer 2 · answered Sep 28 '15 at 12:44


Here's one option with base R:

x <- "abc^28092015^def^1234"
m <- regexpr("(?<=\\^)(.+?)(?=\\^)", x, perl = TRUE)
regmatches(x, m)
#[1] "28092015"

answered Sep 28 '15 at 12:44

nrussell


akrun · Answer 3 · 2015-09-28T13:07:08.667

2

Another option is stri_extract_first_regex from the stringi package.

library(stringi)
stri_extract_first_regex(str1, '(?<=\\^)\\d+(?=\\^)')
#[1] "28092015"

If any characters (not only digits) may appear between the two ^ signs:

stri_extract(str1, regex='(?<=\\^)[^^]+')
#[1] "28092015"

data

str1 <- 'abc^28092015^def^1234'

edited Sep 28 '15 at 13:07

answered Sep 28 '15 at 12:48


score 1 · Answer 4 · answered Sep 28 '15 at 12:42


x <- 'abc^28092015^def^1234'
library(qdapRegex)
unlist(rm_between(x, '^', '^', extract=TRUE))[1]
# [1] "28092015"

answered Sep 28 '15 at 12:42

Ronak Shah


james jelo4kul · Answer 5 · 2015-09-28T13:11:48.080

1

It would be better to split the string on ^. But if you still want a regex pattern, you can try this:

^\S+\^(\d+)(?=\^)

Then match group 1.

OUTPUT

28092015
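One way to apply this pattern in R and pull out group 1 is regexec followed by regmatches. This is only a sketch: it assumes a version of R whose regexec accepts perl = TRUE (needed here for the lookahead), and the variable names are illustrative.

```r
x <- "abc^28092015^def^1234"

# Backslashes are doubled inside an R string literal
m <- regexec("^\\S+\\^(\\d+)(?=\\^)", x, perl = TRUE)

# regmatches returns the full match first, then each capture group
parts <- regmatches(x, m)[[1]]
group1 <- parts[2]
group1
```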

See DEMO

edited Sep 28 '15 at 13:11

answered Sep 28 '15 at 13:02

