Extract word in front of a certain character with RegEx in RStudio

Question

I have this string, "sam buy expensive toys as 125898652". I would like to extract the word after "as", which is "125898652".

I'm using

(?<=as\s)+[^\s]+

I've tried it on https://regex101.com/r/NaWAl1/1 and it works well. When I run it in R, it returns this error:

Error: '\s' is an unrecognized escape in character string starting ""(?<='as'\s"

So I modified it to

(?<='CR'\s)+[^\s]+

It returns a different error:

Error in stri_extract_first_regex(string, pattern, opts_regex = opts(pattern)) : 
  Syntax error in regexp pattern. (U_REGEX_RULE_SYNTAX)

Can someone please explain why regex behaves differently in R and how to make it work? Thank you so much.

`stringi::stri_extract_first_regex("sam buy expensive toys as 125898652","(?<=as\\s)[^\\s]+")` works well for your case. Do not quantify lookarounds, they are zero-width assertions. And use double backslashes in string literals to define literal backslashes. — Wiktor Stribiżew, Dec 20 '19 at 10:58
I've used a double backslash for each of the backslashes there, but it still doesn't work – yuliansen, Dec 23 '19 at 01:32
[`(?<=as\s)+[^\s]+` works well](https://rextester.com/LJF82742) — Wiktor Stribiżew, Dec 23 '19 at 10:40
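A minimal sketch of the two points raised in the comments, using only base R: the lookbehind is left unquantified, and every regex backslash is doubled inside the string literal. The variable names are illustrative, and the raw-string form is an assumption that R 4.0.0 or later is available.

```r
# The input string from the question.
s <- "sam buy expensive toys as 125898652"

# In an ordinary R string literal, each regex backslash is written twice.
# The lookbehind carries no quantifier because it is a zero-width assertion.
pattern <- "(?<=as\\s)[^\\s]+"

# Base R needs the PCRE engine for lookbehind, hence perl = TRUE.
result <- regmatches(s, regexpr(pattern, s, perl = TRUE))
print(result)

# Assumption: R >= 4.0.0, where raw strings accept single backslashes.
raw_pattern <- r"((?<=as\s)[^\s]+)"
raw_result <- regmatches(s, regexpr(raw_pattern, s, perl = TRUE))

# Both literals describe the same sequence of characters.
print(identical(pattern, raw_pattern))
```

The single-backslash version from regex101 fails in R only because R parses the string literal before the regex engine ever sees it.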

Ronak Shah · Accepted Answer · 2019-12-23T01:49:11.193

1

Using sub

sub(".*as\\s(\\w+).*", "\\1", "sam buy expensive toys as 125898652")
#[1] "125898652"

Or with a lookbehind regex

stringr::str_extract("sam buy expensive toys as 125898652", "(?<=as\\s)\\w+")
#[1] "125898652"

For numbers that contain a comma and may have decimal places, we can do

x <- "sam buy expensive toys as 128984,45697.00"
sub(".*as\\s(\\d+\\.?\\d+).*", "\\1",gsub(',', '', x))
#[1] "12898445697.00"
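Note that `\\d+\\.?\\d+` needs at least two digits to match, so a single-digit number would be returned with the rest of the string unchanged. A hedged variant, assuming the number consists of digits and commas with an optional decimal part, captures the number first and strips the commas only from the captured text (the names `number_pattern`, `token`, and `number_text` are illustrative):

```r
x <- "sam buy expensive toys as 128984,45697.00"

# Digits and commas, then an optional decimal part, right after "as ".
number_pattern <- ".*as\\s([0-9,]+(\\.[0-9]+)?).*"
token <- sub(number_pattern, "\\1", x)

# Remove commas from the captured number only, not from the whole string.
number_text <- gsub(",", "", token, fixed = TRUE)
print(number_text)

# Convert to numeric if arithmetic is needed later.
value <- as.numeric(number_text)

# A single-digit number is also captured by this pattern.
single <- sub(number_pattern, "\\1", "toys as 5")
print(single)
```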

edited Dec 23 '19 at 01:49

answered Dec 20 '19 at 09:25

Ronak Shah

The first one works amazingly well, but it does not return the number after . or , – yuliansen Dec 23 '19 at 01:34
@yuliansen Can you share an example ? – Ronak Shah Dec 23 '19 at 01:35
For `sam buy expensive toys as 128984,45697.00` it only returns 128984. Thank you so much for your response – yuliansen Dec 23 '19 at 01:37
So will this work? `sub(".*as\\s(\\d+\\.?\\d+).*", "\\1",gsub(',', '', x))` where `x` is your string. – Ronak Shah Dec 23 '19 at 01:43
I don't know how to thank you enough, you really helped me. I will dig into sub and gsub later. Thank you so much – yuliansen Dec 23 '19 at 01:47

score 1 · Answer 2 · answered Dec 20 '19 at 09:41

With base R, given string s <- "sam buy expensive toys as 125898652", you can use gsub() or strsplit():

> gsub(".*?as\\s","",s)
[1] "125898652"

or

> unlist(strsplit(s,split = "(?<=as\\s)",perl = TRUE))[2]
[1] "125898652"
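One difference between the replacement approach and an extraction approach shows up when the marker word is missing. The sketch below uses a hypothetical input without "as" and base R only:

```r
# Hypothetical input that does not contain "as" followed by whitespace.
s_missing <- "sam buy expensive toys for 125898652"

# gsub() leaves the string untouched when the pattern does not match.
unchanged <- gsub(".*?as\\s", "", s_missing)
print(identical(unchanged, s_missing))

# regmatches() with regexpr() returns a zero-length vector instead,
# which makes a failed match easier to detect.
m <- regmatches(s_missing, regexpr("(?<=as\\s)\\S+", s_missing, perl = TRUE))
print(length(m))
```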

answered Dec 20 '19 at 09:41

ThomasIsCoding
