I am matching strings with a regex, but would like to exclude those that end in a particular substring:

dat <- c('long_regex_other_stuff','long_regex_other_random.something')
(dat[grep('long_regex',dat)])
(dat[grep('long_regex.*(?!.*something$)',dat)])

The first grep output is expected

"long_regex_other_stuff"            "long_regex_other_random.something"

How can I get the second grep to work? The desired output is

"long_regex_other_stuff"

Ref: Regular expression to match a line that doesn't contain a word?

1 Answer

Move the .* from before the negative lookahead to after it, so that the lookahead is checked immediately after long_regex, and pass perl=TRUE so that lookarounds are supported:

> dat <- c('long_regex','long_regex.something')
> (dat[grep('long_regex(?!.*something).*',dat, perl=TRUE)])
[1] "long_regex"
> (dat[grep('long_regex(?!.*\\bsomething\\b).*',dat, perl=TRUE)])
[1] "long_regex"

The negative lookahead in long_regex(?!.*something) asserts that the string something does not appear anywhere after the substring long_regex. The \\b version additionally requires something to appear as a whole word.

> dat <- c('long_regex_other_stuff','long_regex_other_random.something')
> (dat[grep('long_regex(?!.*\\bsomething\\b).*',dat, perl=TRUE)])
[1] "long_regex_other_stuff"
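
To see why the position of .* matters, here is a minimal sketch using the data from the question. It relies only on base R; the names before_fix and after_fix are made up for illustration. In the question's pattern the greedy .* can run to the end of the string, and at that position there is nothing left for the lookahead to reject, so every string still matches.

```r
dat <- c('long_regex_other_stuff', 'long_regex_other_random.something')

# Question's pattern: .* comes first and may consume the whole rest of the
# string, so the lookahead is tested at the end, where it always succeeds.
before_fix <- grepl('long_regex.*(?!.*something$)', dat, perl = TRUE)

# Lookahead placed directly after the prefix: it now inspects everything
# that follows long_regex, and rejects strings ending in "something".
after_fix <- grepl('long_regex(?!.*something$)', dat, perl = TRUE)

print(before_fix)  # both elements match
print(after_fix)   # only the first element matches
```

Because long_regex is a literal that can only match at one position here, the engine has no alternative starting point once the lookahead fails, which is what makes the second form exclude the string.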
Avinash Raj
  • Let me check this answer on the "actual" data... This doesn't quite work, I'll change the example... –  Nov 04 '14 at 13:42
  • Could you explain why? Then we can provide an exact answer. – Avinash Raj Nov 04 '14 at 13:49
  • There are random characters between long_regex and ".something" that were not in the original example –  Nov 04 '14 at 13:53
  • I get the error "reason 'Invalid regexp'" when I apply it to the data. Give me a few minutes to either update the example (again) or to correct the application... –  Nov 04 '14 at 13:56
  • I don't know why it would give an invalid regex. Did you copy the exact regex I provided? Did you enable the `perl=TRUE` parameter? – Avinash Raj Nov 04 '14 at 13:57
  • Yes, it works, I did leave off the perl=TRUE parameter. –  Nov 04 '14 at 13:59
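
As the comments note, the lookahead only works with perl=TRUE. The sketch below, again in base R with illustrative variable names, shows the difference: the default regex engine rejects the (?! syntax, so the call fails, while the PCRE engine accepts it. The exact error text may vary between R versions, so it is captured rather than assumed. The last line is an alternative that avoids lookahead altogether by combining two grepl calls; it is a different technique from the answer's, shown only for comparison.

```r
dat <- c('long_regex_other_stuff', 'long_regex_other_random.something')
pattern <- 'long_regex(?!.*something).*'

# Default engine: lookahead is not supported, so grep() signals an error.
# The message is captured as a string instead of stopping the script.
default_engine <- tryCatch(
  suppressWarnings(grep(pattern, dat, value = TRUE)),
  error = function(e) conditionMessage(e)
)

# PCRE engine: the same pattern is valid and filters as intended.
pcre_engine <- grep(pattern, dat, perl = TRUE, value = TRUE)

# Alternative without lookahead: keep strings that match the prefix
# and do not end in "something".
no_lookahead <- dat[grepl('long_regex', dat) & !grepl('something$', dat)]

print(default_engine)
print(pcre_engine)
print(no_lookahead)
```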