This question is about PCRE regular expressions as used by `grep -P`.
Imagine I have the string `abcRabcSxyxz` and search for a substring which starts with `abc` and ends with `x`, but with the restriction that no shorter substring of this match would also match.
My first attempt was a non-greedy regexp,
grep -Po 'abc.*?x' <<<abcRabcSxyxz
but this returns abcRabcSx, while I would like to find just abcSx. It is obvious why even my non-greedy attempt still provides a match which is too long; I need the regexp engine to try harder. My second attempt was
grep -Po '(?>abc.*?)x' <<<abcRabcSxyxz
which did not produce a match at all (maybe I don't really understand the usage of `(?>...)` explained here).
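A small sketch of what seems to happen in the second attempt, assuming GNU grep built with PCRE support: inside the atomic group, the lazy `.*?` first matches the empty string, and once the group is left the engine may not go back into it to extend that match. So the pattern can only succeed when `x` follows `abc` immediately. The input strings `abcx` and `abcSx` are made up for illustration.

```shell
# Atomic group: .*? settles on the empty string and cannot be extended later.
# Succeeds only because x directly follows abc.
direct=$(printf '%s\n' 'abcx' | grep -Po '(?>abc.*?)x' || true)

# Here S sits between abc and x, so the x after the atomic group cannot match.
# grep exits with status 1 on no match, hence the "|| true".
blocked=$(printf '%s\n' 'abcSx' | grep -Po '(?>abc.*?)x' || true)

echo "direct=[$direct] blocked=[$blocked]"
```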
Any easy solution for my problem anyone?
UPDATE: I see from the comments that my example does not precisely explain what I am searching for, so here is a more general description:
I am searching for matches of the form PXQ, where P, X and Q are arbitrary patterns, and X should not contain a match of P. Plus, I don't want to literally retype the pattern P inside X.
For instance, `[(][^(]*[)]` would be a possible (but not satisfying) solution for the concrete case that I am searching for a parenthesized expression which does not contain another parenthesized expression (here, P is `[(]`, X is an arbitrary string, and Q is `[)]`). But even this example shows that I have to literally repeat the information contained in P when specifying the middle part (`[^(]*`), to make sure that my P is not contained there. I am looking for a way which makes this explicit repetition unnecessary.
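One possible way to express the general PXQ requirement, sketched here under the assumption that GNU grep with PCRE support is in use: put P in a capture group and guard every character of X with a negative lookahead that re-invokes that group through the PCRE subroutine call `(?1)`. P is then written only once. The second input string, `f(a(b)c)`, is made up for illustration; this is a sketch, not necessarily the only or the best approach.

```shell
# P = abc (capture group 1), Q = x.
# (?1) re-runs the pattern of group 1, so P is not retyped inside X.
# (?:(?!(?1)).)*? consumes characters lazily and refuses any position where P would start.
result=$(printf '%s\n' 'abcRabcSxyxz' | grep -Po '(abc)(?:(?!(?1)).)*?x')

# Same shape for the parenthesis case: P = \( and Q = \).
paren=$(printf '%s\n' 'f(a(b)c)' | grep -Po '(\()(?:(?!(?1)).)*?\)')

echo "$result"
echo "$paren"
```

The lookahead is checked before each character of X, so the match attempt that starts at the first `abc` fails as soon as the second `abc` is reached, and the engine moves on to the later starting position.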