I have text extracted from a large PDF file, and I am only interested in one part of it. I need the part that lies between 2 test substrings AND contains 1 or more occurrences of a specific word, XX12QW. Of those 2 test substrings/words, the first one can be included in the match, as shown in the desired output below.
Input String:
test
abc def
test 123
test pqr
XX12QW
jkl XX12QW hjas
12asd23 test bxs
Desired Output:
test pqr
XX12QW
jkl XX12QW hjas
12asd23
Things to be noted:
- There are multiple occurrences of the substring test.
- I need only the part between 2 substrings/words (test) that contains 1 or more occurrences of the word XX12QW. The word XX12QW will not be present at all between any other pair of test words. That is, there will never be a case like this: test abc XX12QW test isadkj XX12QW test an test
- One extra test case would be if the word XX12QW is present between test and $ (end of string/file):
  - Input: test absjh123 sjnc test jhsd32 test aabb XX12QW asdj XX12QW sdfk
  - Desired Output: test aabb XX12QW asdj XX12QW sdfk
I have been stuck on this for a long time now and really need someone else to look at it.
Regex: test[\s\S]*?XX12QW[\s\S]*?(?=test)
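To make the problem reproducible, here is a small sketch that runs the regex above against both inputs. Python's re module is an assumption on my part (the question is not tied to a language), and the names CURRENT, sample_1, sample_2 and current_match are made up for illustration. As far as I can tell, the match on the first input begins at the very first test instead of the last one before XX12QW, and the second input gives no match at all because the lookahead requires a closing test.

```python
import re

# The regex from the question, unchanged.
CURRENT = re.compile(r"test[\s\S]*?XX12QW[\s\S]*?(?=test)")

# The two inputs from the question.
sample_1 = (
    "test\nabc def\ntest 123\ntest pqr\nXX12QW\n"
    "jkl XX12QW hjas\n12asd23 test bxs"
)
sample_2 = "test absjh123 sjnc test jhsd32 test aabb XX12QW asdj XX12QW sdfk"


def current_match(text):
    """Return the text matched by the question's regex, or None."""
    m = CURRENT.search(text)
    return m.group(0) if m else None


# The match on sample_1 starts at the first "test", not at "test pqr".
print(repr(current_match(sample_1)))
# sample_2 has no "test" after XX12QW, so the lookahead can never succeed.
print(repr(current_match(sample_2)))
```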
Would really appreciate any help.
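One direction that might work, sketched here under assumptions rather than as a confirmed solution: replace each [\s\S] with a "tempered" token that refuses to step over another test, and drop the trailing lookahead so that end of string also ends the match. The sketch assumes Python's re module, treats test as a plain substring (no word boundaries), and strips trailing whitespace to match the desired output; NO_TEST, PATTERN and extract_block are hypothetical names.

```python
import re

# One character that does not begin another "test" delimiter.
NO_TEST = r"(?:(?!test)[\s\S])"

# "test", then delimiter-free text up to XX12QW, then delimiter-free text
# up to the next "test" or the end of the string, whichever comes first.
PATTERN = re.compile(rf"test{NO_TEST}*?XX12QW{NO_TEST}*")


def extract_block(text):
    """Return the test...XX12QW... block without trailing whitespace, or None."""
    m = PATTERN.search(text)
    return m.group(0).rstrip() if m else None


# The two inputs from the question.
sample_1 = (
    "test\nabc def\ntest 123\ntest pqr\nXX12QW\n"
    "jkl XX12QW hjas\n12asd23 test bxs"
)
sample_2 = "test absjh123 sjnc test jhsd32 test aabb XX12QW asdj XX12QW sdfk"

print(extract_block(sample_1))
print(extract_block(sample_2))
```

Because the tempered token cannot cross a test, any attempt that starts at an earlier test fails before reaching XX12QW, and the engine moves on to the next candidate start. On very large extracted texts the per-character lookahead may be slow, so splitting the text on test and picking the piece that contains XX12QW could be a simpler alternative.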