2

(modified) I am trying to get only the first match per block with the pattern (?<=Location:.*?\().*?(?=\))

Here is the data:

--batchresponse_bla_bla_bla_\r\n--changesetresponse__bla_bla_bla_\r\nLocation: https://site.ru/CRM/api/data/v9.0/gm_preorders(a341eb4e-2fdf-eb11-a30b-ac1f6b465e3b)\r\nOData-EntityId: https://site.ru/CRM/api/data/v9.0/gm_preorders(a341eb4e-2fdf-eb11-a30b-ac1f6b465e3b)\r\n_bla_bla_bla_\r\n--changesetresponse__bla_bla_bla_Location: https://site.ru/CRM/api/data/v9.0/gm_preorders(a841eb4e-2fdf-eb11-a30b-ac1f6b465e3b)\r\nOData-EntityId: https://site.ru/CRM/api/data/v9.0/gm_preorders(a841eb4e-2fdf-eb11-a30b-ac1f6b465e3b)\r\n_bla_bla_bla_\r\n--changesetresponse_n_bla_bla_bla_\r\nLocation: https://site.ru/CRM/api/data/v9.0/gm_preorders(74748d08-2ee6-eb11-a30b-ac1f6b465e3b)\r\nOData-EntityId: https://site.ru/CRM/api/data/v9.0/gm_preorders(74748d08-2ee6-eb11-a30b-ac1f6b465e3b)\r\nn_bla_bla_bla_\r\n--changesetresponse_etc

and it returns:

match 1:    a341eb4e-2fdf-eb11-a30b-ac1f6b465e3b
match 2:    a341eb4e-2fdf-eb11-a30b-ac1f6b465e3b
match 3:    a841eb4e-2fdf-eb11-a30b-ac1f6b465e3b
match 4:    a841eb4e-2fdf-eb11-a30b-ac1f6b465e3b
match 5:    74748d08-2ee6-eb11-a30b-ac1f6b465e3b
match 6:    74748d08-2ee6-eb11-a30b-ac1f6b465e3b

Is it possible to match only the first occurrence of each value (so I need 3 matches: 1, 3 and 5) using only lookbehind and lookahead, without grouping or other conditions?

Found a solution with some help:

(?<=Location:[^(]*?\().*?(?=\))
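
To see the difference between the two patterns, here is a minimal JavaScript sketch. It assumes an engine with variable-length lookbehind (for example a recent Node.js); the question does not say which engine is actually used. The data is a shortened stand-in for the sample above, kept as a single line with literal `\r\n` text, and `base`, `ids`, `sep` and `allMatches` are illustrative names only.

```javascript
// Shortened stand-in for the sample data. String.raw keeps "\r\n" as
// literal backslash text, so the whole input is ONE line (an assumption).
const base = "https://site.ru/CRM/api/data/v9.0/gm_preorders";
const ids = [
  "a341eb4e-2fdf-eb11-a30b-ac1f6b465e3b",
  "a841eb4e-2fdf-eb11-a30b-ac1f6b465e3b",
  "74748d08-2ee6-eb11-a30b-ac1f6b465e3b",
];
const sep = String.raw`\r\n`;
const data = ids
  .map((id) =>
    `--changesetresponse_x${sep}Location: ${base}(${id})${sep}` +
    `OData-EntityId: ${base}(${id})${sep}`)
  .join("");

// Original pattern: ".*?" in the lookbehind may run past an earlier "(".
const original = /(?<=Location:.*?\().*?(?=\))/g;
// Fixed pattern: "[^(]*?" cannot cross a "(", so only the first one counts.
const fixed = /(?<=Location:[^(]*?\().*?(?=\))/g;

// Helper: collect the full text of every match.
const allMatches = (re, text) => [...text.matchAll(re)].map((m) => m[0]);

console.log(allMatches(original, data).length); // 6 (each id twice)
console.log(allMatches(fixed, data));           // the three ids, once each
```

The only change is inside the lookbehind: because `[^(]*?` cannot step over a `(`, the text between `Location:` and the match start can contain no other opening parenthesis, which rules out the `OData-EntityId` value.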
kmish

2 Answers

2

You may use

(?<=Location:[^(]*\([^(]*\()[^)]*(?=\))
(?<=Location:[\w\W]*?\()(.*?)(?=\))(?![\w\W]*\1)

See regex demo #1 and regex demo #2.

Details:

Regex #1:

  • (?<=Location:[^(]*\([^(]*\() - a positive lookbehind that matches a location immediately preceded with Location:, zero or more chars other than (, a (, then again zero or more chars other than ( and another (
  • [^)]* - zero or more chars other than )
  • (?=\)) - a ) char must appear immediately on the right.

Regex #2:

  • (?<=Location:[\w\W]*?\() - a positive lookbehind that matches a location that is immediately preceded with
    • Location: - a Location: string
    • [\w\W]*? - zero or more chars, as few as possible
    • \( - a ( char
  • (.*?) - Group 1: zero or more chars other than line break chars, as few as possible
  • (?=\)) - immediately to the right, there must be a ) char.
  • (?![\w\W]*\1) - the Group 1 value must not occur again further in the string.
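
As a rough check of both patterns, here is a small JavaScript sketch. It assumes an engine with variable-length lookbehind (such as a recent Node.js) and, unlike the single-line form in the question, it assumes the input contains real CR LF line endings. The data is a shortened stand-in, and `base`, `ids` and `allMatches` are hypothetical helper names.

```javascript
// Shortened stand-in for the data. Inside a template literal, \r\n is a
// REAL carriage return + line feed (an assumption about the input).
const base = "https://site.ru/CRM/api/data/v9.0/gm_preorders";
const ids = [
  "a341eb4e-2fdf-eb11-a30b-ac1f6b465e3b",
  "a841eb4e-2fdf-eb11-a30b-ac1f6b465e3b",
  "74748d08-2ee6-eb11-a30b-ac1f6b465e3b",
];
const data = ids
  .map((id) =>
    `--changesetresponse_x\r\nLocation: ${base}(${id})\r\n` +
    `OData-EntityId: ${base}(${id})\r\n`)
  .join("");

// Regex #1: require exactly one earlier "(" between Location: and the match,
// so it picks the value in the OData-EntityId line (same value, once per block).
const skipOneParen = /(?<=Location:[^(]*\([^(]*\()[^)]*(?=\))/g;
// Regex #2: keep a value only if it does not occur again later in the string.
const noLaterRepeat = /(?<=Location:[\w\W]*?\()(.*?)(?=\))(?![\w\W]*\1)/g;

// Helper: collect the full text of every match.
const allMatches = (re, text) => [...text.matchAll(re)].map((m) => m[0]);

console.log(allMatches(skipOneParen, data));  // the three ids, once each
console.log(allMatches(noLaterRepeat, data)); // the three ids, once each
```

Both patterns return each value once here, but they do so by selecting the second occurrence in each block rather than the first. Regex #2 also depends on `.` not crossing line breaks; on input where `\r\n` is literal text on a single line, `(.*?)` can probably extend past the first `)`, so the result may differ.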
Wiktor Stribiżew
  • Hi, I've changed the question because it wasn't complete enough. Also, my engine can't handle the "(?s:.)" expression. – kmish Jul 17 '21 at 07:45
  • @kmish My regex still works. Did you try it? What is your regex engine? If it supports `\A`, it supports `(?s:...)`. Please check the other solution I have just added. – Wiktor Stribiżew Jul 17 '21 at 07:47
  • The first regex doesn't work (https://regex101.com/r/JRij0x/1), and the second one finds 6 matches, like my original one. – kmish Jul 17 '21 at 07:53
  • I need not only the first match but the 1st, 3rd, 5th, etc. Each time, the match should stop after the first `)`, but it captures both occurrences that follow. – kmish Jul 17 '21 at 07:57
  • It doesn't: https://regex101.com/r/TFT05l/1. Do not change \r\n to a new line; the original is a single-line string. – kmish Jul 17 '21 at 08:06
  • @kmish `\r\n` are line endings, CRLF. There are no `\r\n` in the plain text. Try the regex in your original environment. Do not use regex101 now. Read [My regex works at regex101.com, but not in](https://stackoverflow.com/a/39636208/3832970). If you do not explain where and how you use the regex, you might never solve this problem. – Wiktor Stribiżew Jul 17 '21 at 08:07
  • @kmish Read [My regex works at regex101.com, but not in](https://stackoverflow.com/a/39636208/3832970). – Wiktor Stribiżew Jul 17 '21 at 08:08
  • Anyway, it doesn't work on the original line either, and the solution becomes too complex; it's easier to filter the results afterward. – kmish Jul 17 '21 at 08:09
  • @kmish Of course it is much easier to solve without a regex, or with a combination of a simple regex + a bit of code. Where are you using the regex? Btw, also see `(?<=Location:[^(]*\([^(]*\()[^)]*(?=\))` [demo](https://regex101.com/r/PlVnxP/5). This is more of a workaround but is probably sufficient in this case. – Wiktor Stribiżew Jul 17 '21 at 08:09
  • @kmish Added this workaround solution to the answer. – Wiktor Stribiżew Jul 17 '21 at 08:15
  • With your last example, modifying my original `(?<=Location:.*?\().*?(?=\))` to use `[^(]*?` gives `(?<=Location:[^(]*?\().*?(?=\))`, which works like a charm: simple and fast. Thanks for the help. – kmish Jul 17 '21 at 08:24
2

You may use this dynamic length lookbehind assertion in a regex without using MULTILINE mode:

(?<=^(?:(?!\bLocation:)[^])*?\bLocation:[^(]*\()[^)]+

RegEx Details:

  • (?<=: Start lookbehind condition
    • ^: Start position
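    • (?:(?!\bLocation:)[^])*?: Match 0 or more of any character including newline, as few as possible, as long as the word Location: does not start at that character ([^] is JavaScript syntax for any character; [\s\S] is the portable equivalent)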
    • \bLocation:: Match word Location:
    • [^(]*\(: Followed by 0 or more non-( characters and a (
  • ): End lookbehind condition
  • [^)]+: Match 1+ of any character that is not a )
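
The following JavaScript sketch tries this pattern on a shortened stand-in for the data. It assumes real CR LF line endings (so that `\b` can match before `Location:`) and an engine that accepts `[^]` and variable-length lookbehind, such as a recent Node.js. The names `base`, `ids` and `allMatches` are illustrative only.

```javascript
// Shortened stand-in for the data, with REAL CR LF line endings
// (an assumption; \b needs a non-word char before "Location:").
const base = "https://site.ru/CRM/api/data/v9.0/gm_preorders";
const ids = [
  "a341eb4e-2fdf-eb11-a30b-ac1f6b465e3b",
  "a841eb4e-2fdf-eb11-a30b-ac1f6b465e3b",
  "74748d08-2ee6-eb11-a30b-ac1f6b465e3b",
];
const data = ids
  .map((id) =>
    `--changesetresponse_x\r\nLocation: ${base}(${id})\r\n` +
    `OData-EntityId: ${base}(${id})\r\n`)
  .join("");

// The lookbehind is anchored at the string start and may not step over
// any "Location:", so only the FIRST Location: in the input can qualify.
const firstLocationOnly =
  /(?<=^(?:(?!\bLocation:)[^])*?\bLocation:[^(]*\()[^)]+/g;

// Helper: collect the full text of every match.
const allMatches = (re, text) => [...text.matchAll(re)].map((m) => m[0]);

console.log(allMatches(firstLocationOnly, data));
// [ 'a341eb4e-2fdf-eb11-a30b-ac1f6b465e3b' ]
```

As far as this sketch shows, the pattern yields a single match for the whole string, which fits a request for only the very first `Location:` value rather than the first value of each block.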
anubhava