I have this regex to extract reference information from a document:

(?=(((Ard.|Lorr.) *?:|;) *([^;:]{0,100}) (\([0-9/, -]{0,10}/[0-9/, -]{0,10}\)), *([^;]{0,250}?),* *(([0-3]?[0-9]\.)?[01]?[0-9]\.[0-9]{2,4}), *([^;]{0,150}?) *(;|\) ?\.) ( ?ibid\., *(([0-3]?[0-9]\.)?[01]?[0-9]\.[0-9]{2,4}), *([^;]{0,50}) *(;|\) ?\.))*))

My problem is with the last part:

( ?ibid\., *(([0-3]?[0-9]\.)?[01]?[0-9]\.[0-9]{2,4}), *([^;]{0,50}) *(;|\) ?\.))

This part works exactly as expected when tested on its own: https://regex101.com/r/8vAsLz/1

However, when I use it as part of the full pattern, only the last ibid reference is captured, as seen here: https://regex101.com/r/1AsOtl/1

I'm confused by this behaviour and would like to know why this happens and what I can do to have it work as expected.
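
A minimal reproduction may make the symptom easier to see. The sketch below assumes Python's built-in `re` module as the flavor and uses a deliberately simplified stand-in pattern (a literal `head;` followed by a quantified ibid group), not the full regex above; the sample text is made up.

```python
import re

# Simplified stand-in for the question's pattern: a fixed "head;" followed by
# a quantified capturing group, mirroring the (... ibid ...)* part.
pattern = re.compile(r"head;( ?ibid\., *([0-9.]+), *([^;]{0,50}) *;)*")

# Hypothetical sample text with two ibid references.
text = "head; ibid., 01.02.2019, p. 1; ibid., 03.04.2019, p. 2;"
m = pattern.match(text)

print(m.group(0))  # the overall match spans both ibid references
print(m.group(1))  # the repeated group holds only its last iteration
print(m.group(2))  # likewise for the groups nested inside it
```

The overall match covers both references, but groups 1 to 3 report only the second one.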

charelf
  • You quantified the group, making it a repeated capturing group, so it always captures only the value matched by its last iteration. – Wiktor Stribiżew Jun 20 '19 at 14:28
  • Correct. Unless you specify the regex flavor. In C# and Python `regex`, you may easily access all captures, but it will involve some more code. – Wiktor Stribiżew Jun 20 '19 at 14:38
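
For flavors that keep only the last iteration of a repeated group, one possible workaround is a second pass: take the text covered by the repeated part and scan it again with the ibid sub-pattern on its own. The sketch below assumes Python's standard `re` module; the helper name `ibid_captures` and the sample text are hypothetical, while the `IBID` pattern is copied from the question.

```python
import re

# The ibid sub-pattern from the question, unchanged.
IBID = r" ?ibid\., *(([0-3]?[0-9]\.)?[01]?[0-9]\.[0-9]{2,4}), *([^;]{0,50}) *(;|\) ?\.)"

def ibid_captures(tail):
    """Return a (date, note) pair for every ibid reference found in `tail`."""
    return [(m.group(1), m.group(3)) for m in re.finditer(IBID, tail)]

# Hypothetical sample: the stretch of text that the quantified group spans.
tail = " ibid., 12.03.2018, p. 4; ibid., 5.2019, p. 9)."
print(ibid_captures(tail))
```

Each `finditer` match is a separate match object, so every iteration keeps its own groups instead of being overwritten by the next one.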
