I have a file describing objects in which some properties of the object are optional. For example (color is optional):

type=dog
sex=male
name=wolf
color=brown
type=dog
sex=male
name=bob
type=cat
sex=male
name=tom
color=black
type=dog
sex=female
name=simona
color=white

I'm looking for a regex that gives me a "name" - "color" pair of properties for each dog. I expect something like this:

wolf - brown
bob - 
simona - white

I started with

type=dog[\s\S]*?name=(\w+)[\s\S]*?color=(\w+)

which gives the wrong result:

wolf - brown
bob - black
simona - white

Then I wrapped the color part in a group (which gives the same result) and added the "?" quantifier:

type=dog[\s\S]*?name=(\w+)[\s\S]*?(color=(\w+))?

But instead of the desired result, I lost the color group in all matches:

wolf - 
bob - 
simona - 

What's wrong with my expression, and how can I achieve my goal? Please do not use lookbehinds, lookaheads, or conditionals. VBScript does not implement them.

My example on regex101.com

Wiktor Stribiżew
Riddick
  • Lookaheads are supported in VBScript regex. Try `^type=dog[\s\S]*?^name=(\w+)(?:(?:(?!^type=)[\s\S])*?^color=(\w+))?` with `regex.Multiline = True`. See [this regex demo](https://regex101.com/r/dcQTe2/2). – Wiktor Stribiżew Aug 17 '20 at 10:20
  • It seems you are absolutely right, @WiktorStribiżew. By the way, I'm absolutely astonished by the number of your answers. – Riddick Aug 17 '20 at 11:02

1 Answer

Set regex.Multiline = True and use the following regex:

^type=dog[\s\S]*?^name=(\w+)(?:(?:(?!^type=)[\s\S])*?^color=(\w+))?

See the regex demo
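
As a quick sanity check, here is a minimal sketch that runs the same pattern over the sample data from the question. It uses Python only because the pattern is easy to verify there; `re.MULTILINE` is assumed to play the role of `regex.Multiline = True`, and the names `SAMPLE`, `PATTERN` and `dog_name_color_pairs` are made up for this illustration.

```python
import re

# Sample data copied from the question.
SAMPLE = """\
type=dog
sex=male
name=wolf
color=brown
type=dog
sex=male
name=bob
type=cat
sex=male
name=tom
color=black
type=dog
sex=female
name=simona
color=white
"""

# Same pattern as above; re.MULTILINE makes ^ match at the start of every line.
PATTERN = re.compile(
    r"^type=dog[\s\S]*?^name=(\w+)(?:(?:(?!^type=)[\s\S])*?^color=(\w+))?",
    re.MULTILINE,
)


def dog_name_color_pairs(text):
    """Return (name, color) for each dog; color is '' when it is missing."""
    return [(m.group(1), m.group(2) or "") for m in PATTERN.finditer(text)]


for name, color in dog_name_color_pairs(SAMPLE):
    print(f"{name} - {color}")
```

In VBScript, Group 2 would likewise come back empty for a dog without a color line, so the same "name - color" output can be built from the submatches.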

Details

  • ^ - start of a line
  • type=dog - a string
  • [\s\S]*? - 0 or more chars as few as possible
  • ^ - start of a line
  • name= - a literal string
  • (\w+) - Group 1: any one or more letters, digits or underscores
  • (?:(?:(?!^type=)[\s\S])*?^color=(\w+))? - an optional non-capturing group matching 1 or 0 occurrences of
    • (?:(?!^type=)[\s\S])*? - any char, 0 or more times, as few as possible, as long as it does not start a type= substring at the start of a line (so the match cannot run into the next object)
    • ^color= - a color= substring at the start of a line
    • (\w+) - Group 2: any one or more letters, digits or underscores
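
To see why the attempts in the question behave the way they do, the sketch below compares them with the pattern above. In the second attempt, the lazy `[\s\S]*?` first tries to match zero characters; the optional `(color=(\w+))?` then fails right after the name (the next character is a line break), and because the group is optional the match simply ends there with an empty color. Python is again used only as a stand-in, and the helper `pairs` and the variable names are hypothetical.

```python
import re

SAMPLE = """\
type=dog
sex=male
name=wolf
color=brown
type=dog
sex=male
name=bob
type=cat
sex=male
name=tom
color=black
type=dog
sex=female
name=simona
color=white
"""


def pairs(pattern, text, name_group, color_group):
    """Collect (name, color) tuples; a group that did not participate becomes ''."""
    rx = re.compile(pattern, re.MULTILINE)
    return [
        (m.group(name_group), m.group(color_group) or "")
        for m in rx.finditer(text)
    ]


# First attempt: color is mandatory, so "bob" borrows the cat's color.
mandatory = r"type=dog[\s\S]*?name=(\w+)[\s\S]*?color=(\w+)"
# Second attempt: lazy gap + optional group at the end always matches empty.
lazy_optional = r"type=dog[\s\S]*?name=(\w+)[\s\S]*?(color=(\w+))?"
# Pattern from this answer: the gap lives inside the optional group and
# is not allowed to cross a line that starts with "type=".
tempered = r"^type=dog[\s\S]*?^name=(\w+)(?:(?:(?!^type=)[\s\S])*?^color=(\w+))?"

mandatory_result = pairs(mandatory, SAMPLE, 1, 2)
lazy_result = pairs(lazy_optional, SAMPLE, 1, 3)
tempered_result = pairs(tempered, SAMPLE, 1, 2)

print(mandatory_result)
print(lazy_result)
print(tempered_result)
```

The key design point is that the gap before `color=` sits inside the optional group, so the engine has to consume the gap and find `color=` together or skip both.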
Wiktor Stribiżew