I am trying to create a regex that matches a key-value pair unless the line begins with a hyphen. The regex will be used to detect the attributes set in a YAML file. Here is sample YAML file content:

containers:
  - name: api-php-container
    image: us-west-2.amazonaws.com/abcd:45
    ports:
      - containerPort: 80
      - containerPort: 443
    volumeMounts:
      - name: certs
        mountPath: "/etc/keys/ssl"

The regex should match all lines that look like key: value pairs unless they begin with a hyphen. For example, it should match the following lines:

image: us-west-2.amazonaws.com/abcd:45
mountPath: "/etc/keys/ssl"

Here is the regex I wrote:

^(\s*\-\s*)([\w0-9_\-\.]+)\s*:\s*([\w0-9_\-\.\/]+)\s*$

But this detects the lines that start with a hyphen.

Then I tried using a negative lookahead, but then it stopped matching anything at all. Here is that regex:

^(?!(\s*\-\s))([\w0-9_\-\.]+)\s*:\s*([\w0-9_\-\.\/]+)\s*$

How do I make it detect like I want it?

defiant
  • Just curious. Why do you not want the lines starting with hyphen? If it has a colon then it is a key as important as any others whether it is preceded by a hyphen or not. – blhsing Aug 27 '21 at 05:49
  • @blhsing I want to be able to differentiate between plain key-value pairs and key-value pairs that start with a hyphen. – defiant Aug 27 '21 at 05:53
  • You did not appear to have made any attempt to exclude a hyphen in your regex. Instead, you explicitly match a hyphen in your regex. – blhsing Aug 27 '21 at 05:53
  • @blhsing I tried using `^(?!(\s*\-\s))([\w0-9_\-\.]+)\s*:\s*([\w0-9_\-\.\/]+)\s*$` but that didn't match anything. – defiant Aug 27 '21 at 05:54
  • There really is no such thing as a key value pair that starts with a hyphen. That hyphen simply means a list in YAML. It has nothing to do with the key value pairs themselves. – blhsing Aug 27 '21 at 05:54
  • I didn't know what it was called. Thanks. – defiant Aug 27 '21 at 05:55
  • OK that regex looks like a more serious attempt in implementing the desired behavior. I would recommend that you update your question with that regex instead. – blhsing Aug 27 '21 at 05:55

3 Answers

You might also make sure that the first character you match is not a hyphen, a colon, or a whitespace char:

^ *([^\s:-][^\s:]*) *: *(.+)$

In parts, the pattern matches:

  • `^` Start of string
  • ` *` Optional spaces
  • `(` Capture group 1
    • `[^\s:-]` Match a single char other than a whitespace char, `:` or `-`
    • `[^\s:]*` Match optional chars other than a whitespace char or `:`
  • `)` Close group 1
  • ` *: *` Match `:` between optional spaces
  • `(.+)` Capture group 2, match 1+ times any character
  • `$` End of string
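
To try the pattern above outside a regex tester, here is a minimal Python sketch. It assumes the whole file is matched at once with `re.MULTILINE`, so that `^` and `$` anchor at every line; the `SAMPLE` text is modeled on the question's YAML, and the names `PATTERN`, `SAMPLE` and `pairs` are only illustrative.

```python
import re

# Pattern from this answer. re.MULTILINE (an assumption about how the
# pattern is applied) makes ^ and $ match at the start and end of each line.
PATTERN = re.compile(r'^ *([^\s:-][^\s:]*) *: *(.+)$', re.MULTILINE)

# Sample text modeled on the YAML in the question.
SAMPLE = '''\
containers:
  - name: api-php-container
    image: us-west-2.amazonaws.com/abcd:45
    ports:
      - containerPort: 80
      - containerPort: 443
    volumeMounts:
      - name: certs
        mountPath: "/etc/keys/ssl"
'''

# findall returns one (key, value) tuple per matching line.
pairs = PATTERN.findall(SAMPLE)
for key, value in pairs:
    print(f'{key} -> {value}')
```

On this sample only the `image` and `mountPath` lines are reported. Lines such as `ports:` are skipped as well, because `(.+)` requires at least one character after the colon.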

The fourth bird

This should work:

(?<!-\s)\b.+:.+

Explanation

(?<!-\s): negative lookbehind (?<!...) which finds strings which are not preceded by ..., in this case by "-" and a whitespace character \s

\b: word boundary, to make sure that what follows is the beginning of a "word", i.e. the position between a whitespace character and a word character

.+:.+: match at least one (+) character that is not a newline (.), followed by ":" and then, again, match at least one (+) character (.)
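
As a rough check of the lookbehind pattern above, the following Python sketch runs it over text modeled on the question's YAML. The names `PATTERN`, `SAMPLE`, `matches` and `partial` are illustrative, and the `- image: repo:45` line is a made-up input, not something from the question.

```python
import re

# Pattern from this answer. It has no ^ or $ anchors, so no flags are needed.
PATTERN = re.compile(r'(?<!-\s)\b.+:.+')

# Sample text modeled on the YAML in the question.
SAMPLE = '''\
containers:
  - name: api-php-container
    image: us-west-2.amazonaws.com/abcd:45
    ports:
      - containerPort: 80
      - containerPort: 443
    volumeMounts:
      - name: certs
        mountPath: "/etc/keys/ssl"
'''

# Each match is the text from the key up to the end of the line.
matches = PATTERN.findall(SAMPLE)
print(matches)

# Hypothetical input: a list item whose value itself contains a colon.
# The match can then start at a later word boundary inside the line.
partial = PATTERN.findall('- image: repo:45')
print(partial)
```

On the sample, only the `image` and `mountPath` lines match. The second call suggests a caveat: because the pattern is unanchored, a hyphenated line with a second colon can still yield a partial match (here `: repo:45`), so an anchored pattern may be safer if such values can occur.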

leomfn

Your attempt with the negative lookahead is fairly close. Because your regex starts with the anchor ^, it must match every character from the beginning of the line, including leading whitespace, so you need a \s* after the negative lookahead assertion. I would use a literal space instead of \s here, though, since \s also matches newline characters and could let a key-value pair match across multiple lines. To be more generic, use [^:]+ to match a key and \S.*? to match a value:

^(?! *-) *([^:]+) *: *(\S.*?) *$

Demo: https://regex101.com/r/GXDGX3/1
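
The difference between the question's lookahead attempt and the pattern above can be sketched in Python as follows. This assumes both are applied to the whole text with `re.MULTILINE`; the sample is modeled on the question's YAML, and the names `ATTEMPT`, `FIXED`, `SAMPLE` and the `image: x` test lines are illustrative only.

```python
import re

# The lookahead attempt from the question, unchanged.
ATTEMPT = re.compile(
    r'^(?!(\s*\-\s))([\w0-9_\-\.]+)\s*:\s*([\w0-9_\-\.\/]+)\s*$',
    re.MULTILINE,
)

# The pattern from this answer: leading spaces are consumed after the lookahead.
FIXED = re.compile(r'^(?! *-) *([^:]+) *: *(\S.*?) *$', re.MULTILINE)

# Sample text modeled on the YAML in the question.
SAMPLE = '''\
containers:
  - name: api-php-container
    image: us-west-2.amazonaws.com/abcd:45
    ports:
      - containerPort: 80
      - containerPort: 443
    volumeMounts:
      - name: certs
        mountPath: "/etc/keys/ssl"
'''

# The attempt finds nothing on the indented sample.
attempt_pairs = ATTEMPT.findall(SAMPLE)
print(attempt_pairs)

# The same attempt does match an unindented line, which points at the
# unconsumed leading whitespace as the reason for the failure above.
unindented = ATTEMPT.match('image: x')
indented = ATTEMPT.match('    image: x')
print(unindented is not None, indented is not None)

# The answer's pattern returns (key, value) tuples for the wanted lines.
fixed_pairs = FIXED.findall(SAMPLE)
print(fixed_pairs)
```

Note that the attempt's value class also lacks `:` and `"`, so even with the leading whitespace consumed it would not cover the `image` and `mountPath` values; the more generic `\S.*?` avoids that.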

blhsing