I have the following regex: ^http:\/\/(?!www\.)(.*)$

Expected behavior:

http://example.com - Match
http://www.example.com - Does not match

It looks like golang does not support negative lookahead. How can I rewrite this regex so that it works in golang?

UPDATE

I'm not writing golang code myself; I'm using Traefik, which accepts a regex (golang flavor) as a config value, so basically I have this:

regex = "^https://(.*)$"
replacement = "https://www.$1"

What I want is to always add www. to the URL, but NOT if the URL has it already, otherwise it would become www.www.*

stefanobaldo
  • I know nothing of golang, which is why this is a comment, but can't you match `/^http:\/\/www\./` in an if statement and, if it doesn't match, then look for `http://example.com`? – JGNI Oct 04 '18 at 13:55
  • Please refer to the following [link](https://stackoverflow.com/questions/26771592/negative-look-ahead-go-regular-expressions) for more information – Nareen Babu Oct 04 '18 at 14:08
  • I have the same problem where I'm using re2 as a regex engine without the full expressiveness of Golang (Terraform's `regex()` function.) Trying to use variable validation in Terraform 0.13 to ensure that users don't pass a string that begins or ends with certain words — i.e., regex for does NOT match _string_ (not just characters). – Ryan Parman Oct 03 '20 at 22:34

2 Answers

If you're really bent on emulating a negative lookahead manually, you will need to spell out in the regexp every allowed way the subdomain can begin with w:

^https?://(([^w].*|w(|[^w].*)|ww(|[^w].*)|www.+)\.)?example\.com$

This regexp allows any word followed by a dot before example.com, unless that word is exactly www. It does so case by case. A word that does not start with w is always allowed. A word that starts with w must be either just that w, or w followed by a non-w and then anything. A word that starts with ww must likewise be either just ww, or ww followed by a non-w and then anything. A word that starts with www must be followed by at least one more character.

Demo
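
As a rough sketch of how the same case-by-case idea carries over to the pattern from the question (any host, not just example.com), the Go program below spells out the allowed prefixes by hand. The name noWWW and the sample URLs are made up for illustration, and the pattern is my assumption about what the lookahead was meant to express, not a drop-in taken from the answer above.

```go
package main

import (
	"fmt"
	"regexp"
)

// noWWW is a hypothetical, lookahead-free stand-in for ^http://(?!www\.)(.*)$.
// After "http://" the rest of the string may be:
//   - empty,
//   - anything starting with a non-w character,
//   - "w" alone, or "w" followed by a non-w character,
//   - "ww" alone, or "ww" followed by a non-w character,
//   - "www" alone, or "www" followed by anything except a dot.
var noWWW = regexp.MustCompile(`^http://([^w].*|w([^w].*)?|ww([^w].*)?|www([^.].*)?)?$`)

func main() {
	urls := []string{
		"http://example.com",
		"http://www.example.com",
		"http://wwwexample.com",
		"http://ww.example.com",
	}
	for _, u := range urls {
		// Only the URL whose host starts with "www." should report false.
		fmt.Printf("%-26s %v\n", u, noWWW.MatchString(u))
	}
}
```

The pattern gets long quickly, which is why the optional-prefix approach further down is usually the more practical choice when a replacement string is available.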

The clarification in the question makes this much easier. The approach is to optionally match www. in the search pattern and then always put it back in the replacement:

Search:

^http://(?:www\.)?(.*)\b$

Replace:

http://www.$1

Demo 2
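
To check how this search/replace pair behaves under Go's regexp package, here is a minimal sketch. Traefik is not involved; the helper name addWWW is made up for illustration, and `${1}` is simply Go's explicit spelling of the `$1` group reference.

```go
package main

import (
	"fmt"
	"regexp"
)

// search is the pattern from the answer above: the www. prefix is optional
// and sits outside the capture group, so it never ends up in group 1.
var search = regexp.MustCompile(`^http://(?:www\.)?(.*)\b$`)

// addWWW is a hypothetical helper that applies the replacement from the
// answer, re-adding "www." in front of whatever group 1 captured.
func addWWW(u string) string {
	return search.ReplaceAllString(u, "http://www.${1}")
}

func main() {
	fmt.Println(addWWW("http://example.com"))
	fmt.Println(addWWW("http://www.example.com"))
}
```

Both inputs come out as http://www.example.com. One thing to watch: because of the trailing \b, a URL that ends in a non-word character such as / does not match at all and is returned unchanged.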

Corion

Golang uses the RE2 regex engine, which doesn't support lookarounds of any kind.
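
A quick way to see this for yourself is to try compiling the pattern. This is only a small sketch: the helper name compiles is made up, and the pattern is the one from the question without the escaped slashes.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles is a hypothetical helper that reports whether Go's regexp
// package accepts the given pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// The negative lookahead (?!...) is rejected at compile time.
	fmt.Println(compiles(`^http://(?!www\.)(.*)$`))

	// The same pattern without the lookahead compiles fine.
	fmt.Println(compiles(`^http://(.*)$`))
}
```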

Since you are dealing with URLs, you can simply parse them and inspect the host part:

package main

import (
    "net/url"
    "strings"
    "testing"
)

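// Match reports whether s is a plain http:// URL whose host does not
// start with "www.". Strings that fail to parse, use another scheme
// (or none), or carry user info are rejected.
func Match(s string) bool {
    u, err := url.Parse(s)
    switch {
    case err != nil: // not a parseable URL
        return false
    case u.Scheme != "http": // wrong or missing scheme
        return false
    case u.User != nil: // user info present, e.g. user@host
        return false
    }

    return !strings.HasPrefix(u.Host, "www.")
}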

func TestMatch(t *testing.T) {
    testCases := []struct {
        URL  string
        Want bool
    }{
        {"http://example.com", true},
        {"http://wwwexample.com", true},
        {"http://www.example.com", false},
        {"http://user@example.com", false},
        {"http://user@www.example.com", false},
        {"www.example.com", false},
        {"example.com", false},
    }

    for _, tc := range testCases {
        if m := Match(tc.URL); m != tc.Want {
            t.Errorf("Match(%q) = %v; want %v", tc.URL, m, tc.Want)
        }
    }
}
Peter
  • Actually I'm not using golang directly, so I can't do that - I need to specify a RegEx (golang flavor) inside Traefik (https://traefik.io) config. – stefanobaldo Oct 04 '18 at 17:05