I am trying to match 'http://' and 'https://' exactly so that I can remove them from URLs, but my regex also matches letters within the URL itself.

Why is this and how can I fix it?


AKL012
  • Why? Because `[^`…`]` indicates a negated character class—every character that is neither `h`, `t`, `p`, `s`, `:`, `/` nor `$` is matched. Why not just match `/https:\/\//` and remove it? – Sebastian Simon Aug 22 '19 at 02:23
  • Yes good point, what if I want to look for and replace both https:// and http:// ? – AKL012 Aug 22 '19 at 02:25
  • Use `/https?:\/\//`, do not cram the regex pattern with constructs you do not need. Study [character classes](https://www.regular-expressions.info/charclass.html) by all means. – Wiktor Stribiżew Aug 22 '19 at 07:16

3 Answers

The regex [^https://$] means:

Match any single character not present in the list "htps:/$"
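
To make the effect visible, here is a minimal sketch of what the negated class does to a URL. The sample URL and the variable names are assumptions chosen for illustration; they are not from the question.

```javascript
// Hypothetical sample URL, used only for illustration.
const url = 'https://example.com';

// Negated character class: matches each single character that is NOT
// one of h, t, p, s, :, / or $ (the slashes are escaped for the literal).
const negatedClass = /[^https:\/\/$]/g;

// Every character outside the set is matched, including letters in the host.
const matched = url.match(negatedClass);

// Removing those matches leaves only characters that ARE in the set.
const removed = url.replace(negatedClass, '');

console.log(matched.join('')); // examle.com
console.log(removed);          // https://p
```

Note that the "p" in "example" survives because "p" is in the excluded set, which is why the pattern appears to match seemingly random letters inside the URL.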

Andie2302

The regex you have,

 [^https://$]

means: match any single character except h, t, p, s, :, / and $.

You can simply use the URL API to get the hostname, and if you only want to remove http:// or https://, you can use replace:

let urls = ['http://example.com/123', 'https://examples.com', 'example.com']

// to get hostname
urls.forEach(url => {
  if (/^https?:\/\//i.test(url)) {
    let parsed = new URL(url)
    console.log(parsed.hostname)
  } else {
    console.log(url)
  }
})

// to remove http or https
urls.forEach(url => {
  let replaced = url.replace(/^https?:\/\//i, '')
  console.log(replaced)
})
Code Maniac

As others have answered, [^https://$] doesn't work because [^...] is a negated character class, not a group with a start-of-line assertion. Your regex matches any single character that is not one of h, t, p, s, :, / or $.

The [brackets] describe a character class, while the (parentheses) describe a capture group, which is probably what you were looking for. You can learn more about them in this excellent answer.

It looks a bit like you were trying to use the ^ and $ anchors, but using both is not a good idea for your particular regex. Outside a character class, they would have asserted that the start-of-line was before h and the end-of-line was after the last /, meaning the regex would not match unless https:// was the only thing in the string.
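
As a rough sketch of the difference, the snippet below compares a pattern anchored at both ends with one anchored only at the start. The pattern names and sample strings are assumptions made up for this illustration.

```javascript
// Anchored at both ends: only the bare scheme on its own can match.
const bothAnchors = /^https:\/\/$/;

// Anchored at the start only: matches a leading http:// or https://.
const startOnly = /^https?:\/\//;

const exactOnly = bothAnchors.test('https://');            // true
const withHost = bothAnchors.test('https://example.com');  // false

// The start anchor strips a leading scheme but leaves one mid-string alone.
const stripped = 'https://example.com'.replace(startOnly, '');
const midString = 'see http://example.com'.replace(startOnly, '');

console.log(exactOnly, withHost); // true false
console.log(stripped);            // example.com
console.log(midString);           // see http://example.com
```

Keeping only the leading ^ is a reasonable choice when the scheme should be removed solely from the beginning of each string.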

If you'd like to match http:// and https://, this regex will do the trick: (https{0,1}:\/\/). The quantifier {0,1} is equivalent to the shorter ?.

BREAKDOWN

(https{0,1}:\/\/)


(               )    capture this as a group
 http                match "http"
     s{0,1}          match 0 or 1 "s"
           :         match ":"
            \/\/     match "//" literally
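
The breakdown above can be tried out with a short sketch like the following. The sample strings and variable names are assumptions for illustration only; the two patterns are expected to behave identically because {0,1} and ? mean the same thing.

```javascript
// The pattern from the breakdown, and the equivalent shorthand form.
const withQuantifier = /(https{0,1}:\/\/)/;
const withQuestionMark = /(https?:\/\/)/;

// Hypothetical inputs: one per scheme, plus one without any scheme.
const samples = ['http://example.com/123', 'https://examples.com', 'example.com'];

const results = samples.map((s) => {
  const m = s.match(withQuestionMark);
  return {
    // Group 1 holds the captured scheme, or null when nothing matched.
    scheme: m ? m[1] : null,
    // Replacing the match with an empty string removes the scheme.
    stripped: s.replace(withQuantifier, ''),
  };
});

console.log(results);
```

The capture group is only needed if the matched scheme is read back afterwards; for plain removal the parentheses can be dropped.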


If you'd like to match characters like () and -, you can do so by escaping them, too (strictly speaking, the hyphen only needs escaping inside a character class):

\(\)\-    matches "()-" literally
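
A small sketch of how escaping behaves, with made-up sample strings and names: parentheses must be escaped to be matched literally, while a hyphen is only special inside a character class, where it otherwise forms a range.

```javascript
// Escaped parentheses match literally; the hyphen needs no escape here.
const literal = /\(\)-/;
const literalHit = literal.test('call()-now'); // true

// Inside a class, an unescaped hyphen between two characters is a range...
const range = /^[a-z]$/;
// ...while an escaped hyphen is just the "-" character itself.
const escapedHyphen = /^[a\-z]$/;

const rangeMatchesM = range.test('m');            // true: m is within a..z
const escapedMatchesM = escapedHyphen.test('m');  // false: only a, -, z allowed
const escapedMatchesDash = escapedHyphen.test('-'); // true

console.log(literalHit, rangeMatchesM, escapedMatchesM, escapedMatchesDash);
```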

Good luck!

Nick Reed