
I have a regex problem: I need to capture the host names in a set of links (example outputs: github.com, medium.com, www.nytimes.com, www.theguardian.com, techcrunch.com). The problem is that some links don't have "www". This is my regex so far:

    https?:\/\/([w]{3}?\.?)

My idea was to make the "www" optional, but I don't want to capture it. Thanks!

edit: I found the solution!

    https?:\/\/([\w\-\.]+)
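
For anyone who wants to try the pattern from the edit, here is a minimal Python sketch. The sample links and the names `HOST_PATTERN` and `extract_host` are made up for illustration; only the regex itself comes from the question.

```python
import re

# Pattern from the edit above: group 1 captures the whole host name,
# including a leading "www." when the link has one.
HOST_PATTERN = re.compile(r"https?://([\w\-.]+)")

# Hypothetical sample links, invented for this example.
links = [
    "https://github.com/user/repo",
    "https://medium.com/some-article",
    "https://www.nytimes.com/section/world",
    "http://www.theguardian.com/uk",
    "https://techcrunch.com/startups/",
]


def extract_host(link):
    """Return the captured host name, or None if the link does not match."""
    match = HOST_PATTERN.search(link)
    return match.group(1) if match else None


hosts = [extract_host(link) for link in links]
print(hosts)
```

Note that the pattern is case sensitive, so a link starting with `HTTP://` would not match unless `re.IGNORECASE` is passed to `re.compile`.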
MrFapoh
  • 1
    "Wont work" doesn't tell us what wrong with it. What are you trying to match? What are you trying to capture? Is the problem that it didn't match the `.com` and `.cc` too? Or is the problem that it did match the `.com` on one of them? Is the problem that it is case sensitive on the `http` part? Your question is unclear, so please **edit** the question and clarify what you're actually trying to match. – Andreas Mar 17 '20 at 03:25
  • 1
    While you are at it, please replace that image with actual text, so we can copy/paste your regex and/or the texts to search. [Why not upload images of code on SO when asking a question?](https://meta.stackoverflow.com/q/285551/5221149) – Andreas Mar 17 '20 at 03:25
  • Why not use [`urllib.parse`](https://docs.python.org/3/library/urllib.parse.html#module-urllib.parse) to parse the URL? – Andreas Mar 17 '20 at 03:41
  • It worked now. Sorry, I just didn't notice the uppercase that is why it does not match. Thanks for pointing it out. – MrFapoh Mar 17 '20 at 03:42
  • You still haven't clarified what it is you want, and you still haven't replaced the image with text we can use. --- What about `www.patient.info`, or `www.cancer.patient.info`, or `amazon.com`, or `shop.amazon.co.uk`, just to give a few examples? --- You say you want the "domain names", but the domain name is the full name, from `//` until the next `/`. – Andreas Mar 17 '20 at 04:01
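
Following up on the `urllib.parse` suggestion in the comments, a minimal sketch of that route might look like the following. The helper name `host_of` and the sample URLs (including their paths and the port) are assumptions for illustration; `urlparse` and its `hostname` attribute are part of the standard library.

```python
from typing import Optional
from urllib.parse import urlparse


def host_of(url: str) -> Optional[str]:
    """Return the host part of a URL (lower-cased, without any port), or None."""
    return urlparse(url).hostname


# Hypothetical URLs built from the host names mentioned in the comments.
print(host_of("https://www.cancer.patient.info/some/page"))
print(host_of("HTTP://Shop.Amazon.co.uk:443/basket"))
```

Because `hostname` is lower-cased and the scheme is handled case-insensitively, this route also sidesteps the uppercase issue mentioned above.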

1 Answer


Your solution achieves the desired effect, but if the goal is to group the "www" part without capturing it, the usual way is a non-capturing group, written `(?:...)`.

That would turn this:

    https?:\/\/([w]{3}?\.?)

Into this:

    https?:\/\/(?:[w]{3}?\.?)
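
To see what the non-capturing group changes, here is a small Python sketch. One caveat: in Python's `re`, `{3}?` is a lazy "exactly three", not "optional", so the pattern above still requires the three `w`s. The sketch therefore swaps in `(?:www\.)?` for the optional prefix and adds a host-capturing group borrowed from the question's edit; both are assumptions on my part, and the sample URLs and variable names are made up.

```python
import re

# Capturing: the optional "www." becomes group 1, the rest of the host group 2.
capturing = re.compile(r"https?://(www\.)?([\w\-.]+)")

# Non-capturing: "(?:...)" groups without capturing, so the host is group 1.
non_capturing = re.compile(r"https?://(?:www\.)?([\w\-.]+)")

url = "https://www.nytimes.com/section/world"  # hypothetical sample link

with_www_groups = capturing.search(url).groups()
without_www_groups = non_capturing.search(url).groups()
print(with_www_groups)     # ('www.', 'nytimes.com')
print(without_www_groups)  # ('nytimes.com',)

# A link without "www" still matches, because the whole prefix is optional.
bare_host = non_capturing.search("https://github.com/user/repo").group(1)
print(bare_host)           # github.com

# The pattern exactly as written above does not match a link without "www".
as_written = re.compile(r"https?:\/\/(?:[w]{3}?\.?)")
no_match = as_written.search("https://github.com/user/repo")
print(no_match)            # None
```

This variant drops the "www." from the result. If the "www." should stay in the captured host, as in the example outputs in the question, the pattern from the question's edit is already enough.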
Anastasiosyal