20

I am trying to match a simple domain, example.com, in all of its variations (with or without www, over http or https).

How would I write a regex that covers:

https://example.com
http://www.example.com
etc.
Ates Goral
Andy
  • 1
    Do you have to use a single regex? Using an existing URL parser and then looking at the parts individually would be less error prone. – mu is too short Jan 12 '12 at 05:18
  • @muistooshort well originally I had `/^https?:\/\/.*?\.?facebook\.com\//` but thought it might not work in all cases ? – Andy Jan 12 '12 at 05:19
  • That would let some invalid URLs through (such as `http://a_b.facebook.com/`) but that might not be a problem. – mu is too short Jan 12 '12 at 05:54
  • What did you try? What issues did you have with your attempt? This is not a tutorial site or a free code-writing service. Please read https://StackOverflow.com/help/how-to-ask, then edit your question to bring it into compliance with the question types SO is designed to handle. – SherylHohman Aug 14 '20 at 13:23
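The URL-parser approach suggested in the first comment could look roughly like the sketch below. It is not code from the thread: the helper name `isExampleDomain` is hypothetical, and treating scheme-less input such as `www.example.com` as http is an assumption made only for illustration. It relies on the standard `URL` class available in browsers and Node.js.

```javascript
// Hypothetical helper: true when the host is example.com or one of its subdomains.
function isExampleDomain(input, domain = 'example.com') {
  let url;
  try {
    // Assumption: input without a scheme (e.g. "www.example.com") is treated as http.
    const hasScheme = /^[a-z][a-z\d+.-]*:\/\//i.test(input);
    url = new URL(hasScheme ? input : 'http://' + input);
  } catch (e) {
    return false; // not parseable as a URL
  }
  // Only accept web URLs.
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return false;
  const host = url.hostname.toLowerCase();
  // Exact match, or a subdomain ending in ".example.com".
  return host === domain || host.endsWith('.' + domain);
}

console.log(isExampleDomain('http://www.example.com'));                   // true
console.log(isExampleDomain('http://www.google.com/http://example.com')); // false
```

Checking the parsed hostname avoids having to account for paths, ports and query strings in the pattern itself.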

6 Answers

14
^https?://([\w\d]+\.)?example\.com$

In JavaScript:

var result = /^https?:\/\/([a-zA-Z\d-]+\.){0,}example\.com$/.test('https://example.com');
// result is either true or false

I improved the pattern so that it also matches nested subdomains such as "http://a.b.example.com".
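As a quick check, the JavaScript pattern from this answer can be run against a few inputs. The sample list and the variable names are illustrative only and are not part of the original answer.

```javascript
// The pattern from the answer: http or https, then zero or more subdomain labels.
const pattern = /^https?:\/\/([a-zA-Z\d-]+\.){0,}example\.com$/;

const samples = [
  'https://example.com',
  'http://www.example.com',
  'http://a.b.example.com',
  'http://example.com/',   // trailing slash: rejected because of the $ anchor
  'http://notexample.com', // a different domain
];

const results = samples.map((s) => pattern.test(s));
console.log(results); // [ true, true, true, false, false ]
```

Because the pattern ends with `$`, any URL that carries a path is rejected, which is the problem discussed in the comments below.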

itea
  • 1
    You might want to adjust that character class. Not all `\w` are valid in a domain name (`_` in particular) and hyphens *are* allowed. – mu is too short Jan 12 '12 at 05:17
  • @itea - thanks - but can you add in javascript escapes for me ? – Andy Jan 12 '12 at 05:17
  • @itea thanks again for your help - but i'm still having probs. see http://jsfiddle.net/v4WVU/ – Andy Jan 12 '12 at 05:36
  • @Andy Because `$` matches the end of the string/line, so just remove it. Use this: if (/^(?:http(?:s)?:\/\/)?(?:[^\.]+\.)?jsfiddle\.net/.test(window.location.href)) { alert('works'); } // This doesn't seem to work in jsfiddle, but it works in the Chrome JavaScript console. – itea Jan 12 '12 at 05:43
  • 1
    Hello! While this code may solve the question, including an explanation of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations & give an indication of what limitations & assumptions apply. If the Q has already been answered, a link to that SO post should be written in the comments , or report the post as a duplicate, instead of re-answering for SO to function as designed – SherylHohman Aug 14 '20 at 13:24
13

You can use this regex to match just the domain name part of a URL:

/^(?:https?:\/\/)?(?:[^.]+\.)?example\.com(\/.*)?$/

It will match any of following strings:

https://example.com
http://www.example.com
http://example.com
www.example.com
example.com

 

RegEx Demo

RegEx Details:

  • ^: Start of the string
  • (?:https?:\/\/)?: Optionally match http:// or https://
  • (?:[^.]+\.)?: Optionally match one label, i.e. any text up to the next dot, followed by that dot
  • example\.com: Match the literal text example.com
  • (\/.*)?: Optionally match / followed by 0 or more of any characters
  • $: End of the string
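The breakdown above can be exercised with a short sketch. The sample inputs and variable names are illustrative and not from the answer. The last input shows a limitation worth knowing: `[^.]+` accepts any non-dot characters, so text that is not a valid scheme or label can slip through.

```javascript
// The pattern from the answer, written as a complete regex literal.
const domainPattern = /^(?:https?:\/\/)?(?:[^.]+\.)?example\.com(\/.*)?$/;

const accepted = [
  'https://example.com',
  'http://www.example.com',
  'www.example.com',
  'example.com',
  'http://example.com/some/path',
].map((s) => domainPattern.test(s));

const rejected = [
  'http://example.org',
  'http://a.b.example.com', // only one optional label is allowed
  'http://notexample.com',
].map((s) => domainPattern.test(s));

// Limitation: "ftp://x" contains no dot, so [^.]+ consumes it as if it were a label.
const surprising = domainPattern.test('ftp://x.example.com');

console.log(accepted, rejected, surprising);
```

If stricter matching is needed, replacing `[^.]+` with a class such as `[a-zA-Z\d-]+` is one possible adjustment.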
anubhava
  • Actually it works, see here: http://jsfiddle.net/v4WVU/3/ You were trying to match `location.href`, which was `http://fiddle.jshell.net/_display/`. As I wrote above, the regex is for matching the domain name, not the full URL. – anubhava Jan 12 '12 at 06:29
  • 2
    Hello! While this code may solve the question, including an explanation of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations & give an indication of what limitations & assumptions apply. If the Q has already been answered, a link to that SO post should be written in the comments , or report the post as a duplicate, instead of re-answering for SO to function as designed – SherylHohman Aug 14 '20 at 13:25
  • 2
    Very valid point @SherylHohman, and apologies for not adding an explanation when I posted this. I have now added a working demo and explanation in my answer. Thanks! – anubhava Aug 14 '20 at 14:13
6

A more generic example I used:

/http(?:s)?:\/\/(?:[\w-]+\.)*([\w-]{1,63})(?:\.(?:\w{3}|\w{2}))(?:$|\/)/i

Note that this solution doesn't pick up the correct label when the suffix has two parts, such as the 5-character co.uk. Example:

http://mylabel.co.uk

Would be picked up as 'co' instead of 'mylabel', but

http://mylabel.co

would be matched correctly as 'mylabel'. The regex was good enough for me even with this limitation.

Note that the 63-character limit for a label comes from the DNS specification (RFC 1035). Hope this helps anyone looking for the same answer in the future.
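The behaviour described above, including the co.uk limitation, can be reproduced with a small sketch. The helper name `extractLabel` is hypothetical and used only for illustration.

```javascript
// The regex from this answer; capture group 1 holds the label before the suffix.
const labelPattern = /http(?:s)?:\/\/(?:[\w-]+\.)*([\w-]{1,63})(?:\.(?:\w{3}|\w{2}))(?:$|\/)/i;

// Hypothetical helper: returns the captured label, or null when nothing matches.
function extractLabel(url) {
  const match = labelPattern.exec(url);
  return match ? match[1] : null;
}

console.log(extractLabel('http://mylabel.co'));            // "mylabel"
console.log(extractLabel('http://mylabel.co.uk'));         // "co" (the known limitation)
console.log(extractLabel('https://www.example.com/path')); // "example"
```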

Daniel
2

The following works in Java,

^(http:|https:|)[/][/]([^/]+[.])*example[.]com$

and matches your test cases, and doesn't match cases like

http://www.google.com/http://example.com
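For consistency with the rest of the thread, here is a rough JavaScript port of the same idea, not the Java code itself. As an adjustment, the dot in example.com is written as `[.]` so that it only matches a literal dot; the variable names are illustrative.

```javascript
// JavaScript port of the pattern; built from a string so the slashes need no escaping.
const hostPattern = new RegExp('^(http:|https:|)[/][/]([^/]+[.])*example[.]com$');

// [^/]+ cannot cross a slash, so a URL embedded in another URL's path is rejected.
const checks = {
  plain: hostPattern.test('https://example.com'),
  www: hostPattern.test('http://www.example.com'),
  schemeRelative: hostPattern.test('//example.com'),
  embedded: hostPattern.test('http://www.google.com/http://example.com'),
};

console.log(checks);
// { plain: true, www: true, schemeRelative: true, embedded: false }
```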

Brad Parks
2

This matches any of the variations below, plus anything that follows .com:

https://example.com
https://www.example.com
http://www.example.com
http://example.com
www.example.com
example.com

Result will be either true or false

const result = /^(http(s)?(:\/\/))?(www\.)?example\.com(\/.*)?$/.test(value); 
K20GH
1

The expression below matches http/https/ftp in the first part. It can also match any other run of up to 5 characters, such as ahfzc, but that would rarely be the case, and such input would be ignored by the later parts of the expression.

The second part matches ww/www, and the next part matches any alphanumeric label followed by '.'. The last part matches up to 3 characters, like .com, .in, .org, etc.

Try this:

r'[a-z0-9]{0,5}[\:\/]+[w]{0,3}[\.]+[a-z0-9\-]+[\.]+[a-z0-9]{0,3}'
solve it
  • Your regex matches `:://:://...-...`, which is not a valid domain. But it doesn't match `stackoverflow.com` or `example.enterprise`, which are valid. – Toto Aug 15 '20 at 09:44
  • Correct. We can use this to include stackoverflow.com: `domain_exp = r'[a-z0-9]{0,5}[\:\/]?[w]{0,3}[\.]?[a-z0-9\-]+[\.]+[a-z0-9]{0,3}'`. But this would make it too generic and include a lot of false positives. You will have to tweak your solution based on the type of data you get, because I believe you can't get 100% accuracy with re. In my case the above works best. – solve it Aug 15 '20 at 13:34
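The behaviour discussed in these comments can be checked with a short sketch. It uses the original expression from the answer, moved from a Python raw string into a JavaScript regex literal (an assumption made to keep one language across the thread; these constructs behave the same in both engines). The variable names are illustrative.

```javascript
// The answer's original expression as a JavaScript regex (unanchored, like re.search).
const genericPattern = /[a-z0-9]{0,5}[\:\/]+[w]{0,3}[\.]+[a-z0-9\-]+[\.]+[a-z0-9]{0,3}/;

const outcome = {
  withWww: genericPattern.test('http://www.example.com'),
  // No dot follows "://", so the mandatory [\.]+ after [w]{0,3} cannot match.
  withoutWww: genericPattern.test('http://example.com'),
  // No ":" or "/" at all, so [\:\/]+ cannot match.
  bareDomain: genericPattern.test('stackoverflow.com'),
  // Punctuation only, yet every mandatory part finds something to match.
  garbage: genericPattern.test(':://:://...-...'),
};

console.log(outcome);
// { withWww: true, withoutWww: false, bareDomain: false, garbage: true }
```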