I want to achieve the following results with a regular expression:

http://articles-test.mer.com --> should not match/accept or return false

http://articles-test.mer.com/ --> should not match/accept or return false

http://articles-test. mer.com/ --> should not match/accept or return false

http://articles-test. mer.com/sites --> should not match/accept or return false

http://articles-test.mer.com/sites --> should match/accept or return true

http://foodfacts.merc.com/green-tea.html --> should match/accept or return true  

http://articles-test.merc.com/sites/abc.aspx --> should match/accept or return true  

In short: if the URL contains only the domain, it should not match.

I tried the expression below, but it does not work as expected:

^http(s)?://([\w-]+.)+[\w-]+(/[\w- ./?])?$

Please suggest a fix. Thanks in advance!

  • You need to escape the dot: `[\w-]+\.`. – MakePeaceGreatAgain Dec 22 '17 at 07:18
  • @HimBromBeere Could you please highlight which part of the expression to add or exclude? –  Dec 22 '17 at 07:22
  • You've said a lot about what you *don't* want to match but haven't clearly stated what you *do* want to match. Are you sure Regex is the right tool for the job here? It looks like you're trying to work with some form of URIs - could you not `TryCreate` a `Uri` and inspect its various properties? – Damien_The_Unbeliever Dec 22 '17 at 07:31
  • @Damien_The_Unbeliever Already got an answer, thanks! –  Dec 22 '17 at 07:39
  • That doesn't mean you shouldn't care whether your question can be made better. Regardless of any answers, your question should always be clear and easy to understand. However, your question is quite good in comparison to other stuff we read here every day. – MakePeaceGreatAgain Dec 22 '17 at 07:40

2 Answers

2

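You need to escape the dot, because an unescaped dot matches any single character. I escaped the slashes as well; .NET does not require that, but it is harmless and keeps the pattern usable in flavors where / delimits the regex. So your regex becomes this: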

^http(?:s)?:\/\/(?:[\w-]+\.?)+\/[\w-\.]+(\/[\w-])?$

So the \/\/ literally matches //, whereas \. matches the dot.
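To make the effect of the escaped dot concrete, here is a minimal sketch. The class and method names (EscapedDotDemo, Loose, Strict) and the shortened patterns are made up for illustration and are not part of the regex above.

```csharp
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original answer.
public static class EscapedDotDemo
{
    // Unescaped dot: matches any single character except a newline,
    // so a space or a letter in that position is accepted too.
    public static bool Loose(string s) => Regex.IsMatch(s, @"^mer.com$");

    // Escaped dot: matches only a literal '.'.
    public static bool Strict(string s) => Regex.IsMatch(s, @"^mer\.com$");
}
```

Both methods accept "mer.com", but only Loose accepts "mer com" or "merXcom", which is why the unescaped version lets the URLs with a space slip through.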

I also added some non-capturing groups (?:). If you want to capture the individual parts, just omit the ?: from those groups.

Check it out on regex101.

EDIT: I've added a \. to the part after the /, so that you can also match files instead of only directories in your URL.

EDIT2: You should definitely consider checking whether a given string is a valid URL with Uri.TryCreate, as shown in this post, instead of re-inventing the wheel with a hard-to-understand regular expression.

Uri uriResult;
bool result = Uri.TryCreate(myString, UriKind.Absolute, out uriResult) 
    && uriResult.Scheme == Uri.UriSchemeHttp;
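
The snippet above only validates the URL; it does not yet reject a domain-only URL. One way to cover that, sketched under a few assumptions: the names UrlPathCheck and HasPathBeyondDomain are hypothetical, https is accepted because the question's pattern uses http(s)?, and whitespace is rejected up front because I am not certain how Uri.TryCreate treats an embedded space on every runtime.

```csharp
using System;

// Hypothetical helper class, not part of the original answer.
public static class UrlPathCheck
{
    // True only for absolute http/https URLs that have a path beyond the domain.
    public static bool HasPathBeyondDomain(string candidate)
    {
        // Assumption: any space makes the URL invalid, as in the question's examples.
        if (string.IsNullOrEmpty(candidate) || candidate.Contains(" "))
            return false;

        if (!Uri.TryCreate(candidate, UriKind.Absolute, out Uri uri))
            return false;

        bool isHttp = uri.Scheme == Uri.UriSchemeHttp
                   || uri.Scheme == Uri.UriSchemeHttps;

        // AbsolutePath is "/" for both "http://host" and "http://host/",
        // so anything longer means there is a real path segment.
        return isHttp && uri.AbsolutePath.Length > 1;
    }
}
```

With this sketch, http://articles-test.mer.com and http://articles-test.mer.com/ are rejected, while http://articles-test.mer.com/sites is accepted. A URL such as http://host/?q=1 is also rejected, because its AbsolutePath is still "/"; whether that is desirable depends on your requirements.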
MakePeaceGreatAgain
  • But when I enter the URL `http://foodfacts.mercola.com/green-tea.html` it does not match; in fact, it should be valid –  Dec 22 '17 at 10:36
  • No match is found for `http://articles-test.merc.com/sites/abc.aspx`, which should return true. I have also updated the question –  Dec 22 '17 at 10:57
  • @PPB Could you please stop constantly adding new conditions to your question? That makes it impossible to really *answer* it. What about `http://articles-test.merc.com/sites/subsite/abc.aspx`? Or dynamic content: `http://articles-test.merc.com/sites/abc.aspx?myparam=1`? You see, it gets more and more complex depending on how many parts you have in your URL. – MakePeaceGreatAgain Dec 22 '17 at 11:08
1

You can use this regex:

^http(s)?://[^/\s]+/.+$
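
A small sketch for trying this pattern against the URLs from the question. The class name UrlRegexDemo, its members, and the sample list are my own scaffolding; only the pattern itself comes from this answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical test scaffolding around the pattern from this answer.
public static class UrlRegexDemo
{
    // [^/\s]+ consumes the host (no slash, no whitespace); /.+ then
    // requires a slash followed by at least one more character.
    private static readonly Regex UrlWithPath =
        new Regex(@"^http(s)?://[^/\s]+/.+$");

    public static readonly string[] Samples =
    {
        "http://articles-test.mer.com",
        "http://articles-test.mer.com/",
        "http://articles-test. mer.com/",
        "http://articles-test. mer.com/sites",
        "http://articles-test.mer.com/sites",
        "http://foodfacts.merc.com/green-tea.html",
        "http://articles-test.merc.com/sites/abc.aspx",
    };

    public static bool IsMatch(string url) => UrlWithPath.IsMatch(url);

    // Prints each sample URL with its match result.
    public static void PrintSamples()
    {
        foreach (string url in Samples)
            Console.WriteLine($"{IsMatch(url),-5} {url}");
    }
}
```

Calling UrlRegexDemo.PrintSamples() should report False for the first four URLs and True for the last three. Note that .+ accepts any characters after the first slash, including spaces, so the pattern checks only that a path exists, not that it is well formed.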
karthik selvaraj