-1

I have just started learning how to use regexes, and I am trying to create one to match URLs. So far I have:

(http://|https://|www|\w)+\.[\w]{2,4}[^\s]+

Can anyone give me some feedback or advice on how this looks, or maybe point me in a better direction?

NullPoiиteя
mcbeav
  • do you have a specific question? – driangle Feb 06 '12 at 18:02
  • 1
    Honestly I would look online for a good one. URLs aren't incredibly complex, but there's always some fringe case you wouldn't think of. – Mike Park Feb 06 '12 at 18:03
  • 1
    possible duplicate of [What is the best regular expression to check if a string is a valid URL?](http://stackoverflow.com/questions/161738/what-is-the-best-regular-expression-to-check-if-a-string-is-a-valid-url) – mario Feb 06 '12 at 18:04
  • Well, I am just learning regexes and want to know if this is legit. It checks out, but it will grab a few extra characters. Can anyone make any modifications to this to make it better? – mcbeav Feb 06 '12 at 18:06
  • This question has been asked so many times. See the related questions part of the page on the bottom right. – Madara's Ghost Feb 06 '12 at 18:16

3 Answers

1

The regex you wrote is a good start. Regular expressions are very powerful, but keep in mind that URLs can vary a lot in structure and still be considered valid.

Consult a Regex Reference to learn what each of the characters means.

You can also see what the community has come up with by going to a site like this one.

This may be overly complex for what you need, but this is a well-written regex for URL matching:

^(((ht|f)tp(s?))\://)?(www.|[a-zA-Z].)[a-zA-Z0-9\-\.]+\.(com|edu|gov|mil|net|org|biz|info|name|museum|us|ca|uk)(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\;\?\'\\\+&%\$#\=~_\-]+))*$
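One thing to be aware of with this pattern is that it only accepts the top-level domains it lists. A minimal Python sketch of that behaviour follows; the helper name `accepts` and the example hostnames are made up for illustration.

```python
import re

# The pattern from this answer, copied as-is and split over two lines.
PATTERN = re.compile(
    r"^(((ht|f)tp(s?))\://)?(www.|[a-zA-Z].)[a-zA-Z0-9\-\.]+"
    r"\.(com|edu|gov|mil|net|org|biz|info|name|museum|us|ca|uk)"
    r"(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\;\?\'\\\+&%\$#\=~_\-]+))*$"
)

def accepts(url):
    """Hypothetical helper: True if the whole string matches the pattern."""
    return PATTERN.search(url) is not None

print(accepts("http://www.example.com/path"))  # True
print(accepts("http://www.example.io"))        # False: "io" is not in the list
```

The second URL is rejected only because its top-level domain is missing from the alternation, so the list would need to be extended for other domains.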

On a side note, I use this online Regex Helper to test my regex strings without having to run them from code.
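If an online tester is not at hand, a few lines of Python can do a similar job. This is a minimal sketch that runs the pattern from the question against some made-up sample strings; the helper name `first_match` is an assumption for illustration.

```python
import re

# Pattern copied from the question.
PATTERN = re.compile(r"(http://|https://|www|\w)+\.[\w]{2,4}[^\s]+")

# Made-up sample inputs for illustration.
SAMPLES = [
    "see http://example.com/page, thanks",
    "open file.txt now",
    "no links here",
]

def first_match(text):
    """Return the first substring the pattern matches, or None."""
    m = PATTERN.search(text)
    return m.group(0) if m else None

for s in SAMPLES:
    print(repr(s), "->", first_match(s))
```

The first sample yields `http://example.com/page,` with the trailing comma included, because `[^\s]+` consumes every non-whitespace character. The second yields `file.txt`, because the `\w` alternative lets any dotted word through. The third yields `None`.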

BigFatBaby
0

Give this one a try:

^(ht|f)tp(s?)\:\/\/[0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*(:[0-9]*)*(\/?)([a-zA-Z0-9\-\.\?\,\'\/\\\+&%\$#_]*)?$

Note that depending on how you are using it, you might want to remove the ^ and $ from the beginning and end: keep them to validate a whole string, and drop them to find URLs inside longer text.
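The difference the anchors make can be sketched in Python as follows. The port part is written here with the character class `[0-9]`, which is an assumption about the intent; the helper names `is_url` and `find_url` and the sample text are made up for illustration.

```python
import re

# Body of the pattern from this answer, without anchors. The port part
# uses the character class [0-9]; that reading is an assumption.
BODY = (
    r"(ht|f)tp(s?)\:\/\/[0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*"
    r"(:[0-9]*)*(\/?)([a-zA-Z0-9\-\.\?\,\'\/\\\+&%\$#_]*)?"
)

ANCHORED = re.compile("^" + BODY + "$")  # whole string must be a URL
UNANCHORED = re.compile(BODY)            # URL may appear anywhere

def is_url(text):
    """Hypothetical helper: validate that the entire string is a URL."""
    return ANCHORED.search(text) is not None

def find_url(text):
    """Hypothetical helper: pull the first URL out of longer text."""
    m = UNANCHORED.search(text)
    return m.group(0) if m else None

sentence = "Visit http://example.com/docs today"
print(is_url("http://example.com/docs"))  # True
print(is_url(sentence))                   # False
print(find_url(sentence))                 # http://example.com/docs
```

With the anchors in place the sentence is rejected because it contains text around the URL; without them the URL is extracted from the middle of it.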

JamieSee
0

Plenty of stuff on the web...

Community