2

I need a regular expression to validate an absolute URL OR a relative URL.

  • http://www.example.com should pass
  • /example/some.pdf should pass
  • /example.com/some.pdf should fail
  • example should fail
Brad Rhoads
  • possible duplicate of [What is the best regular expression to check if a string is a valid URL](http://stackoverflow.com/questions/161738/what-is-the-best-regular-expression-to-check-if-a-string-is-a-valid-url) – CrayonViolent Mar 24 '11 at 15:20
  • 3
    All of the examples **are** valid URLs. – well, in fact all but the last are valid **absolute** URLs (you seem to confuse relative, absolute and not having a protocol schema) and only the last is a (valid!) relative URL. In summary, it’s not clear what you’re after. – Konrad Rudolph Mar 24 '11 at 15:20
  • flaw in #4: I'm on page "http://www.somesite.com/blah.html". If I have a link on a page like ... It will take me to "http://www.somesite.com/example" which can be a valid url that ultimately points to "http://www.somesite.com/example/index.html" or anything you may redirect it to with mod_rewrite...and on that note, you can redirect anything to anything with mod_rewrite, so IOW there is no such thing as a strictly "invalid" url – CrayonViolent Mar 24 '11 at 15:25
  • I want to correct @KonradRudolph in their suggestion all but the last url are absolute urls given the two beginning with a slash are known as _root-relative_ urls. – vhs Feb 20 '22 at 16:12
  • @vhs Neither [the official specification](https://url.spec.whatwg.org/) (nor [the original RFC](https://datatracker.ietf.org/doc/html/rfc1738)) makes mention of “root-relative URLs” (and in fact examples 2 & 3 are called “path-absolute-URL strings”). That said, you are right: confusingly, according to the spec, path-absolute-URL strings are *relative* URLs. Surprisingly, [MDN makes the same mistake as I](https://developer.mozilla.org/en-US/docs/Learn/Common_questions/What_is_a_URL#examples_of_absolute_urls). It’s a shame that my previous comment has stood unchallenged for more than 10 years. – Konrad Rudolph Feb 20 '22 at 16:40

2 Answers

3
(http://|/)[^ :]+

Also note that /example.com/some is a valid URL: a folder name can contain a period, and not all files have extensions.

Depending on the system, you may well need to add more characters to the negated character class.
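To see how the pattern above behaves on the four strings from the question, here is a minimal Python sketch. The function name `looks_like_url` is made up for illustration, and the use of `fullmatch` (so that the whole string has to match) is an assumption, since the answer does not say whether the pattern should be anchored.

```python
import re

# Pattern taken verbatim from the answer. fullmatch() is an assumption:
# it requires the entire string to match, not just a prefix.
URL_PATTERN = re.compile(r"(http://|/)[^ :]+")


def looks_like_url(candidate: str) -> bool:
    """Return True if the whole string matches the answer's pattern."""
    return URL_PATTERN.fullmatch(candidate) is not None


if __name__ == "__main__":
    for text in (
        "http://www.example.com",
        "/example/some.pdf",
        "/example.com/some.pdf",
        "example",
    ):
        print(f"{text!r}: {looks_like_url(text)}")
```

The first three strings match and `example` does not, so `/example.com/some.pdf` is accepted, consistent with the point that it is a valid URL. Because the pattern only allows `http://` and excludes colons afterwards, strings such as `https://www.example.com` or a URL with a port number would be rejected unless the pattern is extended.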

The easiest way to do this would be to request the file metadata from the server to check whether the resource actually exists, but that wouldn't use regex at all.
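One way to ask the server for metadata only is an HTTP `HEAD` request. The sketch below uses Python's standard `urllib`. The function name `resource_exists`, the `opener` parameter (there so the check can be tried without a network) and the choice to treat any 2xx or 3xx status as "exists" are all assumptions for illustration. It also needs an absolute URL, so a relative URL would first have to be resolved against a base.

```python
import urllib.error
import urllib.request


def resource_exists(url: str, opener=urllib.request.urlopen, timeout: float = 5.0) -> bool:
    """Send a HEAD request and report whether the server answers with 2xx/3xx.

    `opener` defaults to urllib.request.urlopen; it is a parameter only so
    that a stand-in can be supplied when no network is available.
    """
    # HEAD asks for the response headers only, not the body.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except urllib.error.URLError:
        # Covers HTTPError (4xx/5xx) as well as DNS and connection failures.
        return False
```

A call would look like `resource_exists("http://www.example.com/example/some.pdf")`. Some servers do not answer `HEAD` requests, so a `False` here is a hint, not proof that the resource is missing.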

J V
1

As Konrad Rudolph correctly points out: ALL of the examples in the OP are valid URLs! And did you know that an empty string is also a valid URL? It's true! If you really want to validate a URL, or shall I say URI, the first document to read is RFC 3986, Uniform Resource Identifier (URI): Generic Syntax. I have been looking into this issue a bit and have written (am writing) an article on the subject:

Regular Expression URI Validation

Note that this only deals with a generic URI. You will probably want to look into further requirements specific to the schemes you are interested in (e.g. HTTP), such as requiring that the host name be a valid DNS host name.
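As a rough illustration of why every string in the question, and even the empty string, counts as a URI reference, RFC 3986 Appendix B gives a regular expression that splits a reference into its components. The sketch below uses that expression in Python. The helper names `split_uri_reference` and `classify` are invented here, and this is not the regex from the linked article. The Appendix B expression is a parser, not a validator: it matches any input and only says which components are present.

```python
import re

# Component-splitting regex from RFC 3986, Appendix B.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri_reference(text: str) -> dict:
    """Split a URI reference into the five generic components."""
    match = URI_REFERENCE.match(text)  # every group is optional, so this always matches
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


def classify(text: str) -> str:
    """Name the kind of reference, using the terms of RFC 3986 section 4.2."""
    parts = split_uri_reference(text)
    if parts["scheme"] is not None:
        return "URI with scheme"
    if parts["authority"] is not None:
        return "network-path reference"
    if parts["path"].startswith("/"):
        return "absolute-path reference"
    return "relative-path reference"


if __name__ == "__main__":
    for text in ("http://www.example.com", "/example/some.pdf",
                 "/example.com/some.pdf", "example", ""):
        print(f"{text!r}: {classify(text)}")
```

Under this reading, `/example/some.pdf` and `/example.com/some.pdf` land in the same category, so telling them apart has to come from an application-specific rule and not from generic URI syntax.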

ridgerunner