6

Can anyone tell me which characters are invalid in an HTTP URL, and the best way to validate them in Java? What I am looking for is validation of the URLString part of a URL in the format: http(s)://ip:port/URLString

Thanks in advance.

Newbie

4 Answers

4

You can use any Unicode character you want, as long as it is percent-encoded. The explicitly reserved characters are defined in section 2.2 of RFC 3986: https://www.rfc-editor.org/rfc/rfc3986#section-2

From the document:

  reserved    = gen-delims / sub-delims

  gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

  sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
              / "*" / "+" / "," / ";" / "="
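For the URLString part of the question, one way to apply this grammar is a regular expression built from the RFC 3986 path rules (pchar = unreserved / pct-encoded / sub-delims / ":" / "@"). The following is only a minimal sketch: the class name `Rfc3986PathCheck` and the method `isValidPath` are made up for illustration, and it assumes the string is just the path, with no query or fragment.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not from any library): checks a URL path against the RFC 3986 path grammar. */
final class Rfc3986PathCheck {
    // pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
    private static final String PCHAR =
        "(?:[A-Za-z0-9\\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})";

    // path-abempty = *( "/" segment ), where segment = *pchar
    private static final Pattern PATH = Pattern.compile("(?:/" + PCHAR + "*)*");

    /** Returns true if the path uses only characters allowed in a path, with well-formed % escapes. */
    static boolean isValidPath(String path) {
        return path != null && PATH.matcher(path).matches();
    }
}
```

With this sketch, `/a/b%20c` is accepted, while `/a b` (raw space), `/a%2` (truncated escape) and `/a?x` (query delimiter) are rejected.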
Mikola
1

According to RFC 1738 the following are deemed unsafe:

  1. < and > - used as delimiters around URLs in free text
  2. " (double quote) - delimits URLs in some systems
  3. # - delimits a URL from a fragment/anchor identifier that might follow it
  4. % - used to indicate character encodings

General unsafe characters: { } | \ ^ ~ [ ] `
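As a rough illustration, the list above can be turned into a small scan that reports which unsafe characters appear in a raw (not yet encoded) string. This is a sketch under assumptions: `Rfc1738Unsafe` and `unsafeCharsIn` are hypothetical names, and the check also flags the space character, control characters and non-ASCII characters, which RFC 1738 likewise requires to be encoded even though they are not listed above. Because % is on the list, an already-encoded string will be flagged too.

```java
/** Hypothetical helper (not from any library): reports RFC 1738 "unsafe" characters in raw text. */
final class Rfc1738Unsafe {
    // Space, the four characters listed above, and the "general unsafe" set.
    private static final String UNSAFE = " <>\"#%{}|\\^~[]`";

    /** Returns each distinct unsafe character found in s, in order of first appearance. */
    static String unsafeCharsIn(String s) {
        StringBuilder found = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean control = c < 0x20 || c == 0x7F;   // US-ASCII control characters
            boolean nonAscii = c > 0x7F;               // outside US-ASCII
            boolean unsafe = control || nonAscii || UNSAFE.indexOf(c) >= 0;
            if (unsafe && found.indexOf(String.valueOf(c)) < 0) {
                found.append(c);
            }
        }
        return found.toString();
    }
}
```

An empty result means the text contains none of these characters; anything else needs to be percent-encoded before it goes into a URL.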

Edit:

Not a duplicate, but includes some thoughts on validation in Java: Validate URL in java
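For the http(s)://ip:port/URLString format in the question, a minimal sketch using only the standard library is to let `java.net.URI` do the parsing, since its constructor throws `URISyntaxException` on illegal characters and malformed % escapes. The class name `HttpUrlCheck` and the method `isValidHttpUrl` are made up for illustration, and requiring an explicit port is an assumption taken from the format in the question. Note that `java.net.URI` is documented to accept some non-ASCII characters unencoded, so it is more lenient than the RFC in that respect.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper (not from any library): validates URLs of the form http(s)://host:port/path. */
final class HttpUrlCheck {
    static boolean isValidHttpUrl(String s) {
        if (s == null) {
            return false;
        }
        try {
            // The constructor rejects illegal characters such as spaces and braces.
            URI uri = new URI(s);
            String scheme = uri.getScheme();
            boolean http = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
            // Assumption: host and explicit port are both required, as in the question's format.
            return http && uri.getHost() != null && uri.getPort() != -1;
        } catch (URISyntaxException e) {
            return false;
        }
    }
}
```

With this sketch, `http://10.0.0.1:8080/a%20b` passes, while `http://10.0.0.1:8080/a b`, `ftp://10.0.0.1:21/x` and `http://10.0.0.1/x` (no port) do not.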

Kyle
0

See pages 2 and 3 of RFC 1738 for details.

phoxis
0

How about using UrlValidator? Its isValidPath method is probably useful. :)

mattn