
"@" is certainly allowed in this case:

However, in these cases:

Are they 'ok' or should the "@" be encoded? Similarly, if they are 'ok', does it make the host portion "bar" and "b.com" respectively?

I took a look at RFC 3986 (http://www.ietf.org/rfc/rfc3986.txt), and page 45 uses this example:

ftp://cnn.example.com&story=breaking_news@10.0.0.1/top_story.htm

to indicate that the "@" means "10.0.0.1" is the host, but I'm not sure, because the query portion doesn't start correctly (there is no "?"). (It then mentions "attacks", which confused me.)

The background: I am trying to determine if Steven Levithan's regex is correct in parsing "http://www.foo.com/@bar" as having a host of "bar": http://stevenlevithan.com/demo/parseuri/js/

Sam Adams
  • i would encode it just to be safe. – Daniel A. White Jun 19 '14 at 11:01
  • possible duplicate of [At (@) symbol inside URLs](http://stackoverflow.com/questions/19509028/at-symbol-inside-urls) – unor Jun 19 '14 at 21:31
  • I think you are right and it is a duplicate. (Such a difficult one to google for!) However, http://stackoverflow.com/questions/19509028/at-symbol-inside-urls has conflicting answers, so maybe this has some use. – Sam Adams Jun 20 '14 at 08:40

1 Answer

The example you are mentioning is used in the RFC to illustrate how a URI like that can be deceiving to humans. In this case cnn.example.com&story=breaking_news would be the user info portion of the URI, in the same way as user:pass is in your first example.
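To illustrate that reading, here is a minimal sketch using Python's standard `urllib.parse.urlsplit`. It only shows how one general-purpose parser splits the RFC's example; it is not a statement about how any particular browser or server behaves.

```python
from urllib.parse import urlsplit

# The deceptive example from the RFC: everything before "@" in the
# authority is user info, not a host name or a query string.
parts = urlsplit("ftp://cnn.example.com&story=breaking_news@10.0.0.1/top_story.htm")

userinfo = parts.username   # text before the "@" (no ":" present, so no password)
host = parts.hostname       # text after the "@"
path = parts.path

print(userinfo)  # cnn.example.com&story=breaking_news
print(host)      # 10.0.0.1
print(path)      # /top_story.htm
```

The "&story=breaking_news" part only looks like a query; without a "?" it is simply more user info.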

As for whether @ is allowed elsewhere in the URI (for example in the path): as far as I can tell, it is.

If you look at pages 48 and 49 you'll find (among other things) the following rules:

URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
hier-part     = "//" authority path-abempty / *snip*
authority     = [ userinfo "@" ] host [ ":" port ]
path-abempty  = *( "/" segment )
segment       = *pchar
pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

Applying this to http://www.foo.com/@bar, we find that scheme is http. authority contains only the mandatory host portion, which is www.foo.com (userinfo and port are both optional). Alongside this authority component, hier-part has a path-abempty component, which consists of a single repetition: /@bar. Its segment consists of 4 repetitions of pchar: @, b, a and r. The "@" therefore belongs to the path, and bar is not the hostname.
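The same split can be checked mechanically. The sketch below uses the generic regular expression given in Appendix B of RFC 3986 to separate scheme, authority and path, and only then looks for "@" inside the authority. The function name `split_uri` and the returned dictionary keys are my own, chosen purely for illustration.

```python
import re

# Regular expression from RFC 3986, Appendix B.
# Group 2 = scheme, group 4 = authority, group 5 = path.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Split a URI into scheme, userinfo, host[:port] and path (illustrative helper)."""
    m = URI_RE.match(uri)  # every group is optional, so this always matches
    scheme, authority, path = m.group(2), m.group(4), m.group(5)

    userinfo, hostport = None, authority
    if authority is not None:
        # Only an "@" inside the authority separates userinfo from host.
        before, sep, after = authority.rpartition("@")
        if sep:
            userinfo, hostport = before, after

    return {"scheme": scheme, "userinfo": userinfo, "hostport": hostport, "path": path}


print(split_uri("http://www.foo.com/@bar"))
# {'scheme': 'http', 'userinfo': None, 'hostport': 'www.foo.com', 'path': '/@bar'}

print(split_uri("http://user:pass@www.foo.com/@bar"))
# {'scheme': 'http', 'userinfo': 'user:pass', 'hostport': 'www.foo.com', 'path': '/@bar'}
```

Because the authority ends at the first "/", "?" or "#", an "@" that appears after that point can never affect the host.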

How well any given browser or web server follows the RFC, on the other hand, is an entirely different question.

Disclaimer: I am no expert, and it's been a while since I looked at ABNF in general.

rvalvik