Can someone explain why I get the following results in C# (.NET Framework 4.8)?

  • Uri.TryCreate("_g._google._com", UriKind.RelativeOrAbsolute, out _) returns true
  • Uri.TryCreate("http://_g._google._com", UriKind.RelativeOrAbsolute, out _) returns false
  • Uri.TryCreate("http://_g._google.com", UriKind.RelativeOrAbsolute, out _) returns true

UPDATE: More cases:

  • Uri.TryCreate("http://_google._com", UriKind.Absolute, out _) returns true
  • Uri.TryCreate("http://_g._google._com", UriKind.Absolute, out _) returns false

2 Answers

A URI provides a simple and extensible means of identifying a resource. It is nothing more than an identifier, so it can contain some characters that URLs do not allow, since a URI can be a name, a location, or both.

URLs are a subset of URIs, restricted in the characters they may contain and in how those characters are organized. For further information, we can refer to RFC 3986:

A URI can be further classified as a locator, a name, or both. The term “Uniform Resource Locator” (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (e.g., its network “location”).

In essence, all URLs are URIs, but not all URIs are URLs. A URL not only tells you what something is but also how to get to it. There is a good article on the difference between URIs and URLs written by Daniel Miessler.

As such, the behavior you are experiencing is accurate: Uri.TryCreate does not know that you are trying to create a legitimate URL, but regardless, you are creating a valid URI.

To check whether a string is a valid HTTP URL, use the method below, taken from this question.

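using System;
using System.Text.RegularExpressions;

// Returns true only when s parses as an absolute http or https URI.
public static bool ValidHttpURL(string s, out Uri resultURI)
{
    // Prepend a default scheme when the input does not already start with http:// or https://.
    if (!Regex.IsMatch(s, @"^https?:\/\/", RegexOptions.IgnoreCase))
        s = "http://" + s;

    // resultURI is null when parsing fails.
    if (Uri.TryCreate(s, UriKind.Absolute, out resultURI))
        return (resultURI.Scheme == Uri.UriSchemeHttp ||
                resultURI.Scheme == Uri.UriSchemeHttps);

    return false;
}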

Usage:

string[] inputs = new[] {
                          "https://www.google.com",
                          "http://www.google.com",
                          "www.google.com",
                          "google.com",
                          "javascript:alert('Hack me!')"
                        };
foreach (string s in inputs)
{
    Uri uriResult;
    bool result = ValidHttpURL(s, out uriResult);
    Console.WriteLine(result + "\t" + uriResult?.AbsoluteUri);
}

Output:

True    https://www.google.com/
True    http://www.google.com/
True    http://www.google.com/
True    http://google.com/
False

Why do URLs with underscores in them return false from Uri.TryCreate?

URLs/URIs whose host contains underscores can return false from Uri.TryCreate, although the cases in the question show that some such hosts are still accepted. This appears to be related to a modification of the standard:

This change required all rule names that formerly included underscore characters to be renamed with a dash instead.

DCCoder
  • So, I focused on the second case, where the URL is "http://_g._google._com". It looks like it is a valid URL but an invalid URI. Is there a way in C# to check for a valid URL, not a URI? – Leonid Idelchik Nov 13 '20 at 16:30
  • So, what about the following cases? Uri.TryCreate("http://_google._com", UriKind.Absolute, out _) returns true, but Uri.TryCreate("http://_g._google._com", UriKind.Absolute, out _) returns false. – Leonid Idelchik Nov 16 '20 at 07:07
  • If you add the scheme to it (e.g. http or https), it should return true for an absolute URL with a subdomain. – DCCoder Nov 16 '20 at 16:23
  • I see the issue now; it looks like the standard was modified. I updated my answer to include an explanation at the bottom. Basically, the fact that _google._com returns true seems to be wrong. It appears that any Uri with an underscore should be deemed invalid, as the underscores should be replaced with dashes. – DCCoder Nov 16 '20 at 20:00

With UriKind.RelativeOrAbsolute, the system tries to detect whether the URI string passed to it is relative or absolute. If the string starts with "http://", the system treats it as an absolute Uri, and "http://_g._google._com" is not a valid absolute Uri, so you get false. Case by case:

First case: Uri.TryCreate("_g._google._com", UriKind.RelativeOrAbsolute, out _). Here the system treats the string as a relative Uri, and it is a valid relative Uri, so you get true.

Second case: Uri.TryCreate("http://_g._google._com", UriKind.RelativeOrAbsolute, out _). Because the string starts with "http://", it is treated as an absolute Uri, but it is not a valid one, so you get false.

Third case: Uri.TryCreate("http://_g._google.com", UriKind.RelativeOrAbsolute, out _). This is the same as the second case, except that there is no underscore before com, so it is a valid Uri and you get true.
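To check the three cases on your own runtime, a small probe like the one below may help. It is only a sketch: the UriProbe class and its Describe method are hypothetical names introduced here, not something from either answer, and no output is listed because the results can differ between .NET versions. It reports whether Uri.TryCreate accepted the input and, for absolute URIs, how the host was classified.

```csharp
using System;

// Hypothetical helper for illustration; not part of either answer.
public static class UriProbe
{
    // Describes what Uri.TryCreate does with a single input string.
    public static string Describe(string input, UriKind kind)
    {
        Uri parsed;
        if (!Uri.TryCreate(input, kind, out parsed))
            return input + " -> rejected";

        // Host and HostNameType throw InvalidOperationException on
        // relative URIs, so only read them when the URI is absolute.
        if (!parsed.IsAbsoluteUri)
            return input + " -> accepted as relative";

        return input + " -> accepted as absolute, host=" + parsed.Host
             + ", hostType=" + parsed.HostNameType;
    }

    public static void Main()
    {
        // The inputs from the question.
        string[] inputs =
        {
            "_g._google._com",
            "http://_g._google._com",
            "http://_g._google.com",
            "http://_google._com"
        };

        foreach (string s in inputs)
            Console.WriteLine(Describe(s, UriKind.RelativeOrAbsolute));
    }
}
```

Passing UriKind.Absolute instead of UriKind.RelativeOrAbsolute to Describe shows which inputs are only accepted because they fall back to being relative.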

Rites