The best way to tell whether a given string represents a valid URL without actually testing it, bearing in mind the comments above (a string might fit the given schema and still not be what you consider right), is to perform a custom analysis. You should also replace your bool function with one returning a string (or a Uri) that can correct certain situations (like the example you propose). Sample code:
private void Form1_Load(object sender, EventArgs e)
{
    string rightUrl = returnValidUrl("http://http://www.Google.com");
    if (rightUrl != "")
    {
        //It is OK
    }
}
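static string returnValidUrl(string urlString)
{
    string outUrl = "";
    Uri curUri = IsValidUrl(urlString);
    if (curUri != null)
    {
        string headingBit = "http://";
        if (curUri.Scheme == Uri.UriSchemeHttps) headingBit = "https://";
        if (curUri.Scheme == Uri.UriSchemeFtp) headingBit = "ftp://";
        if (curUri.Scheme == Uri.UriSchemeMailto) headingBit = "mailto:";
        // Keep only what follows the last occurrence of the heading part.
        // The search ignores case, so the rest of the URL keeps its original
        // casing (paths and query strings can be case-sensitive).
        int lastHeading = urlString.LastIndexOf(headingBit, StringComparison.OrdinalIgnoreCase);
        if (lastHeading >= 0) outUrl = headingBit + urlString.Substring(lastHeading + headingBit.Length);
    }
    return outUrl;
}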
static Uri IsValidUrl(string urlString)
{
    // Returns the parsed Uri for an accepted scheme, or null otherwise.
    Uri uri = null;
    bool isValid = Uri.TryCreate(urlString, UriKind.Absolute, out uri)
        && (uri.Scheme == Uri.UriSchemeHttp
            || uri.Scheme == Uri.UriSchemeHttps
            || uri.Scheme == Uri.UriSchemeFtp
            || uri.Scheme == Uri.UriSchemeMailto
        );
    if (!isValid) uri = null;
    return uri;
}
Which can be called with:
string rightUrl = returnValidUrl("http://http://www.Google.com");
if (rightUrl != "")
{
    //It is OK
}
You would have to extend this method to recognise as valid/correct all the situations you need.
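One possible way of extending it, as a minimal sketch only: keep the accepted schemes in a table and strip every repeated leading copy of the heading part. The names UrlFixer, Prefixes and CollapseRepeatedPrefix are hypothetical (they are not part of the code above), and the sketch only covers the "duplicated heading" case.

```csharp
using System;
using System.Collections.Generic;

static class UrlFixer
{
    // Hypothetical table: accepted scheme -> heading part written in front of it.
    static readonly Dictionary<string, string> Prefixes = new Dictionary<string, string>
    {
        { Uri.UriSchemeHttp, "http://" },
        { Uri.UriSchemeHttps, "https://" },
        { Uri.UriSchemeFtp, "ftp://" },
        { Uri.UriSchemeMailto, "mailto:" },
    };

    // Returns the input with repeated leading copies of its heading part
    // collapsed into one, or "" when the input is not accepted.
    public static string CollapseRepeatedPrefix(string urlString)
    {
        Uri uri;
        if (!Uri.TryCreate(urlString, UriKind.Absolute, out uri)) return "";

        string prefix;
        if (!Prefixes.TryGetValue(uri.Scheme, out prefix)) return "";

        // Strip every leading copy of the heading part, ignoring case.
        string rest = urlString;
        while (rest.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
        {
            rest = rest.Substring(prefix.Length);
        }

        // Nothing left after the heading part means there is no usable URL.
        return rest.Length == 0 ? "" : prefix + rest;
    }
}
```

Supporting a new scheme then only means adding one entry to the table; any other correction (typos, missing slashes and so on) would still need its own rule.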
UPDATE
As suggested in the comments, and in order to deliver the exact functionality the OP is looking for (or rather a sample of it, since the proposed solution is just an example of the case-by-case approach this problem requires), here is a corrected bool function that treats the posted example as wrong:
static bool IsValidUrl2(string urlString)
{
    Uri uri;
    // Valid only if the scheme is accepted and its heading part appears exactly once.
    return Uri.TryCreate(urlString, UriKind.Absolute, out uri)
        && ((uri.Scheme == Uri.UriSchemeHttp && numberOfBits(urlString, "http://") == 1)
            || (uri.Scheme == Uri.UriSchemeHttps && numberOfBits(urlString, "https://") == 1)
            || (uri.Scheme == Uri.UriSchemeFtp && numberOfBits(urlString, "ftp://") == 1)
            || (uri.Scheme == Uri.UriSchemeMailto && numberOfBits(urlString, "mailto:") == 1)
        );
}
static int numberOfBits(string inputString, string bitToCheck)
{
    // Case-insensitive count of how many times bitToCheck occurs in inputString.
    return inputString.ToLowerInvariant().Split(new string[] { bitToCheck.ToLowerInvariant() }, StringSplitOptions.None).Length - 1;
}
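To see what this kind of occurrence counting accepts and rejects, here is a small self-contained sketch. PrefixCounter and CountOccurrences are hypothetical names, and the loop uses IndexOf instead of Split purely as an illustration of the same idea.

```csharp
using System;

static class PrefixCounter
{
    // Counts non-overlapping, case-insensitive occurrences of token in text.
    public static int CountOccurrences(string text, string token)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(token)) return 0;

        int count = 0;
        int index = 0;
        while ((index = text.IndexOf(token, index, StringComparison.OrdinalIgnoreCase)) >= 0)
        {
            count++;
            index += token.Length; // continue searching after this match
        }
        return count;
    }
}
```

With "http://" as the token, "http://www.Google.com" counts 1 and "http://http://www.Google.com" counts 2, which is the case the OP wants rejected. A string such as "http://example.com/?next=http://other.com" also counts 2, so an "exactly once" rule would reject URLs that legitimately carry another URL in their query string.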
CLARIFICATION
The only way to be completely sure whether a given URL is valid is to actually test it; but the OP said no connections, which I understood as pure string analysis: exactly what this answer is about. In any case, as explained in the comments, the intention of this post is just to show the way: .NET + a custom algorithm (bearing in mind that achieving general applicability through string analysis alone is pretty difficult). My proposal accounts for the specific problem explained by the OP (duplicated "heading parts") under his conditions. It should not be understood as a generally applicable, blindly usable approach, but as a general framework with sample functionality (a mere proof of concept).
CLARIFICATION 2
As shown in the conversation with Jon Hanna in the comments below, there is a third alternative I wasn't aware of: analysing the to-be IP address (i.e., the numbers already put together, but with the IP address availability not yet checked and thus the definitive IP address generation not yet started). By looking at it, it would also be possible to determine how likely a given string is to be a valid URL (under the expected conditions). This cannot be considered a 100% reliable process either, since the IP address being analysed is not the definitive one. In any case, Jon Hanna is in a much better position than I am to talk about the limitations of this alternative.