In my C# code-behind I have the following method. How do I change the Replace call so that only the first occurrence of www is replaced? For example, if the user enters www.testwww.com, I should save it as testwww.com. Currently the code below saves it as www.com (I guess because of the Substring code). Please help. Thanks in advance.

private string FilterUrl(string url)
{
    string lowerCaseUrl = url.ToLower();
    lowerCaseUrl = lowerCaseUrl.Replace("http://", string.Empty).Replace("https://", string.Empty).Replace("ftp://", string.Empty);
    lowerCaseUrl = lowerCaseUrl.Replace("www.", string.Empty);

    string lCaseUrl = url.Substring(url.Length - lowerCaseUrl.Length, lowerCaseUrl.Length);
    return lCaseUrl;
}
Ditty
  • What if the URL passed in is 'testwww.com'? Do you still want to remove the 'first' www? – E.J. Brennan Dec 07 '12 at 19:33
  • Using the built-in System.Uri class is going to solve a lot of your problems, I think. Don't try to rebuild the machine. – Ally Dec 07 '12 at 19:36

5 Answers

As Ally suggested, you are much better off using System.Uri. The method below also removes only the leading www., as you wanted.

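// requires: using System; using System.Text.RegularExpressions;
private string FilterUrl(string url)
{
    Uri uri = new UriBuilder(url).Uri; // defaults to http:// if the scheme is missing
    // Escape the dot so the pattern matches a literal "www." at the start of the host only.
    return Regex.Replace(uri.Host, @"^www\.", "") + uri.PathAndQuery;
}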

Edit: The trailing slash comes from the PathAndQuery property. If the URL has no path, PathAndQuery is just the slash. Add another regex replace or string replace to remove it. Here's the regex way.

return Regex.Replace(uri.Host, @"^www\.", "") + Regex.Replace(uri.PathAndQuery, "/$", "");
ryan
  • And if they want to return rest of the Uri it should probably be `return Regex.Replace(uri.Host, "^www.", "") + uri.PathAndQuery;` – Ally Dec 07 '12 at 19:53
  • @ryan -- It is adding a / to my URL. Is that expected? How do I remove it? I don't want a / at the end of my URL. For example, www.testwww.com saves as testwww.com/ – Ditty Dec 07 '12 at 22:19
  • @ryan - How do I make it case sensitive? – Ditty Dec 07 '12 at 23:56
  • What do you want to be case sensitive? – ryan Dec 08 '12 at 02:21
  • @ryan - When I try to give my host URL it gives me security warning. How to get past that in the URI? – Ditty Dec 10 '12 at 18:25
  • If you are trying to change `Host` as part of the object you must do it with `UriBuilder` before you start working with the constructed `Uri` as `Host` is get-only. See [this post](http://stackoverflow.com/questions/479799/c-net-replace-host-in-uri) or [this site](http://www.dotnetperls.com/uribuilder) for more info – ryan Dec 11 '12 at 12:39

The Replace method replaces every occurrence of the search string. To remove only one occurrence, locate it with the IndexOf method and remove it with the Remove method of string. Try something like this:

// include the namespace
using System.Globalization;


private string FilterUrl(string url)
{
    const string prefix = "www.";

    // Create a CompareInfo object for culture-invariant comparison.
    CompareInfo myCompare = CultureInfo.InvariantCulture.CompareInfo;

    // Find the first 'www.' in the url parameter, ignoring case.
    int position = myCompare.IndexOf(url, prefix, CompareOptions.IgnoreCase);

    // If 'www.' exists in the string, remove exactly that occurrence (4 characters).
    if (position > -1)
    {
        url = url.Remove(position, prefix.Length);
    }

    // If you want to remove http://, https://, ftp://, keep this line.
    url = url.Replace("http://", string.Empty).Replace("https://", string.Empty).Replace("ftp://", string.Empty);

    return url;
}

Edits

Part of your original code was cutting off a piece of the string. If you just want to remove the 'www.' and 'http://', 'https://', 'ftp://', take a look at this code.

This code also ignores case when it searches the url parameter for 'www.'.
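
If the 'www.' should be removed only when it is at the very start (so that testwww.com stays untouched) and the user's original casing must be kept, a prefix check may be a better fit than IndexOf. The following is only a minimal sketch under that assumption; the class and method names (UrlFilter, StripSchemeAndLeadingWww) are hypothetical and not from the question.

```csharp
using System;

public static class UrlFilter
{
    // Hypothetical helper: strips one leading scheme and one leading "www."
    // and leaves the rest of the input, including its casing, untouched.
    public static string StripSchemeAndLeadingWww(string url)
    {
        string[] schemes = { "http://", "https://", "ftp://" };
        foreach (string scheme in schemes)
        {
            // Case-insensitive prefix check, so "HTTP://" is handled too.
            if (url.StartsWith(scheme, StringComparison.OrdinalIgnoreCase))
            {
                url = url.Substring(scheme.Length);
                break; // at most one scheme can be at the start
            }
        }

        const string prefix = "www.";
        if (url.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
        {
            // Remove only the leading "www."; later occurrences are kept.
            url = url.Substring(prefix.Length);
        }

        return url;
    }
}
```

With this sketch, www.testwww.com becomes testwww.com, and testwww.com is returned unchanged because it does not start with 'www.'.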

Felipe Oriani
  • I am trying to test your code out. But what happens when the user enters only testwww.com? Would it remove the www and save it as test.com? – Ditty Dec 07 '12 at 19:47
  • Yes. It will remove, because it does not consider a valid url, just a string, as you pass to your method. – Felipe Oriani Dec 07 '12 at 19:49
  • If you want to ensure that is a valid url, try out the @ryan method. – Felipe Oriani Dec 07 '12 at 19:50
  • Orinani -- not sure what I am doing wrong. But looks like when I give testwww.com the substring part of the code is stripping something and it is saving only as www.com. Can you help me out please? I need the substring part to maintain the user input case, we need to use the Original URL. – Ditty Dec 07 '12 at 22:10
  • Hi Ditty, I did some changes in your original code, and I remove the final part of the method. Look my edits. – Felipe Oriani Dec 07 '12 at 22:18
  • The last part in my code was actually keeping the cases as is i.e. if uppercase then in uppercase and lowercase in lowercase. For example if tEstWWW.com was written the ToLower will change it to testwww.com, if I am not mistaken. So I need to keep the cases intact. With this it does not. Also I am getting some syntax error with the var :-( – Ditty Dec 07 '12 at 22:37
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/20777/discussion-between-ditty-and-felipe-oriani) – Ditty Dec 07 '12 at 22:39
  • I get error when I try to enter www.Testwww.com. As the position becomes 0 and so fails in url = url.Remove(position - 1, 5); – Ditty Dec 08 '12 at 01:13
  • Help me please.. I am not able to understand how to fix the position part. Thanks – Ditty Dec 10 '12 at 18:16

I would suggest using IndexOf(string) to find the first occurrence.

Edit: okay someone beat me to it ;)

Peter Rasmussen

You could use IndexOf like Felipe suggested, or do it the low-tech way. Replace the longer prefixes first; otherwise "http://" is already gone before "http://www." can match.

lowerCaseUrl = lowerCaseUrl.Replace("http://www.", string.Empty).Replace("https://www.", string.Empty).Replace("http://", string.Empty).Replace("https://", string.Empty).Replace("ftp://", string.Empty);

Would be interested to know what you're trying to achieve.

Bill Martin

Came up with a cool extension method (it has to live in a static class) that also works for replacing the first x occurrences:

public static string ReplaceOnce(this string s, string replace, string with)
{
    return s.ReplaceCount(replace, with);
}

public static string ReplaceCount(this string s, string replace, string with, int howManytimes = 1)
{
    if (howManytimes < 0) throw new InvalidOperationException("cannot replace a string less than zero times");
    int count = 0;
    // Ordinal comparison keeps the search case-sensitive and culture-independent.
    int position = s.IndexOf(replace, StringComparison.Ordinal);
    while (position >= 0 && count < howManytimes)
    {
        s = s.Remove(position, replace.Length).Insert(position, with);
        count++;
        // Continue after the inserted text so it is never matched again.
        position = s.IndexOf(replace, position + with.Length, StringComparison.Ordinal);
    }
    return s;
}

ReplaceOnce isn't necessary, it is just a shortcut. Call it like this:

string url = "http://www.stackoverflow.com/questions/www/www";

var urlR1 = url.ReplaceOnce("www", "xxx");
// urlR1 = "http://xxx.stackoverflow.com/questions/www/www";


var urlR2 = url.ReplaceCount("www", "xxx", 2);
// urlR2 = "http://xxx.stackoverflow.com/questions/xxx/www";

NOTE: this is case-sensitive as it is written
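
If a case-insensitive match is needed (for example, user input such as WWW.testwww.com), one option is to pass a StringComparison to IndexOf. This is only a rough sketch of that idea; the names StringReplaceFirst and ReplaceFirstIgnoreCase are made up for illustration and are not part of the method above.

```csharp
using System;

public static class StringReplaceFirst
{
    // Hypothetical helper: replaces only the first match of 'search',
    // ignoring case, and leaves the rest of the string as typed.
    public static string ReplaceFirstIgnoreCase(string s, string search, string replacement)
    {
        int position = s.IndexOf(search, StringComparison.OrdinalIgnoreCase);
        if (position < 0)
        {
            return s; // nothing to replace
        }

        // Rebuild the string around the single matched span.
        return s.Substring(0, position) + replacement + s.Substring(position + search.Length);
    }
}
```

Because only the matched span is swapped out, the casing of everything else in the input is preserved.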

naspinski