
So, my question is fairly simple. I have a string and I want to be able to use it in URLs. Simple, right? The tricky part is that I want a custom way of encoding it. You see, my language is full of é, í, ô, ä, ľ, š, č, ť... you get the idea.

So, let's say I have a string like this:

Čečenský bojovník sa pobil v košickej väzbe

If I use HttpUtility.UrlEncode, I get this string:

%c4%8ce%c4%8densk%c3%bd+bojovn%c3%adk+sa+pobil+v+ko%c5%a1ickej+v%c3%a4zbe

However, my desired string would look like this (I want the URLs to be as user-friendly as possible):

cecensky-bojovnik-sa-pobil-v-kosickej-vazbe

So using UrlEncode isn't an option. Instead, I wrote myself a function that applies several manipulations to the string and does exactly what I need.

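public static string EncodeForUrl(this string s)
{
    // StripDiacritics is my own extension method (not shown here).
    string temp = s.StripDiacritics();
    temp = temp.ToLower();            // lowercase the whole string
    temp = temp.Trim();               // drop leading/trailing whitespace
    temp = temp.Replace(" ", "-");    // spaces become dashes
    return temp;
}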

I think it's obvious what's going on, and it works perfectly fine. The one problem is that strings are immutable, so each step (StripDiacritics, ToLower, Trim, Replace) can allocate a new intermediate string, which means quite a lot of unnecessary memory allocation.

So, finally, my question: is there a recommended, more efficient way of doing this?

walther
  • have you tried this http://stackoverflow.com/questions/3769457/how-can-i-remove-accents-on-a-string – Ray Cheng May 26 '12 at 07:11
  • Take a look here: http://stackoverflow.com/questions/2920744/url-slugify-alrogithm-in-c –  May 26 '12 at 07:15
  • @RayCheng, that is about removing diacritics, which I have already implemented a function for. I don't really need more functions to do that; mine is working fine. My question is about something else: what is the best practice for getting a string into the encoding format I need? Thanks for the comment, though... – walther May 26 '12 at 07:20
  • @renamr, that seems to do it exactly the way I'm doing it at the moment, just introducing regex into the game. It has even more memory allocations going on. Would regex be much faster than the string manipulations I'm already using in my code? – walther May 26 '12 at 07:24

2 Answers


You can skip ToLower() and instead of using Replace() you can do something similar to this: https://stackoverflow.com/a/5203674/730701

Adam

After some googling, I finally found an answer that satisfies my needs. The way Stack Overflow handles the situation is probably the best.

How does Stack Overflow generate its SEO-friendly URLs?

And this one for stripping diacritics, which is even better than my current version:

https://meta.stackexchange.com/questions/7435/non-us-ascii-characters-dropped-from-full-profile-url/7696#7696
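For readers who want to see the general idea without following the links, here is a minimal sketch of a single-pass approach. It is not the code from either linked post. The name ToSlug is hypothetical, and the sketch assumes that Unicode decomposition (FormD) is good enough to strip the diacritics; letters that do not decompose, such as ł or đ, are simply treated as separators here. Unlike the original EncodeForUrl, it also drops punctuation and collapses runs of separators into a single dash.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class SlugExtensions
{
    // Hypothetical helper: builds the slug in one pass over the
    // decomposed string, using a single StringBuilder.
    public static string ToSlug(this string s)
    {
        if (string.IsNullOrEmpty(s)) return string.Empty;

        // FormD splits e.g. "č" into "c" + a combining caron,
        // so the mark can be skipped in the loop below.
        string decomposed = s.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);
        bool pendingDash = false;

        foreach (char c in decomposed)
        {
            // Drop combining diacritical marks.
            if (CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.NonSpacingMark)
                continue;

            char lower = char.ToLowerInvariant(c);
            if ((lower >= 'a' && lower <= 'z') || (lower >= '0' && lower <= '9'))
            {
                // Emit at most one dash between words, never a leading one.
                if (pendingDash && sb.Length > 0) sb.Append('-');
                pendingDash = false;
                sb.Append(lower);
            }
            else
            {
                // Whitespace, punctuation and anything else acts as a separator.
                pendingDash = true;
            }
        }

        return sb.ToString();
    }
}
```

With the example string from the question, "Čečenský bojovník sa pobil v košickej väzbe".ToSlug() returns cecensky-bojovnik-sa-pobil-v-kosickej-vazbe. The allocations are reduced to the normalized copy, the StringBuilder buffer and the final string; whether that matters in practice is worth measuring before replacing code that already works.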

walther