4

We are making our site SEO-friendly by using the URL pattern below:

http://OurWebsite.com/MyArticle/Math/Spain/Glaño

As you can see, Glaño contains an accented character (ñ) that search engines may not like. On the other hand, we cannot build the last part of the URL without it!

Any suggestions on how to keep our current URL generation code and still handle Spanish or French entries, or do we need to change our approach?

tshepang
Arash
  • Here's a related post: http://stackoverflow.com/questions/1858426/aao-what-is-considered-more-seo-friendly-url – o.k.w Jun 22 '10 at 02:26
  • See also this question: http://stackoverflow.com/questions/140549/what-character-set-should-i-assume-the-encoded-characters-in-a-url-to-be-in – Nick Johnson Jun 22 '10 at 09:46

3 Answers

6

Try these functions:

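// Builds a URL slug: strips accents, replaces every run of characters other than
// letters, digits and those listed in $extra with $slug, then trims and lowercases.
function Slug($string, $slug = '-', $extra = '')
{
    return strtolower(trim(preg_replace('~[^0-9a-z' . preg_quote($extra, '~') . ']+~i', $slug, Unaccent($string)), $slug));
}

// Strips accents by round-tripping through HTML entities: "ñ" becomes "&ntilde;",
// the regex keeps only the base letter(s), and any remaining entities are decoded back.
function Unaccent($string)
{
    return html_entity_decode(preg_replace('~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i', '$1', htmlentities($string, ENT_QUOTES, 'UTF-8')), ENT_QUOTES, 'UTF-8');
}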

And use it like this:

echo Slug('Iñtërnâtiônàlizætiøn of Glaño'); // internationalizaetion-of-glano

You can embed the Unaccent() code into the Slug() function if you wish to have only one function.
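
A minimal sketch of what that single-function version might look like. The name SlugInline and the intermediate variable names are placeholders of mine, not part of the original functions, and the steps are only split out to make the entity round-trip easier to follow:

```php
<?php
// Hypothetical single-function variant: the Unaccent() logic is inlined into the slug builder.
function SlugInline(string $string, string $slug = '-', string $extra = ''): string
{
    // 1. Encode as HTML entities, so "ñ" becomes "&ntilde;" and "æ" becomes "&aelig;".
    $entities = htmlentities($string, ENT_QUOTES, 'UTF-8');

    // 2. Keep only the base letter(s) of accent-style entities ("&ntilde;" -> "n").
    $unaccented = preg_replace(
        '~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i',
        '$1',
        $entities
    );

    // 3. Decode whatever entities are left (for example "&amp;") back to characters.
    $plain = html_entity_decode($unaccented, ENT_QUOTES, 'UTF-8');

    // 4. Replace each run of disallowed characters with the separator.
    $collapsed = preg_replace('~[^0-9a-z' . preg_quote($extra, '~') . ']+~i', $slug, $plain);

    // 5. Trim stray separators from both ends and lowercase the result.
    return strtolower(trim($collapsed, $slug));
}

echo SlugInline('Iñtërnâtiônàlizætiøn of Glaño'), PHP_EOL; // internationalizaetion-of-glano
echo SlugInline('Glaño v1.2', '-', '.'), PHP_EOL;          // glano-v1.2
```

The third argument lists extra characters to keep, so passing '.' lets the dot in "v1.2" survive.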

Alix Axel
  • This is a very interesting function that works without hardcoding accents and their replacements... Is there any documentation for it? – dynamic Mar 06 '15 at 10:44
4

Perhaps replace accented characters with the closest matching non-accented Latin character.

Unless "Glano" means something very rude, this is probably your best bet.

If you search Google for "Glaño", it returns pages containing "Glano" anyway, so the SEO shouldn't be harmed.

To translate the characters from accented to unaccented, you could use this function (this is in PHP, but hopefully you'd be able to use it as a starting point for other languages):

function normalize ($string) {
    $table = array(
        'Š'=>'S', 'š'=>'s', 'Đ'=>'Dj', 'đ'=>'dj', 'Ž'=>'Z', 'ž'=>'z', 'Č'=>'C', 'č'=>'c', 'Ć'=>'C', 'ć'=>'c',
        'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A', 'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
        'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I', 'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O',
        'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U', 'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss',
        'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a', 'å'=>'a', 'æ'=>'a', 'ç'=>'c', 'è'=>'e', 'é'=>'e',
        'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i', 'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o',
        'ô'=>'o', 'õ'=>'o', 'ö'=>'o', 'ø'=>'o', 'ù'=>'u', 'ú'=>'u', 'û'=>'u', 'ü'=>'u', 'ý'=>'y', 'þ'=>'b',
        'ÿ'=>'y', 'Ŕ'=>'R', 'ŕ'=>'r',
    );

    return strtr($string, $table);
}

(Author credit goes to allixsenos at gmail http://php.net/manual/en/function.strtr.php)
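
As a rough sketch of how a table like this could plug into the URL pattern from the question: the helper name buildArticleUrl, its parameters, and the shortened table are assumptions made for illustration, not your existing URL generation code. Anything the table does not cover is percent-encoded, so the URL stays valid either way.

```php
<?php
// Hypothetical helper: builds an article URL from raw path segments.
function buildArticleUrl(string $base, array $segments): string
{
    // Shortened table for illustration only; a full accent table would go here.
    $table = array('ñ' => 'n', 'Ñ' => 'N', 'é' => 'e', 'è' => 'e', 'ç' => 'c', 'ü' => 'u');

    $clean = array();
    foreach ($segments as $segment) {
        // Transliterate first, then percent-encode whatever is still unsafe in a URL path.
        $clean[] = rawurlencode(strtr($segment, $table));
    }

    // Join the cleaned segments onto the base, avoiding a doubled slash.
    return rtrim($base, '/') . '/' . implode('/', $clean);
}

echo buildArticleUrl('http://OurWebsite.com', array('MyArticle', 'Math', 'Spain', 'Glaño')), PHP_EOL;
// http://OurWebsite.com/MyArticle/Math/Spain/Glano
```

The source file has to be saved as UTF-8 for the table keys to match UTF-8 input.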

nickf
0

I agree that unless "Glano" means something very rude, this is probably your best bet. I would add that if you care about SEO, you should consider not having too many folders in the URL. Your example has one root, three sub-folders, and then the file, and that depth may hurt more than the special character.