
I have a multi-language website and I'm trying to create friendly URLs. My database has a slug field. When the article's title is in English, the slug appears in the URL and the redirection works fine. But when the title is in Arabic, the slug still appears in the URL, yet the request ends on an "Object not found" page.

What seems to be the problem? Please help, I'm stuck.

  • URLs cannot contain arbitrary characters. There are rules about how a URL is structured, and some characters hold special meaning inside a URL, so you have to take care to create _valid_ URLs, not just "some" URLs. Make sure you url-encode all tokens you put into your URLs. Apart from that we cannot say much unless you provide more information, such as examples of the URLs and the code that processes requests to them. – arkascha Apr 19 '15 at 14:03
  • But I've seen some websites, even Wikipedia, using Arabic slugs in their URLs. Also, I use a function that removes all special chars and symbols, and my charset is utf-8 :( I only have the problem with the Arabic URL – Catalunya's Son Apr 19 '15 at 14:11
  • I did not say you cannot use arabic characters. I only said you have to escape, which apparently (according to your own words) you don't. Without escaping it is a simple question of luck if the URL works or not. Escaping is not rocket science: there is a function for that in php: `urlencode()`. Use it instead of trying to create some function yourself that tries to remove "special characters", whatever those are in your eyes. – arkascha Apr 19 '15 at 14:21
  • I got your function bro, but in my case, the arabic URL shows in my URL like this: `localhost/website/article/27/عنوان-الموضوع` but the page doesn't show any data but "Object not found" – Catalunya's Son Apr 19 '15 at 14:37
  • Sure, that URL matches your description above. I do not understand the "but" here. You have to pass "عنوان-الموضوع" through the urlencode() function before concatenating it to the base of the URL. Did you try that? And after that: you _still_ did not post your code. How do you expect us to help with that? What are your rewriting rules? How are the requests processed? – arkascha Apr 19 '15 at 18:01
  • I don't know if you get a notification or not, so the code is below – Catalunya's Son Apr 20 '15 at 00:16
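
To make the escaping suggested in the comments concrete, here is a minimal sketch of building the article URL with `urlencode()`. The helper name `article_url` and the `http://` scheme are assumptions for illustration; only the base path, the id and the slug come from the example URL quoted above.

```php
<?php
// Hypothetical helper (not from the original post): joins base, id and slug,
// percent-encoding the slug so the resulting URL contains only ASCII.
function article_url(string $base, int $id, string $slug): string
{
    return rtrim($base, '/') . '/' . $id . '/' . urlencode($slug);
}

$slug = 'عنوان-الموضوع';
$url  = article_url('http://localhost/website/article', 27, $slug);

// Browsers usually display the decoded form, but this is what goes over the wire.
echo $url, PHP_EOL;
```

Encoding alone does not make the request resolve: the server decodes the path again before the rewrite rule is matched, so the rule still has to accept non-ASCII slugs.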

2 Answers


Most likely the issue is your rewriting rule. It is explicitly crafted so that it only applies to requests whose slug part consists of ASCII letters, digits, underscores or hyphens. That obviously won't match Arabic characters in the URL. So you have to change your rule to accept more or less anything except a few very special characters (here: everything but a slash):

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-l
RewriteRule ^([0-9]+)/([^/]+)/?$ article.php?id_art=$1 [NC,L]
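
The difference between the two patterns can be checked outside Apache. The following is a rough sketch using PHP's `preg_match()` as a stand-in for mod_rewrite's matching; it assumes the `.htaccess` file lives in the `article` directory, so the rule sees `27/عنوان-الموضوع`, and that the path has already been percent-decoded by the server. The variable names are made up for the example.

```php
<?php
// Path as the rule would see it under the assumptions stated above.
$path = '27/عنوان-الموضوع';

// Original rule: the slug may only contain ASCII letters, digits, "_" and "-".
$asciiOnly  = '#^([0-9]+)/([a-zA-Z0-9_-]+)/?$#i';
// Rule from this answer: the slug may contain anything except a slash.
$anySegment = '#^([0-9]+)/([^/]+)/?$#i';

$oldMatches = preg_match($asciiOnly, $path);       // 0: rule is skipped
$newMatches = preg_match($anySegment, $path, $m);  // 1: rule applies

var_dump($oldMatches, $newMatches, $m[1]);         // $m[1] is the article id
```

When the old rule is skipped, no rewrite happens and the server looks for a real file at that path, which matches the "Object not found" page described in the question.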
arkascha
  • Thank you brother, successfully done :) I doubt it two days later but my mistake :/ I did not try to remove the Latin letters from the rule. And by the way, I made the arrays $a and $b to replace the accented letters because I wanted the URL to be perfect, with no accented chars :) – Catalunya's Son Apr 20 '15 at 13:22

Sorry, I forgot; here is my slug function:

$a = array('À', 'Á', 'Â', 'Ã', 'Ä', 'Å', 'Æ', 'Ç', 'È', 'É', 'Ê', 'Ë', 'Ì', 'Í', 'Î', 'Ï', 'Ð',
            'Ñ', 'Ò', 'Ó', 'Ô', 'Õ', 'Ö', 'Ø', 'Ù', 'Ú', 'Û', 'Ü', 'Ý', 'ß', 'à', 'á', 'â', 'ã',
            'ä', 'å', 'æ', 'ç', 'è', 'é', 'ê', 'ë', 'ì', 'í', 'î', 'ï', 'ñ', 'ò', 'ó', 'ô', 'õ',
            'ö', 'ø', 'ù', 'ú', 'û', 'ü', 'ý', 'ÿ', 'Ā', 'ā', 'Ă', 'ă', 'Ą', 'ą', 'Ć', 'ć', 'Ĉ',
            'ĉ', 'Ċ', 'ċ', 'Č', 'č', 'Ď', 'ď', 'Đ', 'đ', 'Ē', 'ē', 'Ĕ', 'ĕ', 'Ė', 'ė', 'Ę', 'ę',
            'Ě', 'ě', 'Ĝ', 'ĝ', 'Ğ', 'ğ', 'Ġ', 'ġ', 'Ģ', 'ģ', 'Ĥ', 'ĥ', 'Ħ', 'ħ', 'Ĩ', 'ĩ', 'Ī', 'ī',
            'Ĭ', 'ĭ', 'Į', 'į', 'İ', 'ı', 'Ĳ', 'ĳ', 'Ĵ', 'ĵ', 'Ķ', 'ķ', 'Ĺ', 'ĺ', 'Ļ', 'ļ', 'Ľ', 'ľ',
            'Ŀ', 'ŀ', 'Ł', 'ł', 'Ń', 'ń', 'Ņ', 'ņ', 'Ň', 'ň', 'ŉ', 'Ō', 'ō', 'Ŏ', 'ŏ', 'Ő', 'ő', 'Œ',
            'œ', 'Ŕ', 'ŕ', 'Ŗ', 'ŗ', 'Ř', 'ř', 'Ś', 'ś', 'Ŝ', 'ŝ', 'Ş', 'ş', 'Š', 'š', 'Ţ', 'ţ', 'Ť', 
            'ť', 'Ŧ', 'ŧ', 'Ũ', 'ũ', 'Ū', 'ū', 'Ŭ', 'ŭ', 'Ů', 'ů', 'Ű', 'ű', 'Ų', 'ų', 'Ŵ', 'ŵ', 'Ŷ', 
            'ŷ', 'Ÿ', 'Ź', 'ź', 'Ż', 'ż', 'Ž', 'ž', 'ſ', 'ƒ', 'Ơ', 'ơ', 'Ư', 'ư', 'Ǎ', 'ǎ', 'Ǐ', 'ǐ',
            'Ǒ', 'ǒ', 'Ǔ', 'ǔ', 'Ǖ', 'ǖ', 'Ǘ', 'ǘ', 'Ǚ', 'ǚ', 'Ǜ', 'ǜ', 'Ǻ', 'ǻ', 'Ǽ', 'ǽ', 'Ǿ', 'ǿ');

    $b = array('A', 'A', 'A', 'A', 'A', 'A', 'AE', 'C', 'E', 'E', 'E', 'E', 'I', 'I', 'I', 'I', 'D', 'N', 'O',
            'O', 'O', 'O', 'O', 'O', 'U', 'U', 'U', 'U', 'Y', 's', 'a', 'a', 'a', 'a', 'a', 'a', 'ae', 'c',
            'e', 'e', 'e', 'e', 'i', 'i', 'i', 'i', 'n', 'o', 'o', 'o', 'o', 'o', 'o', 'u', 'u', 'u', 'u',
            'y', 'y', 'A', 'a', 'A', 'a', 'A', 'a', 'C', 'c', 'C', 'c', 'C', 'c', 'C', 'c', 'D', 'd', 'D',
            'd', 'E', 'e', 'E', 'e', 'E', 'e', 'E', 'e', 'E', 'e', 'G', 'g', 'G', 'g', 'G', 'g', 'G', 'g',
            'H', 'h', 'H', 'h', 'I', 'i', 'I', 'i', 'I', 'i', 'I', 'i', 'I', 'i', 'IJ', 'ij', 'J', 'j', 'K',
            'k', 'L', 'l', 'L', 'l', 'L', 'l', 'L', 'l', 'L', 'l', 'N', 'n', 'N', 'n', 'N', 'n', 'n', 'O', 'o',
            'O', 'o', 'O', 'o', 'OE', 'oe', 'R', 'r', 'R', 'r', 'R', 'r', 'S', 's', 'S', 's', 'S', 's', 'S',
            's', 'T', 't', 'T', 't', 'T', 't', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'W',
            'w', 'Y', 'y', 'Y', 'Z', 'z', 'Z', 'z', 'Z', 'z', 's', 'f', 'O', 'o', 'U', 'u', 'A', 'a', 'I', 'i',
            'O', 'o', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'A', 'a', 'AE', 'ae', 'O', 'o');


    $lettersNumbersSpacesHyphens = '/[^\-\s\pN\pL]+/u';
    $spacesDuplicateHyphens      = '/[\-\s]+/';
    // remove anything that isn't a letter, space, number or hyphen
    // (\pL and \pN are Unicode-aware, so Arabic letters are kept)
    $slug = preg_replace($lettersNumbersSpacesHyphens, '', mb_strtolower($slug, 'utf-8'));
    // collapse runs of spaces and hyphens into a single hyphen
    $slug = preg_replace($spacesDuplicateHyphens, '-', $slug);
    // trim leading and trailing hyphens
    $slug = trim($slug, '-');
    // replace accented characters with their unaccented counterparts
    $slug = str_replace($a, $b, $slug);
    // percent-encode whatever non-ASCII characters remain
    $slug = urlencode($slug);
    return $slug;
    }
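
To see what the two regular expressions in this function do to an Arabic title, here is a reduced sketch of just those steps. The name `slug_core` is hypothetical, the lowercasing and accent replacement are left out, and the sketch adds the `u` modifier to the second pattern as well so that both work on characters rather than bytes; that last point is a change made for the sketch, not part of the posted function.

```php
<?php
// Hypothetical, reduced version of the posted function: regex steps only.
function slug_core(string $title): string
{
    // Drop everything that is not a Unicode letter, number, whitespace or hyphen.
    $slug = preg_replace('/[^\-\s\pN\pL]+/u', '', $title);
    // Collapse each run of whitespace and hyphens into a single hyphen.
    $slug = preg_replace('/[\-\s]+/u', '-', $slug);
    // Strip hyphens left at either end.
    return trim($slug, '-');
}

$arabic = slug_core('عنوان الموضوع!');
$latin  = slug_core('Hello,  World -- 2015');

echo $arabic, PHP_EOL;
echo $latin, PHP_EOL;
```

The Arabic letters survive because `\pL` matches letters of any script, so the slug itself is produced correctly; the failure happens later, when the request reaches the rewrite rule.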

and here are the .htaccess lines:

RewriteEngine on

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-l
RewriteRule ^([0-9]+)/([a-zA-Z0-9_-]+)/?$ article.php?id_art=$1 [NC,L]
  • Hm, shouldn't you have added this to the question above, using the `edit` button right below it? I mean, this is not an answer, is it? :-) No problem: you can still do that. Just copy the content into the question and delete this "answer" afterwards. – arkascha Apr 20 '15 at 07:22
  • I don't really see the point _why_ you replace all accented letters with unaccented counterparts. Apparently this is an attempt to make the slug suitable for a URL, but as written before: such handcrafted attempts typically are not sufficient; again and again you will hit cases where leftover characters make your URL invalid. But that is not even the point here. One question instead: why do you treat accented Latin characters specially, yet want to keep Arabic and other scripts unchanged? That does not make sense, sorry... – arkascha Apr 20 '15 at 07:25
  • However, the real issue is in your rewriting rules: the rule itself explicitly states that it only handles requests whose slug part consists of ASCII letters, digits, underscores or hyphens and nothing else. Obviously that will _not_ match Arabic characters, which probably is your issue at hand. I will write a separate answer for this, because that is probably what you have to change to get things working. Nevertheless: do not forget to think about the remarks on your slug function I wrote above... – arkascha Apr 20 '15 at 07:28