
This question is about URL rewriting for titles that contain Arabic characters. We want to generate a URL-friendly version of each title inserted into the database using PHP/MySQL. For English titles everything works fine, using the function below:

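// Replace spaces with hyphens.
$rlink = preg_replace('/ /i', '-', $link);
// Drop everything except ASCII letters, digits and hyphens.
$rlink2 = preg_replace('/[^a-z0-9\-]/i', '', $rlink);
// Collapse double hyphens into one.
$rlink2 = preg_replace('/--/i', '-', $rlink2);
return $rlink2;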

This function cannot be used for Arabic titles: it removes every character outside a-z, 0-9 and the hyphen, so all the Arabic letters are stripped.

How can I use an Arabic title in the URL with PHP/MySQL, as in the example below? http://localhost/html/test/الصفحة-الرئيسية/

Thank you for your suggestions.

hakre

2 Answers


You need to set your charset to UTF-8 to be able to store Arabic characters, so try adding this to your PHP before you insert the records into the DB (assuming you're running MySQL):

// SET NAMES sets character_set_client, character_set_connection
// and character_set_results in a single statement.
mysql_query('SET NAMES utf8');
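
The mysql_* functions used above were removed in PHP 7, so with the mysqli extension the same connection setup might look like the sketch below. The helper name `open_utf8_connection` and the credentials in the usage comment are placeholders, not something from the question. `utf8mb4` is used on the assumption that the server is MySQL 5.5.3 or later, since it covers the full Unicode range.

```php
<?php
// Minimal sketch: open a mysqli connection and switch it to UTF-8.
// open_utf8_connection() is a hypothetical helper name.
function open_utf8_connection(string $host, string $user, string $pass, string $dbname): mysqli
{
    // Throw exceptions on errors instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $db = new mysqli($host, $user, $pass, $dbname);

    // Sets the connection charset, like SET NAMES, and also tells the
    // client library which charset is in use (relevant for escaping).
    $db->set_charset('utf8mb4');

    return $db;
}

// Usage (placeholder credentials):
// $db = open_utf8_connection('localhost', 'user', 'secret', 'test');
```

The table and column charsets still need to be UTF-8 as well; the connection charset only controls how data travels between PHP and MySQL.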

Also, there are a few other questions on SO that address similar issues. Check out:

How to make MySQL handle UTF-8 properly

setting utf8 with mysql through php

PHP/MySQL with encoding problems

Storing and displaying unicode string (हिन्दी) using PHP and MySQL

EDIT: I may have misread the question. Were you trying to rewrite the actual URL to use Arabic characters, or to store the URLs in MySQL?

Community
ply

You can use the Unicode character classes:

http://www.php.net/manual/en/regexp.reference.unicode.php

These let you match Unicode characters by their properties, e.g. whether a character is a letter, a number or a punctuation mark.

So your line:

$rlink2 = preg_replace ('/[^a-z0-9\-]/i', '', $rlink);

would become:

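$rlink2 = preg_replace ('/[^\pL\pN\-]/u', '', $rlink);

where \pL matches any character in the class 'L' (any letter) and \pN matches any character in the class 'N' (any number). Note the u modifier in place of i: it makes the regex engine treat the pattern and the subject as UTF-8. Without it the subject is processed byte by byte and multibyte characters get corrupted. The i modifier is no longer needed, because \pL already covers both upper and lower case.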

I'm not an Arabic speaker, but I'm pretty sure you'll also need to include at least one other class so that the diacritics and joining characters that accompany Arabic letters survive. Those aren't classed as letters; the vowel marks, for instance, fall under the class 'M' (mark), so \pM is a likely candidate.
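
To illustrate the point about characters that are not classed as letters, here is a small sketch. It assumes PHP 7 or later (for the `\u{...}` escapes) and uses the Arabic kasra, U+0650, which belongs to the Unicode category Mn (nonspacing mark). The variable names are made up for the example.

```php
<?php
// "kitab" (book) written with a kasra: kaf, kasra (U+0650), ta, alef, ba.
$word = "\u{0643}\u{0650}\u{062A}\u{0627}\u{0628}";

// Letters and numbers only: the diacritic is stripped.
$lettersOnly = preg_replace('/[^\pL\pN\-]/u', '', $word);

// Letters, numbers and marks: the diacritic survives.
$withMarks = preg_replace('/[^\pL\pN\pM\-]/u', '', $word);

// True when the kasra has been removed from the word.
var_dump($lettersOnly === "\u{0643}\u{062A}\u{0627}\u{0628}");
// True when the word is unchanged.
var_dump($withMarks === $word);
```

Whether to keep diacritics in a URL at all is a design choice; most everyday Arabic text is written without them.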

Once you've updated the regex, that should be all you need, so long as your MySQL connection and tables are set to use UTF-8.
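
Putting the pieces together, a Unicode-aware version of the question's function could be sketched as follows. The name `make_slug` is hypothetical, and including `\pM` is an assumption about which extra class is wanted; the URL prefix is the one from the question.

```php
<?php
// Hypothetical helper: turn a title into a URL slug, keeping non-Latin letters.
function make_slug(string $title): string
{
    $slug = preg_replace(
        [
            '/\s+/u',             // runs of whitespace -> a single hyphen
            '/[^\pL\pN\pM\-]/u',  // drop anything that is not a letter, number, mark or hyphen
            '/-{2,}/',            // collapse repeated hyphens
        ],
        ['-', '', '-'],
        trim($title)
    );

    // preg_replace() returns null on error, e.g. when the input is not valid UTF-8.
    return trim($slug ?? '', '-');
}

$slug = make_slug('الصفحة الرئيسية');

// Percent-encode the slug when building the link; rawurldecode() reverses it.
$url = 'http://localhost/html/test/' . rawurlencode($slug) . '/';
echo $url, "\n";
```

Browsers generally display the percent-encoded path in its readable Arabic form, while the request itself carries the encoded bytes.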

Danack
  • Please can you upvote and accept the answer below or if you've fixed the problem yourself, say what the fix was. – Danack Jun 28 '12 at 02:03
  • @Danack57, I used your instructions below and put them in the script with UTF-8 encoding, using '/[^a-z0-9\-]/u', not /i. The Arabic characters now appear correctly in the browser URL and preg_match works properly with special characters. –  Jun 28 '12 at 08:49