I am putting some user-provided content in my URLs for SEO purposes, using this code to clean it up:
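/**
 * Create URL-friendly strings or filenames.
 *
 * @param string       $str       Input string, expected to be UTF-8
 * @param array|string $replace   Characters to turn into spaces before cleaning
 * @param string       $delimiter Separator placed between words
 * @return string
 */
public static function toAscii($str, $replace = array(), $delimiter = '-') {
    if (!empty($replace)) {
        $str = str_replace((array) $replace, ' ', $str);
    }
    // Transliterate to ASCII (e.g. accented Latin letters to plain ones)
    $clean = iconv('UTF-8', 'ASCII//TRANSLIT', $str);
    // Drop everything except letters, digits and separator characters
    $clean = preg_replace("/[^a-zA-Z0-9\/_|+ -]/", '', $clean);
    $clean = strtolower(trim($clean, '-'));
    // Collapse each run of separator characters into a single delimiter
    $clean = preg_replace("/[\/_|+ -]+/", $delimiter, $clean);
    return $clean;
}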
However, this turns out not to be enough. An article containing some Hebrew characters gave me:
iconv(): Detected an illegal character in input string
Is there a silver-bullet function out there to safely make strings into pretty URLs? At the very least I would like it NOT to crash. Then, it'd be nice if the URL still looked nice and SEO-friendly.
Today it was Hebrew, but tomorrow it may be Russian, Chinese, Klingon...
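To make the goal concrete, here is a rough sketch of the kind of behavior I am after. It is not a tested solution: the function name toSlug is hypothetical, it assumes PHP 7 or later, and it only romanizes non-Latin scripts when the intl extension (the Transliterator class) is installed. Without intl it falls back to iconv with //IGNORE, whose behavior varies by platform, so any failure there is simply swallowed and the leftover non-ASCII bytes are stripped.

```php
<?php
/**
 * Hypothetical helper (not part of my existing code): build a slug
 * without raising errors on non-Latin or malformed input.
 * Assumes $delimiter is a single plain character such as '-' or '_'.
 */
function toSlug(string $str, string $delimiter = '-'): string
{
    if (class_exists('Transliterator')) {
        // intl is available: romanize any script, then reduce to ASCII.
        $translit = Transliterator::create('Any-Latin; Latin-ASCII');
        if ($translit !== null) {
            $out = $translit->transliterate($str);
            if ($out !== false) {   // false means the input could not be processed
                $str = $out;
            }
        }
    } else {
        // No intl: try iconv, suppressing the notice it raises on
        // characters it cannot convert, and keep the input if it fails.
        $out = @iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $str);
        if ($out !== false) {
            $str = $out;
        }
    }

    // Whatever is left that is not an ASCII letter or digit becomes a
    // delimiter; runs collapse into one because of the "+" quantifier.
    $clean = preg_replace('/[^a-zA-Z0-9]+/', $delimiter, $str);

    // Trim leading/trailing delimiters and lowercase the result.
    return strtolower(trim($clean, $delimiter));
}

echo toSlug('Hello World!'), "\n"; // hello-world
echo toSlug('שלום עולם'), "\n";    // romanized with intl; may be empty without it
```

Note that the result can be an empty string when nothing can be transliterated (for example Hebrew input without intl), so the caller would still need a fallback such as the article ID.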