Imagine a page title string in any given language (English, Arabic, Japanese, etc.) containing several words in UTF-8. Example:
$stringRAW = "Blues & μπλουζ Bliss's ブルース Schön";
Now this needs to be converted into something that is a valid portion of that page's URL:
$stringURL = "blues-μπλουζ-bliss-ブルース-schön";
Just check out this link. This works on my server too!
Q1. What characters are allowed in a valid URL these days? I remember having seen whole Arabic strings sitting in the browser, and I tested it on my Apache 2 and it all worked fine.
I guess it must become: $stringURL = "blues-blows-bliss-black";
Q2. What existing PHP functions do you know that encode/convert these UTF-8 strings correctly for a URL, stripping them of any invalid characters?
I guess that at least:
1. Spaces should be converted into dashes (-).
2. Invalid characters should be deleted. Which are they? '@' and '&'?
3. All letters should be converted to lower case (or are capital letters valid in URLs?).
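To make those three steps concrete, here is a rough PHP sketch of what I have in mind. `slugify` is a name I made up (it is not an existing PHP function), it assumes the mbstring extension and valid UTF-8 input, and dropping a trailing 's is only my guess at how to get "bliss" out of "Bliss's". It keeps every Unicode letter and digit instead of transliterating, so I am not sure it is the right approach:

```php
<?php
// Hypothetical helper, not a built-in; requires the mbstring extension.
function slugify(string $title): string
{
    // Step 3: lower-case in a UTF-8 aware way (plain strtolower() is byte-based).
    $slug = mb_strtolower($title, 'UTF-8');

    // Assumption: drop a possessive 's so that "bliss's" becomes "bliss".
    $slug = preg_replace("/'s\\b/u", '', $slug);

    // Steps 1 and 2: collapse every run of characters that is not a
    // Unicode letter, combining mark, or digit into a single dash.
    $slug = preg_replace('/[^\p{L}\p{M}\p{N}]+/u', '-', $slug);

    // Remove dashes left over at either end.
    return trim($slug, '-');
}

echo slugify("Blues & μπλουζ Bliss's ブルース Schön"), "\n";
// prints: blues-μπλουζ-bliss-ブルース-schön
```

With this, '@' and '&' are treated like spaces (they become a dash) rather than being deleted outright, which is one of the things I am unsure about.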
Thanks: your suggestions are much appreciated!