0

Possible Duplicate:
how to replace special characters with the ones they’re based on in PHP?

I have a string that looks like this:

ABCÇĆDEFGHÎIïJ123450086

In PHP how can I make it appear as:

ABCDEFGHIJ123450086

without having to manually replace each unwanted character? Can I use some kind of regex for this, and if so, how?

I just want A-Z and 0-9, no other foreign characters (as in, remove them).

Ethan Allen
  • Your question is _extremely_ unclear. Do you want to remove the characters? Convert them to non-accented form? What about `א`? – SLaks Jan 10 '13 at 20:50
  • Check this http://stackoverflow.com/a/1891343/948301 – ravi404 Jan 10 '13 at 20:51
  • You should look at the `strtr()` function, here is an example function someone has posted. http://us3.php.net/manual/en/function.strtr.php#90925 – kittycat Jan 10 '13 at 21:07

5 Answers5

4

Use character classes:

$string = preg_replace('/[^\w\d]/', '', $string);

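Replaces every character that is not ([^...]) a word character (\w: letters, digits and the underscore) or a digit (\d) with an empty string. Note that \w already covers digits and also keeps "_"; use [^A-Za-z0-9] if underscores should be removed too.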
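As a rough illustration of what `\w` does and does not strip, the sketch below compares it with an explicit ASCII class. The sample string and variable names are made up for the example, and the `\w` result assumes UTF-8 input, no `u` modifier, and PHP's default "C" locale.

```php
<?php
// Hypothetical sample: ASCII letters, an underscore, an accented letter, a digit.
$sample = 'AB_Ç1';

// \w covers letters, digits and the underscore, so "_" survives.
$viaWordClass = preg_replace('/[^\w\d]/', '', $sample);

// An explicit ASCII whitelist removes the underscore as well.
$viaWhitelist = preg_replace('/[^A-Za-z0-9]/', '', $sample);

var_dump($viaWordClass); // under the assumptions above: "AB_1"
var_dump($viaWhitelist); // "AB1"
```

With the `u` modifier or a different locale, `\w` may match more than ASCII, so the explicit class is the more predictable choice when only A-Z and 0-9 are wanted.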

knittl
2

A nice function:

/**
 * Strip accents
 *
 * @param string $str string to clean
 * @param string $encoding encoding type (example : utf-8, ISO-8859-1 ...)
 */
function strip_accents($str, $encoding='utf-8') {
    // Convert accented characters to HTML entities
    $str = htmlentities($str, ENT_NOQUOTES, $encoding);

    // Replace each entity with its base letter
    // Example: "&eacute;" => "e", "&Eacute;" => "E", "&agrave;" => "a" ...
    $str = preg_replace('#&([A-Za-z])(?:acute|grave|cedil|circ|orn|ring|slash|th|tilde|uml);#', '\1', $str);

    // Replace ligatures such as Œ, Æ ...
    // Example: "&oelig;" => "oe"
    $str = preg_replace('#&([A-Za-z]{2})(?:lig);#', '\1', $str);
    // Delete any remaining entities
    $str = preg_replace('#&[^;]+;#', '', $str);

    return $str;
}

// Example
$texte = 'Ça va mon cœur adoré?';
echo strip_accents($texte);
// Output : "Ca va mon coeur adore?"

Source: http://www.infowebmaster.fr/tutoriel/php-enlever-accents

Fabien Sa
0

Assuming you want to remove them, you can use preg_replace to replace every character outside the ranges a-z, A-Z and 0-9 with an empty string ('').

Otherwise use the translation technique given in the other thread.
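To make the two options concrete, here is a minimal sketch that runs both on the string from the question. The `$map` table is hand-written and deliberately incomplete (an assumption for this example, not the code from the other thread); a real table would need many more entries.

```php
<?php
$input = 'ABCÇĆDEFGHÎIïJ123450086';

// Option 1: strip everything outside A-Z, a-z and 0-9.
$stripped = preg_replace('/[^A-Za-z0-9]/', '', $input);

// Option 2: translate accented letters first, then strip whatever is left.
// $map is a hypothetical, incomplete lookup table for illustration only.
$map = ['Ç' => 'C', 'Ć' => 'C', 'Î' => 'I', 'ï' => 'i'];
$translated = preg_replace('/[^A-Za-z0-9]/', '', strtr($input, $map));

echo $stripped, "\n";   // ABCDEFGHIJ123450086
echo $translated, "\n"; // ABCCCDEFGHIIiJ123450086
```

The first result matches the output shown in the question; the second shows what "convert to the base letter" would produce instead.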

Peter Wooster
0

You can always use regular expressions.

preg_replace('/[^A-Za-z0-9]/', '', $some_str);
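In a pattern like this, the caret's position changes the meaning: inside the brackets it negates the character class, while in front of them it anchors the match to the start of the string. A small sketch with a made-up input shows the difference:

```php
<?php
// Hypothetical input for illustration.
$someStr = 'ABÇ1';

// Caret before the class: matches one allowed character at the start only.
$anchored = preg_replace('/^[A-Za-z0-9]/', '', $someStr);

// Caret inside the class: matches every character NOT in the list.
$negated = preg_replace('/[^A-Za-z0-9]/', '', $someStr);

var_dump($anchored); // "BÇ1" (only the leading "A" is removed)
var_dump($negated);  // "AB1" (the non-ASCII character is removed)
```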

Mr. Llama
0

Use a whitelist:

$input = 'ABCÇĆDEFGHÎIïJ123450086';
$filtered = preg_replace("~[^a-zA-Z0-9]+~","", $input);
bitWorking
  • That would remove the extended characters, not replace them with their ASCII "equivalents". – glomad Jan 10 '13 at 20:56
  • It works!! I've tried it. – bitWorking Jan 10 '13 at 20:58
  • Yes, it *works*, but it does not do what the OP is asking. OP wants something that will turn `ABÇ` into `ABC`. Your answer will turn it into `AB`. – glomad Jan 10 '13 at 21:00
  • @ithcy: you have to read the question. There aren't 3 `C` in `ABCDEFGHIJ123450086`. – bitWorking Jan 10 '13 at 21:01
  • I did read the question. You are misunderstanding. Try your answer on the string `'ABÇ'`. It will become `'AB'` because you are *stripping out* the extended characters instead of *replacing* them with ASCII. Get it? – glomad Jan 10 '13 at 21:03
  • @ithcy: No you are wrong. Run my script! The result is: `ABCDEFGHIJ123450086` and this is what he was looking for. – bitWorking Jan 10 '13 at 21:05
  • @ithcy: yes, that's exactly what's being asked. Example input: `CÇĆ`; Example result: `C`. Only the `C` got kept. – Mr. Llama Jan 10 '13 at 21:06
  • Sorry but I am not wrong. Try your answer on `'ABÇ'`. Just those 3 characters. Not `'ABCÇĆDEFGHÎIïJ123450086'`. Now do you understand? The expected result would be `'ABC'`. Do you get `ABC`? No you don't. You get `AB`. – glomad Jan 10 '13 at 21:06
  • Better yet, just try it on `Ç`. You will get an empty string, where the OP wants `C`. Now do you understand?? – glomad Jan 10 '13 at 21:08
  • @ithcy: don't be bullish – bitWorking Jan 10 '13 at 21:08
  • Sorry, I don't mean to be! I just got frustrated. My apologies :) – glomad Jan 10 '13 at 21:09
  • It will be really funny if the OP comes along now and says "actually I did want the empty string" :) – glomad Jan 10 '13 at 21:12