2

I have been trying to remove junk characters from a stream of HTML strings using PHP, but haven't been successful yet. Is there any special syntax or logic for removing special characters from a string?

This is what I have tried so far, but it isn't working:

 $new_string = preg_replace("�", "", $HtmlText); 
 echo '<pre>'.$new_string.'</pre>';
Arun Selva Kumar
  • 1
    if you see '�' it means you're using the wrong character encoding. It is used by PHP to represent anything it cannot render. So, it can be anything. – KIKO Software Feb 16 '15 at 10:56
  • 5
    When you get �'s there is always something wrong: character encoding, page charset, database charset and the like. Removing �'s is symptom treatment, but it does not cure the disease. – davidkonrad Feb 16 '15 at 10:58
  • As David said, your database encoding is incorrect for the content being stored. I'm guessing you need some UTF encoding and it's probably using a plain character set currently. As stated, replacing/stripping will remove them but is a very poor answer :) – Brian Feb 16 '15 at 11:02
  • 1
    possible duplicate of [UTF-8 all the way through](http://stackoverflow.com/questions/279170/utf-8-all-the-way-through) – HamZa Feb 16 '15 at 11:23

3 Answers

0
\p{S}

You can use \p{S}, which matches math symbols, currency signs, dingbats, box-drawing characters, etc.

See demo.

https://www.regex101.com/r/rK5lU1/30

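// The u modifier makes PCRE treat pattern and subject as UTF-8;
// without it, \p{S} is tested against single bytes and never matches �.
$re = '/\p{S}/u';
$str = "asdas�sadsad";
$subst = "";

$result = preg_replace($re, $subst, $str);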
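
Note that \p{S} covers every symbol, including legitimate ones such as € or ©. If only the replacement character (U+FFFD) should be removed, a narrower pattern may be preferable. The following is a minimal sketch under that assumption; stripReplacementChar is a hypothetical helper name, not something from the question, and it requires PHP 7 or later for the "\u{...}" escape and the ?? operator.

```php
<?php
// Hypothetical helper: removes only U+FFFD and leaves other symbols intact.
function stripReplacementChar(string $text): string
{
    // The u modifier makes PCRE treat pattern and subject as UTF-8.
    $result = preg_replace('/\x{FFFD}/u', '', $text);

    // preg_replace() returns null on error, for example when $text is not
    // valid UTF-8; in that case hand back the input unchanged.
    return $result ?? $text;
}

echo stripReplacementChar("asdas\u{FFFD}sadsad"), "\n"; // prints "asdassadsad"
```

This still only hides the symptom: the � stands in for bytes that were already lost or misread earlier in the pipeline.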
vks
0

This is due to a charset mismatch between the database and the front end. Making both use the same encoding fixes the issue.
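
As an illustration of what such a correction might look like on the PHP side, here is a minimal sketch. It assumes the mbstring extension is available and that any text which is not valid UTF-8 is actually Windows-1252; both the toUtf8 name and the Windows-1252 default are assumptions for the example, so substitute the real source encoding of your data.

```php
<?php
// Hypothetical helper. Assumption: bytes that are not valid UTF-8 are
// Windows-1252. Pass the real source encoding if it is something else.
function toUtf8(string $text, string $assumedSource = 'Windows-1252'): string
{
    // Already valid UTF-8: return as-is to avoid double-encoding.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    return mb_convert_encoding($text, 'UTF-8', $assumedSource);
}

$legacy = "caf\xE9";        // "café" stored as Windows-1252 (single byte 0xE9)
$fixed  = toUtf8($legacy);  // é becomes the two UTF-8 bytes 0xC3 0xA9
echo bin2hex($fixed), "\n"; // prints 636166c3a9
```

Converting at the boundary is only half of the fix. The page also has to declare the same encoding, for example with header('Content-Type: text/html; charset=utf-8'), and the database connection has to use it as well (with mysqli that would be $mysqli->set_charset('utf8mb4'), assuming a MySQL backend, which the question does not state).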

Arun Selva Kumar
-3

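function clean($string) {
    // Removes every character except ASCII letters, digits and hyphens.
    // Note: this also strips spaces, punctuation and HTML markup characters.
    return preg_replace('/[^A-Za-z0-9\-]/', '', $string);
}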

Param sohi