
I am using PHP to read data from old machines and output it.

PuTTY shows:

▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒
▒NONE.
▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒

It's weird formatting, an attempt by the machine to present the data more cleanly.

When echoed by PHP, Chrome shows:

������
�NONE. �
������

I have tried:

$Str1 = str_replace("▒","",$Str1);

But it doesn't filter them out. The output is already UTF-8.

Does anyone know how to filter out these things? Maybe identify what � is to PHP?

MoonEater916
  • Figure out what the actual *byte value* is, possibly convert the encoding to UTF-8. – deceze Oct 03 '17 at 09:44
  • Possible duplicate of [PHP: Convert any string to UTF-8 without knowing the original character set, or at least try](https://stackoverflow.com/questions/7979567/php-convert-any-string-to-utf-8-without-knowing-the-original-character-set-or) – Martin Oct 03 '17 at 09:46
  • So it's UTF-8 already. I could try to find the byte value, but I am not even sure how to go about doing that. – MoonEater916 Oct 03 '17 at 10:30
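
A minimal sketch of the byte inspection suggested in the comments. The sample string is hypothetical: it assumes the frame characters arrive as single raw bytes such as 0xB3, which is a guess and not something the question confirms. `bin2hex` is core PHP and `mb_check_encoding` needs the mbstring extension. The sketch also shows one likely reason the `str_replace` attempt does nothing: a "▒" typed into a UTF-8 source file is the three bytes E2 96 92, which never occur in the raw data.

```php
<?php
// Hypothetical sample: "NONE." framed by raw 0xB3 bytes (an assumption about the device output).
$Str1 = "\xB3NONE.\xB3";

// A lone 0xB3 byte is not valid UTF-8, which is why a browser renders U+FFFD for it.
$isUtf8 = mb_check_encoding($Str1, 'UTF-8');

// Show every byte as two hex digits.
$hex = implode(' ', str_split(bin2hex($Str1), 2));

// "▒" (U+2592) encoded as UTF-8 is E2 96 92, so this replacement changes nothing.
$unchanged = str_replace("\xE2\x96\x92", '', $Str1) === $Str1;

var_dump($isUtf8);    // bool(false)
echo $hex, PHP_EOL;   // b3 4e 4f 4e 45 2e b3
var_dump($unchanged); // bool(true)
```

Once the real byte values are known, they can be removed or converted explicitly instead of guessing at the character shown on screen.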

2 Answers


Try this:

$Str1 = preg_replace('/[\x00-\x1F\x7F-\xFF]/', '', $Str1);
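
To see what this pattern does, here is a small sketch. The wrapper name `stripNonAscii` and the sample strings are made up for illustration. Without the `u` modifier the subject is matched byte by byte, so the class removes ASCII control characters (tab and newline included) as well as every byte from 0x7F to 0xFF.

```php
<?php
// Hypothetical wrapper around the pattern from the answer.
function stripNonAscii(string $s): string
{
    // No /u modifier: the subject is treated as raw bytes, not as UTF-8 characters.
    return preg_replace('/[\x00-\x1F\x7F-\xFF]/', '', $s);
}

echo stripNonAscii("\xB3NONE.\xB3"), PHP_EOL;  // NONE.
echo stripNonAscii("line1\nline2"), PHP_EOL;    // line1line2
```

If line breaks in the data matter, the range `\x00-\x1F` would need to be narrowed so that `\n` and `\r` are left alone.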
Mayank Pandeyz
Er Nilay Parekh

The problem with a regex like '/[\x00-\x1F\x7F-\xFF]/' is that it simply demolishes all UTF-8: it strips every byte outside printable ASCII, so every multibyte character is removed. Only about 1% or less of all possible characters will survive it.
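
One possible alternative that keeps valid multibyte text, sketched under assumptions: it requires the mbstring extension, and `dropInvalidUtf8` is a hypothetical name, not something from this answer. Converting a string from UTF-8 to UTF-8 with the substitute character set to 'none' should discard only the byte sequences that are not valid UTF-8.

```php
<?php
// Hypothetical helper: remove byte sequences that are not valid UTF-8, keep everything else.
function dropInvalidUtf8(string $s): string
{
    $previous = mb_substitute_character();   // remember the current substitute setting
    mb_substitute_character('none');         // emit nothing for invalid sequences
    $clean = mb_convert_encoding($s, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous);      // restore the previous setting
    return $clean;
}

// Sample input (made up): stray 0xB3 bytes around text that contains a valid "é" (C3 A9).
echo dropInvalidUtf8("\xB3NONE.\xB3 caf\xC3\xA9"), PHP_EOL;
```

With the sample above, the stray 0xB3 bytes should disappear while the "é" stays intact, which the byte-level regex cannot do.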

Why not prevent what is causing this?

With a fully UTF-8-configured DB and proper headers, this problem can happen if you use:

HoldOffHunger