
I have some text as below:

$cnt = "header text�
first line
�

The second line

�

other line
�"

Can I remove all special characters like � from my string? I used $cnt = str_replace("�", "", $cnt); but nothing changed.

Can you help me?

Marc B
Nguyen Bui
  • It looks like a character encoding mess to me. Maybe it is produced when the values are stored in the database. You should use `utf8_encode($myval);` before you store them. – Hackerman Jul 08 '14 at 17:05
  • It looks like line breaks that got broken. (Line breaks are commonly `\r\n` or `\n` character sequences.) Where do you get the string from? – Guffa Jul 08 '14 at 17:21
  • Hi Robert, I tried your suggestion, but a string with Vietnamese text then shows up like this: "Bá»nh do Än uá»ng" – Nguyen Bui Jul 08 '14 at 17:29
  • Hi Guffa, the string is read from some web pages. – Nguyen Bui Jul 08 '14 at 17:31
  • @NguyenBui: How do you read the string? Do you read it from file? Does it come from user input? – Guffa Jul 08 '14 at 17:36
  • @Guffa: I use curl to fetch the HTML content of web pages, then use DOM to extract some elements. – Nguyen Bui Jul 08 '14 at 17:42
  • @NguyenBui: Then there are several things that could be wrong, like character encoding and line break parsing. If you can find out the character codes of those `�` characters that might give a hint. A character shown as `�` can be any character that isn't covered by the font that you are using to view the text. – Guffa Jul 08 '14 at 20:37
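
To follow up on the suggestion to find out the character codes, here is a minimal sketch of dumping the raw bytes of the string. The sample value of `$cnt` is made up for illustration (the real one would come from the curl/DOM step), and the lone `0x92` byte is only an assumption about what the page might contain.

```php
<?php
// Hypothetical sample: "first line", then a lone 0x92 byte
// (not valid on its own in UTF-8), then a newline.
$cnt = "first line\x92\n";

// bin2hex() exposes the raw bytes, independent of how a font or
// browser chooses to render them.
$hex = bin2hex($cnt);
echo chunk_split($hex, 2, ' '), PHP_EOL;
// bytes: 66 69 72 73 74 20 6c 69 6e 65 92 0a

// preg_match() with the /u modifier fails on a subject that is not
// well-formed UTF-8, so it doubles as a quick validity check.
$isValidUtf8 = preg_match('//u', $cnt) === 1;
var_dump($isValidUtf8);
```

If the dump shows bytes such as `ef bf bd`, the string really contains U+FFFD replacement characters. If it shows other bytes above `7f` that do not form valid UTF-8 sequences, the `�` is only how the viewer renders them, which would explain why a `str_replace` on the literal `�` finds nothing.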

2 Answers

0

What's your target character encoding? You may want to strip all non-UTF-8 characters from your string. See here:

https://stackoverflow.com/a/4266468/1267408
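
As a rough illustration of what stripping non-UTF-8 bytes could look like, here is a sketch that uses only `preg_replace`. It is not necessarily the approach taken in the linked answer. The function name `stripInvalidUtf8` and the sample strings are hypothetical, and the pattern is a loose structural check (it does not reject every overlong or surrogate encoding).

```php
<?php
// Hypothetical helper: keeps runs of structurally valid UTF-8 sequences
// (capture group 1) and drops any other single byte.
function stripInvalidUtf8(string $text): string
{
    $pattern = '/((?:[\x00-\x7F]'       // 1-byte (ASCII)
        . '|[\xC2-\xDF][\x80-\xBF]'     // 2-byte sequence
        . '|[\xE0-\xEF][\x80-\xBF]{2}'  // 3-byte sequence
        . '|[\xF0-\xF4][\x80-\xBF]{3}'  // 4-byte sequence
        . '){1,100})'                   // bounded run to limit backtracking
        . '|./s';                       // any stray byte: matched, not captured
    return preg_replace($pattern, '$1', $text);
}

// Made-up input: a stray 0x92 byte plus valid Vietnamese text ("Bánh").
$dirty = "first line\x92\nB\xC3\xA1nh";
$clean = stripInvalidUtf8($dirty);

// A genuine U+FFFD character is valid UTF-8 (bytes EF BF BD), so it
// survives the step above and has to be removed by its byte sequence.
$noReplacementChars = str_replace("\xEF\xBF\xBD", '', "a\xEF\xBF\xBDb");
```

Writing the replacement character as `"\xEF\xBF\xBD"` instead of pasting `�` into the source avoids depending on the encoding of the PHP file itself.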

Community
jontlymon
0

Try this out

$str = str_replace("\xEF\xBB\xBF",'',$str); 

It removes the UTF-8 BOM (byte order mark) bytes from the string.
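
Note that the BOM (U+FEFF, bytes `EF BB BF`) is a different byte sequence from the replacement character `�` (U+FFFD, bytes `EF BF BD`), so this only helps if the input actually carries a BOM. A small variation, sketched below, strips the BOM only when it is at the very start of the string. The helper name `stripLeadingBom` and the sample input are hypothetical.

```php
<?php
// Hypothetical helper: removes a UTF-8 BOM only if it forms the
// first three bytes of the string; everything else is left untouched.
function stripLeadingBom(string $str): string
{
    $bom = "\xEF\xBB\xBF";
    return strncmp($str, $bom, 3) === 0 ? substr($str, 3) : $str;
}

// Made-up input with a BOM in front of the text.
$withBom = "\xEF\xBB\xBFheader text";
$clean = stripLeadingBom($withBom);
```

Checking the prefix explicitly leaves the same three bytes alone if they ever occur in the middle of the content.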

Vivek Pratap Singh