How can I remove only � (using cURL to get the data)?

$str = "Check this out <a href=�http://www.somewebsite.com�>Somewebsite</a>, this is a great website
Windows� (XP 32bit/Vista/7/8/8.1)";

I just want the � removed. I tried:

$output = preg_replace("/[^A-Za-z0-9]/","",$str);

It removes the HTML as well, but I want to keep the HTML.

Anand S Kumar
Harinder
  • What are you asking? To solve your problem with the encoding, or just to remove the characters that do not belong to the encoding of the string? – Federkun Aug 02 '15 at 11:57
  • @Leggendario Sorry ... i did not mention before ... i am using curl to get this data.... – Harinder Aug 02 '15 at 12:02
  • You have an **encoding problem** which you need to solve by **handling encodings correctly.** Not by removing "incorrect" characters. – deceze Aug 02 '15 at 12:14
  • @deceze I am using UTF-8 ... and tried others also but same result .... which encoding should I use? – Harinder Aug 02 '15 at 12:15
  • possible duplicate of [HTML encoding issues - "Â" character showing up instead of " "](http://stackoverflow.com/questions/1461907/html-encoding-issues-%c3%82-character-showing-up-instead-of-nbsp) – AD7six Aug 02 '15 at 13:50
  • @Harinder if the above duplicate doesn't help - look for one of the **many** other duplicate question terms to search for: html utf8 BOM encoding problems. – AD7six Aug 02 '15 at 13:51

1 Answer

Rather than working around the symptom like that, you should fix the underlying charset issue. The likely cause is that the same character encoding is not used at every level of your application and scripts. Anything that has, or can be set to, a specific character encoding should be set to the same one. The most common places are listed below.

  • Save the document as UTF-8 (or UTF-8 w/o BOM). If you're using Notepad++, it's Format -> Convert to UTF-8 or UTF-8 w/o BOM.
  • The header in both PHP and HTML should be set to UTF-8:
    • HTML: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />, inside the <head> tag of your document.
    • PHP: header('Content-Type: text/html; charset=utf-8'); - PHP headers have to be set BEFORE any output is made (no HTML, no whitespace, no echo/print, nothing).
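
As a rough illustration of the two header settings above working together, here is a minimal sketch of a page. The variable names and the sample text are made up for the example, and it assumes the file itself is saved as UTF-8 without a BOM:

```php
<?php
// Send the charset header first: nothing (not even whitespace) may be output before this line.
header('Content-Type: text/html; charset=utf-8');

// Hypothetical sample content; assumes this source file is saved as UTF-8 without a BOM.
$title = 'Encoding test';
$body  = 'Windows® (XP 32bit/Vista/7/8/8.1)';

// Build the page with a matching <meta> charset declaration inside <head>.
$page = '<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>' . htmlspecialchars($title, ENT_QUOTES, 'UTF-8') . '</title>
</head>
<body>
<p>' . htmlspecialchars($body, ENT_QUOTES, 'UTF-8') . '</p>
</body>
</html>';

echo $page;
```

With the file, the HTTP header and the meta tag all agreeing on UTF-8, the browser should have no reason to guess a different encoding.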

Other parts of your setup might also need to be set to UTF-8, depending on which PHP functions you are using and so on. But the above is generally a good start.
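
Since the data in the question comes from cURL, one of those other aspects may be the encoding of the fetched page itself. The following is only a sketch: `to_utf8()` and `fetch_as_utf8()` are hypothetical helper names, and it assumes (without any confirmation from the question) that the remote page is served as Windows-1252 whenever it is not already valid UTF-8. It needs the mbstring extension.

```php
<?php
// Hypothetical helper: return the input as UTF-8.
// Assumption: anything that is not valid UTF-8 is Windows-1252; adjust to the real source charset.
function to_utf8(string $raw, string $assumedCharset = 'Windows-1252'): string
{
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw; // already valid UTF-8, leave it untouched
    }
    return mb_convert_encoding($raw, 'UTF-8', $assumedCharset);
}

// Hypothetical wrapper: fetch a URL with cURL and normalise the body to UTF-8.
function fetch_as_utf8(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    $raw = curl_exec($ch);
    if ($raw === false) {
        throw new RuntimeException('cURL request failed');
    }
    return to_utf8($raw);
}

// Demo without network access: in Windows-1252, 0x93/0x94 are curly quotes and 0xAE is the registered sign.
$sample = "<a href=\x93http://www.somewebsite.com\x94>Somewebsite</a> Windows\xAE";
$fixed  = to_utf8($sample);

echo $fixed, PHP_EOL;
```

If the real page declares its charset in the Content-Type response header or in a meta tag, that declared value would be a better choice for the second argument than the guess used here.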

Qirel