0

I am using the DOM HTML document functions to scrape HTML and store it in MySQL, but I have noticed that for foreign languages such as Chinese or Japanese, some weird characters are stored in MySQL, and I don't think anyone can read them: 门户,新闻,ータル,検索

So my question is: can I convert this back into its original form with code?

If not, I want to remove it from my table because it is of no use. How can I remove only these characters from the table?

hakre
leon

2 Answers

0

This should do the trick if the input is actually UTF-8:

echo iconv('UTF-8', 'ASCII//IGNORE', 'news,门户,æ–°é—»,portal,网易,163,china,门户ç');
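
How `//IGNORE` behaves can depend on the iconv implementation PHP was built against, so as a rough alternative, here is a minimal sketch that drops every non-ASCII byte with a regular expression instead. The helper name `stripNonAscii` is hypothetical and is not part of the original answer; it assumes the text is UTF-8 (or any encoding in which non-ASCII characters use only bytes 0x80-0xFF).

```php
<?php
// Hypothetical helper (not from the original answer): remove every
// byte outside the 7-bit ASCII range. In UTF-8, all bytes of a
// multi-byte character are >= 0x80, so whole characters disappear.
function stripNonAscii(string $text): string
{
    // No /u modifier: the pattern works on raw bytes, so it also
    // copes with input that is not valid UTF-8.
    $result = preg_replace('/[\x80-\xFF]+/', '', $text);

    // preg_replace() returns null on error; fall back to the input.
    return $result ?? $text;
}

// "\u{95E8}\u{6237}" is a two-character Chinese word used as sample data.
echo stripNonAscii("news,\u{95E8}\u{6237},portal,163,china"), "\n";
// prints: news,,portal,163,china
```

Note that the separators survive, so a stripped keyword leaves an empty field between two commas; that may need cleaning up separately.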
Álvaro González
-2

The best way is to store this data after Base64-encoding it:

1.) Unicode to UTF-8

2.) UTF-8 to Base64
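
The two steps above could be sketched roughly as follows. This is only an illustration: the variable names are made up, the sample text is assumed to be UTF-8 already (so step 1 is a no-op here), and the database write itself is left out.

```php
<?php
// Sample data, written with escapes so the file encoding does not matter.
// Step 1 is assumed to be done: this string is already UTF-8.
$utf8 = "\u{95E8}\u{6237},\u{65B0}\u{95FB}";

// Step 2: Base64 turns the UTF-8 bytes into plain ASCII, which passes
// through any connection or column character set unchanged.
$stored = base64_encode($utf8);

// Reading back: decode in strict mode, which returns false if the
// stored value contains characters outside the Base64 alphabet.
$restored = base64_decode($stored, true);

var_dump($restored === $utf8); // bool(true)
```

Base64 is an encoding, not encryption, and it makes the stored value about a third larger and impossible to search as text, so this trades readability in the table for safety against character set conversion.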

Hoshin
  • Base64?? This isn't encryption... I have never used Base64 encoding for data storage. – leon Jan 27 '11 at 11:40