
Okay, so I'm working on some plugins for some of our clients that basically take data from their MySQL database and send it to us as XML generated with SimpleXML.

When we receive the XML file, we run it through a script that puts their data into our database, and this is where the problem comes in.

When we put the client's data into our database, some of the characters are, in some cases, converted to Chinese characters. (We use UTF-8.)

I figure this could be resolved if I had a way to determine the encoding of the client's database, convert the data to UTF-8, and give the XML file the header <?xml version="1.0" encoding="UTF-8" ?>. My problem is detecting the encoding of the data coming from the client's database and converting it to UTF-8 properly.

I've had a look at PHP's mb_detect_encoding() and mb_convert_encoding(), but I am unsure how common the "Multibyte String" extension is, and I would like to keep the plugin as widely compatible as possible.

Any ideas on how best to do this? Let me know if you need more information.

EDIT: Okay, calling mysql_set_charset('UTF8') and initializing SimpleXML with <?xml version="1.0" encoding="UTF-8" ?><xml/> does the job. Thanks for the help.
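
For anyone finding this later, here is a rough sketch of the SimpleXML half of that fix. The database part is left out: the `$row` array is a made-up stand-in for a row fetched after the connection charset has been set to UTF-8, and the element name `name` is only an example.

```php
<?php
// Hypothetical row, standing in for data read over a UTF-8 connection.
// "\xC3\xB8" is the UTF-8 byte sequence for the letter o-with-stroke.
$row = array('name' => "S\xC3\xB8ren & Co");

// Start from a document whose declaration states UTF-8 explicitly.
$xml = new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8" ?><xml/>');

// Property assignment creates the child element and escapes
// XML special characters such as "&" in the value.
$xml->name = $row['name'];

echo $xml->asXML();
```

SimpleXML expects the strings it is given to be UTF-8 already, so this only helps once the data coming out of MySQL really is UTF-8.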

Accepting daid's answer, since he was the one who led me to this solution.

Kristoffer la Cour

2 Answers


To find out which character set MySQL returns:

    SHOW VARIABLES LIKE "character_set_database";
    SHOW VARIABLES LIKE "collation_database"; 

mqsoh's answer (http://stackoverflow.com/questions/7880492/php-mysql-to-simplexml-ensure-proper-encoding/7880779#7880779) shows how to convert the data.

David Bélanger
  • I think `SET NAMES` sets the encoding for the entire database? I wouldn't want to mess up the rest of my client's output. Maybe `mysql_set_charset`?.. – Kristoffer la Cour Oct 24 '11 at 19:10
  • Well, my site already uses `SET NAMES` and `mysql_set_charset`; it's the client's encoding that's the problem, but I will try `mysql_set_charset` on the client's side. And `mysql_query` does not support multiple queries. – Kristoffer la Cour Oct 24 '11 at 19:18
  • The problem with `utf8_encode` is that if the string is already UTF-8 encoded, the function will "double-encode" the string. – Kristoffer la Cour Oct 24 '11 at 19:20

This question suggested the use of iconv, which is an extension enabled by default in PHP.
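
As a minimal sketch of how iconv could be used without double-encoding strings that are already UTF-8: the helper name `to_utf8` and the default source charset `ISO-8859-1` are assumptions made for illustration, not something taken from the question. The validity check relies on PCRE's `/u` modifier, so it does not need the mbstring extension.

```php
<?php
// Hypothetical helper: return $value as UTF-8, converting only when needed.
// $assumedCharset is a guess about the client's data, not a detected value.
function to_utf8($value, $assumedCharset = 'ISO-8859-1')
{
    // With the /u modifier, preg_match() fails on byte sequences that are
    // not valid UTF-8, so a result of 1 means the string is valid UTF-8.
    if (preg_match('//u', $value) === 1) {
        return $value; // already UTF-8, leave it untouched
    }

    $converted = iconv($assumedCharset, 'UTF-8', $value);
    if ($converted === false) {
        throw new RuntimeException('Could not convert value to UTF-8');
    }
    return $converted;
}

// Usage: a Latin-1 byte is converted, a UTF-8 sequence passes through.
$fromLatin1 = to_utf8("\xE6");     // 0xE6 is "ae" ligature in ISO-8859-1
$alreadyUtf8 = to_utf8("\xC3\xA6"); // the same letter, already in UTF-8
```

This is a heuristic rather than real detection: a few Latin-1 byte sequences also happen to be valid UTF-8, and the assumed source charset still has to match what the client's database actually sends.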

mqsoh