5

I have an MS Access database that I query from PHP through PDO and the ODBC driver. The database contains French, Danish and Polish words. French and Danish display fine, but every Polish special character comes out as "?".

Here is the code:

    try {
        $db = new PDO("odbc:DRIVER={Microsoft Access Driver (*.mdb, *.accdb)}; DBQ=$dbName; Uid=Admin;Pwd=;");
    } catch (PDOException $e) {
        echo $e->getMessage();
    }
    $answer = $db->query("SELECT * FROM dict_main WHERE ID < 20");
    while ($data = $answer->fetch()) {
        echo iconv("iso-8859-1", "utf-8", htmlspecialchars($data['DK'])) . ' ';
        echo iconv("iso-8859-2", "utf-8", htmlspecialchars($data['PL'])) . ' ';
        echo iconv("iso-8859-1", "utf-8", htmlspecialchars($data['FR'])) . ' ';
    }

Please let me know if anybody has an idea, as I am running out of them and nothing seems to work. I can also give more information about the problem if I have left something out.

George

2 Answers

3

It looks like htmlspecialchars() does not support ISO-8859-2, so it probably mangles the contents of $data['PL'] before they reach iconv().

Try first converting the input string into UTF-8, then apply htmlspecialchars() to the UTF-8 string:

echo htmlspecialchars( iconv("iso-8859-2", "utf-8", $data['PL']) );
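
The order of operations suggested above (convert to UTF-8 first, escape second) can be sketched as a small helper. This is only an illustration: the function name `toSafeHtml` and the sample bytes are assumptions, not part of the original code, and it only helps if the string really arrives from the driver as ISO-8859-2.

```php
<?php
// Hypothetical helper (name is illustrative): convert to UTF-8 first,
// then escape, telling htmlspecialchars() explicitly that the input is UTF-8.
function toSafeHtml($raw, $sourceEncoding)
{
    $utf8 = iconv($sourceEncoding, 'UTF-8', $raw);
    if ($utf8 === false) {
        // iconv() returns false when the input is invalid for the source encoding.
        return '';
    }
    return htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8');
}

// Assumed sample: the word "święto" as raw ISO-8859-2 bytes (ś = 0xB6, ę = 0xEA).
$raw = "\xB6wi\xEAto";
echo toSafeHtml($raw, 'ISO-8859-2');
```

If the value already contains literal "?" characters when it reaches PHP, this conversion cannot bring the original letters back.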
RandomSeed
  • Thank you for the suggestion. It did not work though: same result, '?' instead of the correct characters. – George Jul 09 '13 at 19:22
  • @George Hum, let's try to narrow down the situation. What is the output of just `echo iconv("iso-8859-2", "utf-8", $data['PL'])`? (notice the typo I just fixed in my above answer: iso-8859-1 => iso-8859-2) – RandomSeed Jul 09 '13 at 19:25
  • Oh and let's make sure your `iconv` supports this encoding too. Please check it from your shell: `iconv -l |grep -i iso-8859-2`. – RandomSeed Jul 09 '13 at 19:29
  • No problem about the typo, I had tried both. The new suggestion gives the same result. I am wondering whether the problem comes from the Access output, but I can't find a way to check or change the collation of the database. – George Jul 09 '13 at 19:36
  • @George I can't tell for sure as I am no expert at all with Access. But it looks very much like there is something weird in your source database. Have you been able to confirm that the characters show correctly in Access? Also, please add " `charset=UTF-8;` " at the end of your DSN, just to make sure something did not get lost in transit. – RandomSeed Jul 09 '13 at 19:43
  • Yes, in Access it is displayed correctly. I did add charset=utf-8, but it doesn't seem to make a difference. – George Jul 09 '13 at 19:52
  • @George Maybe your source encoding is not actually the one you expect. What is the output of `echo mb_detect_encoding($data['PL'])`? E.g. if you added `charset=UTF-8;` to your DSN, I would expect the string to be UTF-8. Also, under Windows, I wouldn't be surprised if the character set was "Windows-1252", so maybe try `echo iconv("windows-1252", "utf-8", $data['PL'])`. – RandomSeed Jul 11 '13 at 20:31
  • The output of what you asked is ASCII, does that make sense? The second suggestion gives the same result as usual. In any case, thanks for continuing to help me, I really appreciate it. – George Jul 12 '13 at 13:54
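
An "ASCII" result from `mb_detect_encoding` means the string holds only 7-bit bytes, so the "?" characters would be literal question marks (byte 0x3F) rather than mis-decoded Polish letters. A minimal sketch for confirming that by looking at the raw bytes follows; the helper name `describeBytes` and the sample value are assumptions made for illustration.

```php
<?php
// Hypothetical diagnostic helper (name and return shape are illustrative).
// It shows the raw bytes of a value as received from the driver.
function describeBytes($value)
{
    return array(
        // Hex dump of every byte in the string.
        'hex' => bin2hex($value),
        // True when no byte has the high bit set, i.e. the string is pure ASCII.
        'ascii_only' => preg_match('/[\x80-\xFF]/', $value) === 0,
        // Number of literal "?" characters (byte 0x3F).
        'question_marks' => substr_count($value, '?'),
    );
}

// Assumed sample: what "święto" might look like after a lossy conversion.
print_r(describeBytes('?wi?to'));
```

If `ascii_only` is true and the hex dump shows `3f` where the Polish letters should be, the characters were already replaced before PHP received the data, and no `iconv` call on the PHP side can restore them.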
2

You are using PHP 5.3.13, so I would expect the charset in new PDO to do its job. (Prior to 5.3.6 you would have to use $db->exec("set names utf8");). So add charset=utf8; to your connect line. I also expect your Access database to be UTF-8.

You can also try charset=ucs2; with and without htmlspecialchars( iconv("iso-8859-2", "utf-8", $data['PL']) );

$db = new PDO("odbc:DRIVER={Microsoft Access Driver (*.mdb, *.accdb)}; DBQ=$dbName; Uid=Admin;Pwd=;charset=utf8;");

or

$db = new PDO("odbc:DRIVER={Microsoft Access Driver (*.mdb, *.accdb)}; DBQ=$dbName; Uid=Admin;Pwd=;charset=ucs2;");

By the way, don't forget to set your output to UTF-8 at the top of your document.

<?php header('Content-Type:text/html; charset=UTF-8'); ?>

and/or

<meta http-equiv='Content-Type' content='text/html; charset=utf-8'>

If that still doesn't work, I suspect that the encoding in your Access database is messed up.


Edit:

The only thing I can think of at this point is using odbc_connect directly and bypassing PDO, but I think the problem is in ODBC (Access->ODBC). If that's the case, this won't help:

$conn=odbc_connect("DRIVER={Microsoft Access Driver (*.mdb, *.accdb)}; DBQ=$dbName; Uid=Admin;Pwd=;charset=utf8", "", "");
$rs=odbc_exec($conn, "SELECT * FROM dict_main WHERE ID < 20");
odbc_result_all($rs,"border=1");
Rik
  • Thank you for all this help. I was doing both $db->exec("set names utf8"); and charset=utf8;, and also each without the other, and did not forget the charset in the meta. However, the problem is the same: the characters are not displayed properly, not even the French or Danish ones without the conversion. I guess the problem comes from Access then, but I can't find anywhere on the Internet or in the documentation why and how... – George Jul 15 '13 at 10:38
  • Maybe a long shot, but you could also try `charset=ucs2;` with and without the iconv. (changed in my answer) – Rik Jul 15 '13 at 11:40
  • I also added some example code to check the output of your php/browser. And I needed the header line. The meta line was not sufficient to switch to utf-8. At least in my setup. – Rik Jul 15 '13 at 12:28
  • Thanks. Your test code displays correctly for me. I had no problem displaying Polish characters in general; only those coming from the database are a problem. I tried charset=ucs2, and unfortunately no new results there. So we haven't found a solution, only that the problem definitely comes from Access, I guess. I can confirm that when I use phpMyAdmin I have no problem with Polish characters either. – George Jul 15 '13 at 13:53
  • Then I suspect it's an ODBC problem. You could try a direct odbc_connect to rule out PDO, but I think it's Access->ODBC. – Rik Jul 15 '13 at 17:44
  • I don't understand what you are suggesting I should do? – George Jul 15 '13 at 18:18
  • You could (as a last test) try the last code in my answer, the one with `$conn=odbc_connect`. Then PHP makes direct contact with the ODBC drivers (without the layer of PDO in between). But if that does not work, I'm lost too. I tried with a wamp install and a small demo.accdb and also could not get ODBC to receive utf-8 characters. (And with PDO my httpd kept crashing.) – Rik Jul 15 '13 at 18:34
  • Oh sorry, I didn't see the edit, that's why I wasn't sure. Indeed the code didn't show any improvement: the Polish characters are all question marks. Thanks a lot for all the effort. If I ever find a solution I'll update the post. In the meantime I use a simple workaround: I write the Polish characters differently in the database, such as SS for ś, and I replace this for the output. It'll have to do for now. – George Jul 15 '13 at 18:43
  • FYI: I tried inserting Polish characters into Access with insert / odbc_exec and got `ĄĆĘÅŃŚŹŻąć` in MS Access. Reading THAT back in ODBC/PHP I do get the correct characters. I searched for a method to force the "Microsoft Access Driver" into a specific charset but could not find one. All `SET NAMES` solutions were for MSSQL. If I come across a solution I'll update it here too. – Rik Jul 16 '13 at 08:40
  • Okay that's interesting. Yes I had tried SET NAMES too but indeed it was not relevant here. Thanks again, and hope one day we'll figure it out! – George Jul 16 '13 at 11:29
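
The workaround George mentions in the comments (storing an ASCII placeholder such as SS for ś and substituting it on output) might look roughly like the sketch below. Only the SS → ś pair comes from the thread; the function name `restorePolish` and the sample word are assumptions for illustration.

```php
<?php
// Hypothetical helper (name is illustrative): replace ASCII placeholders
// stored in the database with the real UTF-8 characters on output.
function restorePolish($stored, array $map)
{
    // strtr() with an array replaces every occurrence of each key.
    return strtr($stored, $map);
}

// Only the SS => ś pair is taken from the thread; other pairs would have
// to be chosen by whoever maintains the database.
$map = array(
    'SS' => "\xC5\x9B", // ś as UTF-8 bytes
);

// Assumed sample value as it might be stored in Access.
echo restorePolish('SSwiat', $map);
```

One caveat with this approach: any word that genuinely contains the placeholder sequence is rewritten as well, so the placeholders need to be sequences that never occur in the real data.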