I have a MySQL table with a character stored as "û" (u with a circumflex). I am exporting this via PHP to CSV. I have tried everything but just can't get "û" into the export file. Instead I see things like "?" or "È" and "û".

Question: how can I properly export the data from MySQL to a CSV file using PHP, so that the "û" in the DB also shows up as "û" in the exported file?

Our PHP drill:

mb_internal_encoding("UTF-8");
$list = $readConnection->fetchAll($query);
$fp = fopen($file, 'w');
setlocale(LC_MONETARY, 'nl_NL');

# Loop over lines here
$line = utf8_encode($db_row);
fputcsv($fp, $line);

I have also tried adding a BOM, but that did not work either:

fprintf($fp, chr(0xEF).chr(0xBB).chr(0xBF)); // set UTF-8 header

snh_nl
  • Just a minor comment: the trema is the two-dot diacritical mark, and u with trema is `ü` instead of `û`, which is the u with a circumflex. – dgstranz May 24 '18 at 09:48
  • @Phylogenesis & deceze : I think the duplicate tag is set just a little too fast and seems to be done only based on title. For example this question is not even about Excel! – snh_nl May 24 '18 at 09:50
  • Correct & adjusted. circumflex – snh_nl May 24 '18 at 09:51
  • And this answer is maybe less "all the way through" https://stackoverflow.com/questions/279170/utf-8-all-the-way-through -- set names is no answer here .... and maybe the reason why I have just spent 1 hour reading blogs and answers about this ... that did not give the simple answer – snh_nl May 24 '18 at 09:53
  • I'm not sure how it's different at all. The fix was to ensure the database connection is set to UTF-8, then `fputcsv` will output the bytes as received from the database, which is UTF-8. Anything more particular that needed addressing? – deceze May 24 '18 at 09:58
  • Well ..... the title is "UTF-8 all the way through" versus "PHP export û (u with circumflex) from MySQL utf8_general_ci" which is a lot more specific .... it covers 1 specific case. And not "all the way through" - you also know it is different because the answers are not the same – snh_nl May 24 '18 at 10:01
  • There are many questions which are smaller in scope, but they all have the same underlying root cause: in 99% of all cases (anecdotally speaking), the issue is a missing database connection charset. We don't need to address that again and again for every possible use case permutation. – deceze May 24 '18 at 10:05
  • `?` and Mojibake (such as `û`) are symptoms of two different causes. See https://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored for discussion of both. – Rick James May 24 '18 at 16:28
  • @deceze - Note that "all the way thru" says what you should do, while my SO Q&A addresses 5 _different_ symptoms with different causes. – Rick James May 24 '18 at 16:30
  • @deceze - 99% is a bit excessive. There are 4-6 _different_ things that need to be done. Users sometimes have more than one thing wrong. – Rick James May 24 '18 at 16:35
  • thx. Sorry but the articles *did not* help me. And the Excel reference is not even the same topic. This answer did. – snh_nl May 25 '18 at 10:40
  • Why is this question tagged with [mysql]? – Rick James May 30 '18 at 00:03

1 Answer

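The answer is to execute SET NAMES utf8 on the DB connection before you query the data. All other settings are fine. The utf8_encode call is then no longer needed: the rows already arrive as UTF-8, and running utf8_encode on them would encode them a second time.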

The drill then becomes

mb_internal_encoding("UTF-8");
// Set the connection charset BEFORE fetching any rows
$readConnection->query("SET NAMES utf8");
$list = $readConnection->fetchAll($query);

$fp = fopen($file, 'w');
setlocale(LC_MONETARY, 'nl_NL');

# Loop over lines here
$line = $db_row;
fputcsv($fp, $line);
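
To see why utf8_encode has to go once the connection delivers UTF-8, here is a minimal sketch that needs no database. It is only an illustration: rows_to_csv is a made-up helper name, the row is hard-coded instead of coming from MySQL, and mb_convert_encoding from ISO-8859-1 is used as a stand-in for utf8_encode (which is deprecated in newer PHP versions), assuming the mbstring extension is loaded.

```php
<?php
// Hypothetical helper (not part of the original code): writes rows to an
// in-memory CSV and returns the raw bytes so they can be inspected.
function rows_to_csv(array $rows): string
{
    $fp = fopen('php://memory', 'w+');
    foreach ($rows as $row) {
        fputcsv($fp, $row, ',', '"', '\\');
    }
    rewind($fp);
    $csv = stream_get_contents($fp);
    fclose($fp);
    return $csv;
}

// "û" as it arrives over a UTF-8 connection: the two bytes C3 BB.
$utf8 = "\u{00FB}";

// Treating those bytes as Latin-1 and converting them to UTF-8 again
// (this is what utf8_encode() does) encodes each byte separately.
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex(rows_to_csv([[$utf8]])), "\n";   // bytes written when the value is left alone
echo bin2hex(rows_to_csv([[$double]])), "\n"; // bytes written after the extra encoding step
```

The first line of output contains the two bytes of "û" followed by a newline. The second line contains four bytes for the same character, which is the kind of garbage a text editor then shows instead of "û".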

Be aware: check the result with a TEXT editor. Excel assumes a different encoding when it opens the file and will still show the characters as "È—", for example. When opening the file with a text editor we see the expected "û" character.
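
If the file also has to open cleanly in Excel, a commonly suggested approach is to write a UTF-8 byte order mark as the very first bytes of the file, once the data itself is correct UTF-8. The sketch below shows only that step. csv_with_bom and the sample rows are made up for illustration, and whether Excel actually honours the BOM depends on the Excel version, so treat that part as an assumption to verify.

```php
<?php
// Hypothetical helper (not part of the original code): builds a CSV in
// memory with a UTF-8 BOM in front and returns the raw bytes.
function csv_with_bom(array $rows): string
{
    $fp = fopen('php://memory', 'w+');
    fwrite($fp, "\xEF\xBB\xBF"); // UTF-8 byte order mark, written once at the start
    foreach ($rows as $row) {
        fputcsv($fp, $row, ',', '"', '\\');
    }
    rewind($fp);
    $csv = stream_get_contents($fp);
    fclose($fp);
    return $csv;
}

// Sample rows standing in for what fetchAll() would return over a UTF-8 connection.
$csv = csv_with_bom([['id', 'name'], [1, "co\u{00FB}t"]]);

echo bin2hex(substr($csv, 0, 3)), "\n"; // the first three bytes of the file
```

For a real export, the same fwrite call would go right after fopen($file, 'w') and before the loop over the rows.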

snh_nl