
I have a script that reads data from a UTF-8 encoded text file. It reads characters one at a time with fgetc(). Plain ASCII characters work fine, but characters such as š, č, and ž don't come out correctly. The simplified code looks like this:

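$file = fopen($path, 'r'); // fopen() requires a mode argument
$char = fgetc($file);
while ($char !== false) {
    fwrite(STDOUT, $char); // write the current byte before reading the next one
    $char = fgetc($file);
}
fclose($file);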

I tried to use

header('Content-type: text/plain; charset=utf-8');

at the beginning of the script, but it still doesn't work. I also tried utf8_encode($char) and utf8_decode($char), but neither helped. Is there a simple way to read UTF-8 characters and write them to the output?

UPDATE:

The problem is that each of these special characters is stored as two bytes, and fgetc returns one byte per call, so a single call doesn't return the whole character. My current workaround: when fgetc returns a byte with an ordinal value above 127, I call fgetc again and join the two bytes into one string, which I can then fwrite correctly. It may not be the best solution, but I couldn't come up with anything better.
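The workaround above covers two-byte characters only, while UTF-8 characters can be one to four bytes long, with the first byte indicating the length. A minimal sketch of how that could be generalized, assuming the input is valid UTF-8 (continuation bytes are not validated); `readUtf8Char` is a hypothetical helper name, not a built-in PHP function:

```php
<?php
// Hypothetical helper: reads one UTF-8 character (1 to 4 bytes) from an
// open stream. Returns false at end of file, like fgetc().
function readUtf8Char($handle)
{
    $first = fgetc($handle);
    if ($first === false) {
        return false;
    }

    $byte = ord($first);
    if (($byte & 0xE0) === 0xC0) {
        $extra = 1; // 110xxxxx: two-byte sequence (š, č, ž, ...)
    } elseif (($byte & 0xF0) === 0xE0) {
        $extra = 2; // 1110xxxx: three-byte sequence
    } elseif (($byte & 0xF8) === 0xF0) {
        $extra = 3; // 11110xxx: four-byte sequence
    } else {
        $extra = 0; // ASCII, or an unexpected byte that is passed through as-is
    }

    $char = $first;
    for ($i = 0; $i < $extra; $i++) {
        $next = fgetc($handle);
        if ($next === false) {
            break; // file ended in the middle of a sequence
        }
        $char .= $next;
    }
    return $char;
}

// Demo on an in-memory stream holding "a", "š" (C5 A1) and "€" (E2 82 AC).
$stream = fopen('php://memory', 'w+');
fwrite($stream, "a\xC5\xA1\xE2\x82\xAC");
rewind($stream);

$chars = [];
while (($c = readUtf8Char($stream)) !== false) {
    $chars[] = $c;
}
fclose($stream);

echo implode('|', $chars), PHP_EOL; // prints "a|š|€" on a UTF-8 terminal
```

In the original script, the loop would call this helper in place of fgetc($file), so the file is still read one character at a time without loading it whole.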

sykatch
  • What doesn't work correctly? – Alastair McCormack Mar 13 '16 at 19:35
  • The problem is that each of these special characters is stored as two bytes, and `fgetc` returns one byte per call, so a single call doesn't return the whole character. My current workaround: when `fgetc` returns a byte with an ordinal value above 127, I call `fgetc` again and join the two bytes into one string, which I can then `fwrite` correctly. It may not be the best solution, but I couldn't come up with anything better. – sykatch Mar 13 '16 at 20:22
  • Please update your question with that information. I have an idea that may help you. – Alastair McCormack Mar 13 '16 at 21:08
  • Is it possible to read the whole file? If so, the mb_* functions will allow you to get single chars at a time from a string. I'm not sure if PHP has a "read one multibyte char from file" method. – Alastair McCormack Mar 14 '16 at 13:56
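A rough sketch of the whole-string approach suggested in the last comment, assuming the file fits in memory. `utf8Chars` is a hypothetical helper name; it uses `mb_str_split` where available (PHP 7.4+ with the mbstring extension) and otherwise falls back to `preg_split` with the `/u` modifier:

```php
<?php
// Hypothetical helper: splits a UTF-8 string into an array of characters.
function utf8Chars($text)
{
    if (function_exists('mb_str_split')) {
        return mb_str_split($text, 1, 'UTF-8');
    }
    // The /u modifier makes PCRE treat the pattern and subject as UTF-8,
    // so the empty pattern matches between characters, not between bytes.
    return preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
}

// In the real script this would be: $text = file_get_contents($path);
// Here a literal stands in for the file contents: "škoč" as UTF-8 bytes.
$text = "\xC5\xA1ko\xC4\x8D";

$chars = utf8Chars($text);
echo implode(' ', $chars), PHP_EOL; // prints "š k o č" on a UTF-8 terminal
```

This trades memory for simplicity: every character is available at once, but the whole file is loaded first.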

1 Answer


Did you already set $char before the loop?

while( $char !== false)

Otherwise the while loop never starts, which could be the problem, because $char = fgetc($file); would never be called.

Cedrik P
  • I forgot to add it to the example, thanks for noticing. – sykatch Mar 13 '16 at 12:00
  • Okay, in that case why don't you read the whole file into a string with file_get_contents and try the approach given here: http://stackoverflow.com/questions/2236668/file-get-contents-breaks-up-utf-8-characters – Cedrik P Mar 13 '16 at 12:05
  • I was looking for a solution that lets my script keep working the way it does now, reading characters one by one; I didn't want to load the whole file. – sykatch Mar 13 '16 at 12:30