The Swedish alphabet contains the letters

åäö

I am trying to read a CSV file with the PHP function fgetcsv, but I get encoding problems: the special characters are not interpreted correctly.

I open the file with fopen($filePath, "r") and, as far as I am aware, I do not specify an encoding anywhere in PHP. Everything else in my application works fine with regard to encoding.

When I open the target CSV file in OpenOffice I can select an encoding. If I select Unicode (UTF-8), the special characters cannot be displayed. If I select one of the ISO-8859 encodings, the letters are displayed correctly.

I have been playing around with utf8_decode, utf8_encode, mb_convert_encoding, iconv and setlocale with no luck.

I know what an encoding is, but I do not understand this case. A solution with a good explanation of what is going on here would be appreciated.

I guess my file is ISO-8859-* encoded.
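
One way to test that guess is to check whether the raw bytes are well-formed UTF-8. The sketch below is only an illustration: the function name looksLikeUtf8 is hypothetical, and it relies on preg_match returning false when the /u modifier is given a subject that is not valid UTF-8. A false result does not prove the file is ISO-8859-1, only that it is not UTF-8, and a pure ASCII file passes the check as well.

```php
<?php
// Hypothetical helper: returns true if $bytes is well-formed UTF-8.
// With the /u modifier, preg_match() returns false for an invalid
// UTF-8 subject; the empty pattern matches any valid string.
function looksLikeUtf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}
```

Called as looksLikeUtf8(file_get_contents($filePath)), this should return false for a file in which å, ä and ö are stored as single ISO-8859-1 bytes.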

How can I parse the file correctly so I can make use of its content in PHP?

user264230

2 Answers

Try this
    Å

    Å

    å

    å

    Ä

    Ä

    ä

    ä

    Ö

    Ö

    ö

    ö
karan

You could encode your file, for example using htmlentities.

For example, with this little piece of code I converted the Swedish file, read as ISO-8859-1, to HTML entities:

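$file = fopen("translations-sv.csv", "r");
$new_file = fopen("file_encoded.csv", "w");

// fgets() returns false at end of file, so test its result instead of feof()
while (($line = fgets($file)) !== false) {
    $line = str_replace(";", ",", $line);  // replace every ';' delimiter with ','
    // treat the input as ISO-8859-1 and turn å, ä, ö etc. into HTML entities
    $encoded_line = htmlentities($line, ENT_QUOTES, 'ISO-8859-1');
    fwrite($new_file, $encoded_line);
}

fclose($file);
fclose($new_file);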

Swedish.csv

title_orders;Beställningar
title_monthly_sales;Månadsförsäljning
title_settings;Inställningar

file_encoded.csv

title_orders,Best&auml;llningar
title_monthly_sales,M&aring;nadsf&ouml;rs&auml;ljning
title_settings,Inst&auml;llningar

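And, to compare:

$new_file = fopen("file_encoded.csv", "r");

// the file holds HTML entities, so the search word must be entity-encoded too
$word_to_find = "Orderslutf&ouml;rande";
// fgetcsv() returns false at end of file
while (($line_of_text = fgetcsv($new_file, 1024, ",")) !== false) {
    if (isset($line_of_text[1]) && $word_to_find == $line_of_text[1]) {
        echo $line_of_text[1] . " is the same as $word_to_find<br>";
    }
}
fclose($new_file);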
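
An alternative that avoids HTML entities altogether can be sketched under the assumptions stated in the question: the file is ISO-8859-1, the delimiter is ";", and the rest of the application works in UTF-8. The idea is to convert the bytes to UTF-8 first and only then let fgetcsv parse them, so the parsed fields can be compared directly with UTF-8 strings such as "Beställningar". The function name readLatin1Csv is hypothetical, and the sketch assumes the iconv extension is available.

```php
<?php
// Hypothetical helper: reads a CSV file assumed to be ISO-8859-1
// and returns its rows as arrays of UTF-8 strings.
function readLatin1Csv(string $path, string $delimiter = ';'): array
{
    $raw = file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Cannot read $path");
    }

    // Assumption: the source really is ISO-8859-1.
    $utf8 = iconv('ISO-8859-1', 'UTF-8', $raw);
    if ($utf8 === false) {
        throw new RuntimeException("Cannot convert $path to UTF-8");
    }

    // Put the converted text into an in-memory stream so fgetcsv() can read it.
    $stream = fopen('php://temp', 'r+');
    fwrite($stream, $utf8);
    rewind($stream);

    $rows = [];
    // fgetcsv() returns false at end of input.
    while (($fields = fgetcsv($stream, 0, $delimiter, '"', '\\')) !== false) {
        if ($fields === [null]) {
            continue; // skip blank lines
        }
        $rows[] = $fields;
    }
    fclose($stream);

    return $rows;
}
```

With the sample file above, readLatin1Csv("translations-sv.csv") would return one array per line, each holding the key and the translated text. The whole file is loaded into memory, which is reasonable for a translation file but not for very large inputs.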
kraysak
  • How can I use the resulting file_encoded.csv in PHP to parse the file and compare against strings like artikelbenämning? This becomes Artikelben&iuml &iquest &frac12 mning and is hence not equal to artikelbenämning. I also get problems because of the ";". Thanks for your help! – user264230 Jul 21 '14 at 17:08
  • Maybe it has something to do with this? http://stackoverflow.com/questions/3637770/why-fgetcsv-drops-some-characters-with-diacritics – user264230 Jul 21 '14 at 17:12
  • What is the delimiter in your original CSV file? If it is ";" there would be a problem because of htmlentities; I didn't see that... – kraysak Jul 21 '14 at 17:36
  • Yes, it is ; :(. I cannot believe that this is such a problem. Some people say that you should just put env(Lang... or use setlocale, but that does not work either – user264230 Jul 21 '14 at 18:20
  • It gets encoded like this: Artikelben�mning, but that is not useful since it is not == Artikelbenämning. I want to obtain the strings correctly for this to be useful. It may be printed as Artikelbenämning in HTML, but I do not know – user264230 Jul 21 '14 at 19:41
  • I solved this by modifying the CSV file itself, thanks for the help anyway :) I now have other encoding problems... – user264230 Jul 22 '14 at 09:14