I have a string containing Urdu characters, such as 'بجلی', stored as a 1x4 char array. I want to save it to a file that will be viewed externally. The string does not display correctly in the Command Window, but the variable 'str' does hold it. When I save it using fprintf(fid, str) and open the file in Notepad, 'arrows' appear instead of the original characters. I can paste the characters into Notepad manually without any trouble. Where is the problem?

Alex
bilal.haider
  • Notepad uses special characters to determine the character encoding of a file. You're probably not writing them. This is a weird Notepad-specific behavior. – Wug Sep 13 '12 at 23:03
  • @Wug I just used a hex dump to confirm that this indeed writes only '1A 1A 1A 1A' to the file. Matlab apparently believes that this is the UTF-8 unicode representation of that string, as given by unicode2native(str, 'UTF-8'). Online unicode codepoint lookups seem to disagree. – drhagen Sep 13 '12 at 23:36

3 Answers

You need to use fwrite(), not fprintf(), and convert the string to UTF-8 bytes first:

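fid = fopen('temp.txt', 'w');

% 'بجلی' as Unicode code points, followed by a newline (10)
str = char([1576, 1580, 1604, 1740, 10]);

% Convert to UTF-8 bytes and write them to the file unmodified
encoded_str = unicode2native(str, 'UTF-8');
fwrite(fid, encoded_str, 'uint8');

fclose(fid);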

Verified with:

perl -E "open my $fh, q{<:utf8}, q{temp.txt}; while (<$fh>) {while (m/(.)/g) {say ord $1}}"
1576
1580
1604
1740
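
The same check can also be done without leaving MATLAB. The following is only a sketch of one way to do it, assuming temp.txt in the current folder is writable: it writes the four characters with fwrite, reads the raw bytes back, and decodes them with native2unicode. The variable names bytes and decoded are illustrative and not part of the original answer.

```matlab
% Write the string as UTF-8 bytes, then read the raw bytes back and decode them.
str = char([1576, 1580, 1604, 1740]);          % 'بجلی' as Unicode code points

fid = fopen('temp.txt', 'w');
fwrite(fid, unicode2native(str, 'UTF-8'), 'uint8');
fclose(fid);

fid = fopen('temp.txt', 'r');
bytes = fread(fid, Inf, 'uint8=>uint8')';      % raw file contents as a row vector
fclose(fid);

decoded = native2unicode(bytes, 'UTF-8');      % back to a char array

disp(dec2hex(bytes));                          % one hex byte per row
disp(double(decoded));                         % code points recovered from the file
disp(isequal(decoded, str));                   % true if the round trip is lossless
```

Each of these four code points lies between U+0080 and U+07FF, so UTF-8 encodes each one as two bytes and the file should contain eight bytes rather than the four 1A bytes reported in the comments.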
Alex

It is not strictly necessary to avoid fprintf in order to write UTF-8 strings to a file. The key is to open the file with the right encoding:

f = fopen('temp.txt', 'w', 'native', 'UTF-8');
s = char([1576, 1580, 1604, 1740]);
fprintf(f, 'This is written as UTF-8: %s.\n', s);
fclose(f);
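
To see why the encoding argument matters, it can help to ask MATLAB which encoding it associated with the file. The sketch below relies on the four-output form of fopen, which reports the encoding of an already open file identifier. The default encoding depends on the platform and MATLAB release, so the first value printed is not predictable here, and the variable names defaultEnc and explicitEnc are illustrative.

```matlab
% Compare the encoding MATLAB uses by default with an explicitly requested one.
fid = fopen('temp.txt', 'w');
[~, ~, ~, defaultEnc] = fopen(fid);            % encoding chosen by MATLAB
fclose(fid);

fid = fopen('temp.txt', 'w', 'native', 'UTF-8');
[~, ~, ~, explicitEnc] = fopen(fid);           % encoding requested above
fprintf(fid, '%s', char([1576, 1580, 1604, 1740]));
fclose(fid);

fprintf('default: %s, explicit: %s\n', defaultEnc, explicitEnc);
```

If the default turns out to be a single-byte code page that has no Urdu characters, that would be consistent with the 1A bytes seen in the question, since 0x1A is the ASCII substitute character.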

Looking up the code point of every character in a character map can be tedious. The code can instead be written with the characters typed in directly:

fid = fopen('temp.txt', 'w');
str = char(['س','ل','ا','م']);
encoded_str = unicode2native(str, 'UTF-8');
fwrite(fid, encoded_str, 'uint8');
fclose(fid);

This is easier, but the drawback is that it requires Arabic, Persian, Urdu, or similar language support to be installed.