I have a string containing Urdu characters, such as 'بجلی', stored as a 1x4 char array. I want to save it to a file that will be viewed externally. The string does not display in the main Command Window, but the variable 'str' does hold it. When I save it using fprintf(fid, str) and open the file in Notepad, 'arrows' appear instead of the original characters. I can paste the characters into Notepad manually without any trouble. Where is the problem?
8
- Notepad uses special characters to determine the character encoding of a file. You're probably not writing them. This is a weird Notepad-specific behavior. – Wug Sep 13 '12 at 23:03
- @Wug I just used a hex dump to confirm that this indeed writes only '1A 1A 1A 1A' to the file. MATLAB apparently believes that this is the UTF-8 representation of that string, as given by unicode2native(str, 'UTF-8'). Online Unicode code point lookups seem to disagree. – drhagen Sep 13 '12 at 23:36
3 Answers
9
You need to use fwrite() not fprintf():
fid = fopen('temp.txt', 'w');
str = char([1576, 1580, 1604, 1740, 10]);
encoded_str = unicode2native(str, 'UTF-8');
fwrite(fid, encoded_str, 'uint8');
fclose(fid);
Verified with (the trailing newline is not printed because `.` does not match it):
perl -E "open my $fh, q{<:utf8}, q{temp.txt}; while (<$fh>) {while (m/(.)/g) {say ord $1}}"
1576
1580
1604
1740
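
The same check can be sketched without leaving MATLAB by reading the raw bytes back and decoding them. This is only an illustrative round trip under the assumption that the file was written as in the answer above; the variable names `raw_bytes` and `decoded` are made up for this example.

```matlab
% Write the string as UTF-8 bytes, as in the answer above.
str = char([1576, 1580, 1604, 1740, 10]);
fid = fopen('temp.txt', 'w');
fwrite(fid, unicode2native(str, 'UTF-8'), 'uint8');
fclose(fid);

% Read the raw bytes back (as a row vector) and decode them as UTF-8.
fid = fopen('temp.txt', 'r');
raw_bytes = fread(fid, Inf, 'uint8=>uint8')';
fclose(fid);
decoded = native2unicode(raw_bytes, 'UTF-8');

disp(double(decoded));        % code points recovered from the file
disp(isequal(decoded, str));  % logical 1 if the round trip preserved the text
```

If the decoded code points differ from the originals, the bytes on disk are not valid UTF-8 for this string, which would point at the writing step rather than at Notepad.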

Alex
6
It's not really necessary to avoid fprintf in order to write UTF-8 strings to a file. The idea is to open the file with the correct encoding:
f = fopen('temp.txt', 'w', 'native', 'UTF-8');
s = char([1576, 1580, 1604, 1740]);
fprintf(f, 'This is written as UTF-8: %s.\n', s);
fclose(f);
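
To confirm that the encoding argument took effect, one option is to query the file identifier and then read the line back with the same encoding. The following is a minimal sketch, assuming the four-argument form of fopen shown above; `enc` and `line_read` are names chosen for this example only.

```matlab
% Open with an explicit encoding, as in the answer above.
f = fopen('temp.txt', 'w', 'native', 'UTF-8');
[~, ~, ~, enc] = fopen(f);   % ask MATLAB which encoding is attached to this file ID
s = char([1576, 1580, 1604, 1740]);
fprintf(f, '%s\n', s);
fclose(f);

% Read the line back using the same encoding.
f = fopen('temp.txt', 'r', 'native', 'UTF-8');
line_read = fgetl(f);        % fgetl drops the trailing newline
fclose(f);

disp(enc);                       % encoding name reported by fopen
disp(isequal(line_read, s));     % logical 1 if the text survived the round trip
```

Passing the encoding to fopen keeps the conversion in one place, so later fprintf calls need no manual unicode2native step.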
0
Looking up every character in a character map may seem tedious. The code can instead be written with the characters typed in directly:
fid = fopen('temp.txt', 'w');
str = char(['س','ل','ا','م']);
encoded_str = unicode2native(str, 'UTF-8');
fwrite(fid, encoded_str, 'uint8');
fclose(fid);
This seems easier, but the drawback is that it requires you to have Arabic, Persian, Urdu, or a similar language installed.
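
A possible middle ground, sketched here as an assumption rather than as part of the answer above, is to keep the script ASCII-only and build the string from the hexadecimal code points listed in Unicode charts. The name `hex_points` is hypothetical; the four values are the code points of the characters used above.

```matlab
% Hexadecimal code points as listed in Unicode charts (seen, lam, alef, meem).
hex_points = {'0633', '0644', '0627', '0645'};

% hex2dec returns a column vector, so transpose to get a row string.
str = char(hex2dec(hex_points))';

% Encode and write exactly as before.
fid = fopen('temp.txt', 'w');
fwrite(fid, unicode2native(str, 'UTF-8'), 'uint8');
fclose(fid);

disp(double(str));   % decimal code points, usable directly in char([...])
```

This avoids both typing the characters and converting hexadecimal values by hand.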