
load file.txt

Error using load
Unknown text on line number 1 of ASCII file only.words.txt "ÐºÐ°Ñ‚ÐµÐ³Ð¾Ñ€Ð¸Ñ ".

How can I load a UTF-8 encoded (Cyrillic) text file into MATLAB and use it with the TMG MATLAB toolbox? I'm aware of a similar answer posted some time ago here, but it doesn't solve my problem: TMG still doesn't work.

Bondrak
  • Does this help? https://stackoverflow.com/questions/6863147/matlab-how-to-display-utf-8-encoded-text-read-from-file – Ander Biguri Nov 09 '17 at 20:04
  • Possible duplicate of [MATLAB: how to display UTF-8-encoded text read from file?](https://stackoverflow.com/questions/6863147/matlab-how-to-display-utf-8-encoded-text-read-from-file) – Aero Engy Nov 09 '17 at 20:20
  • Thank you. Still TMG isn't working. – Bondrak Nov 11 '17 at 01:52

1 Answer

To handle UTF-8 strings properly, read the raw bytes from the text file in binary mode and then decode them explicitly, as follows:

fid = fopen('mytext.txt','rb');
bytes = fread(fid,'*uint8')';
fclose(fid);

txt = native2unicode(bytes,'UTF-8');
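
As an alternative sketch, `fopen` also accepts an encoding name as its fourth argument, so that `fread` with the `'*char'` precision decodes the file while reading. The file name and the sample word below are made up for illustration, and the sample file is written with `unicode2native` only to keep the snippet self-contained:

```matlab
% Hypothetical sample file, created here so the sketch runs on its own.
fname  = fullfile(tempdir, 'utf8_sample.txt');
sample = char([1082 1072 1090 1077 1075 1086 1088 1080 1103]); % Cyrillic word given as code points

% Write the sample as raw UTF-8 bytes.
fid = fopen(fname, 'w');
assert(fid ~= -1, 'Could not open %s for writing', fname);
fwrite(fid, unicode2native(sample, 'UTF-8'), 'uint8');
fclose(fid);

% Read it back, letting fopen/fread decode UTF-8 into MATLAB characters.
fid = fopen(fname, 'r', 'n', 'UTF-8');
assert(fid ~= -1, 'Could not open %s for reading', fname);
txt = fread(fid, [1 Inf], '*char');
fclose(fid);
```

Either way, `txt` ends up as an ordinary MATLAB char row vector holding one element per character rather than one per byte.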

At this point, your string will contain the correct values, but Matlab may still be unable to display it properly. To fix this, either use a Java Swing label with a font that supports Unicode characters:

import('java.awt.*');
import('javax.swing.*');

lbl = JLabel();
lbl.setFont(Font('Arial Unicode MS',Font.PLAIN,30));
lbl.setText(txt);

or use the undocumented function that changes the default character set used by Matlab (which, depending on platform and locale, is often a single-byte encoding such as ISO-8859-1):

feature('DefaultCharacterSet','UTF-8');
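
Because this setting is global and undocumented, one cautious pattern is to remember the previous value and restore it afterwards. The sketch below assumes that calling `feature('DefaultCharacterSet')` without a second argument returns the current value; this is commonly reported behaviour, not something the official documentation guarantees:

```matlab
% Undocumented: query the current default character set (assumed behaviour).
oldCharset = feature('DefaultCharacterSet');

% Switch to UTF-8 for the duration of the text processing.
feature('DefaultCharacterSet', 'UTF-8');

% ... read or display the UTF-8 text here ...

% Restore the previous setting so unrelated code is not affected.
feature('DefaultCharacterSet', oldCharset);
```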
Tommaso Belluzzo
  • Thank you. I guess the problem is a specific package I'm using. I'm using the TMG package in matlab and it's not working with the Cyrillic characters. – Bondrak Nov 14 '17 at 01:47
  • This [workaround](https://www.mathworks.com/matlabcentral/answers/340903-unicode-characters-in-m-file) solves the problem by forcing the required encoding. – ivan866 Oct 04 '22 at 02:13