Read a string from text file returning strange characters in MATLAB

Question

I'm reading a string like "1.0.2" from a text file with this code:

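reader = fopen('Address\My_Text.txt');   % open for reading with the default encoding
Out = textscan(reader, '%s');            % read whitespace-separated strings
Out1 = Out{1};                           % cell array holding the strings that were read
Out2 = Out1{1};                          % first string in the file
fclose(reader);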

The result (Out2) is a string like this: ï»¿1.0.2. The text file was copied by MATLAB from another location on the hard disk. It is read once with the code above, compared with an existing text file, and then replaces that file using movefile (the main file works correctly). When I create a text file manually and type "1.0.2" into it, the same code reads the value correctly. What is the problem, and what is the solution in MATLAB?

Thanks.

See http://stackoverflow.com/questions/4931835/create-csv-file-from-c-extra-character-in-excel — Jongware, Aug 02 '14 at 21:53
@Jongware. Thank you for comment. these solutions are for C# and VB.Net. I need a solution for MATLAB. — Eghbal, Aug 02 '14 at 21:55
Ah, but you asked what the problem was. It's the BOM, written by your text creator. So there are several solutions (including "don't write it"). — Jongware, Aug 02 '14 at 21:56
That is a clue for finding a solution for MATLAB. Thank you. I edited my answer. What is (including "don't write it")? Can you describe it with more details? — Eghbal, Aug 02 '14 at 21:59
I meant "don't write the BOM into your text to begin with" if you have a choice there. You probably can read the first 3 bytes into MatLab, check if they form a BOM and then ignore them if so, or rewind the file and read "as usual" if not. But note that the BOM is a *strong* indication the text file may contain UTF8-encoded characters. Blithely ignoring it may not be wise, then. Possibly, as suggested in my link, you could tell MatLab to expect UTF8 text. — Jongware, Aug 03 '14 at 00:10
On the latter: [this question may help](http://stackoverflow.com/questions/6863147/matlab-how-to-display-utf-8-encoded-text-read-from-file). — Jongware, Aug 03 '14 at 00:13

Yvon · Accepted Answer · 2014-08-03T02:30:47.517

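You can use fopen('My_Text.txt', 'r', 'n', 'UTF-8') to open the file with UTF-8 encoding. The three added arguments are the permission ('r', read), the machine format ('n', native byte ordering), and the character encoding; see the fopen documentation for details.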
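As a rough sketch of this first approach, the snippet below writes a small sample file that starts with a UTF-8 BOM and then reads it back through fopen with the 'UTF-8' encoding argument. The temporary file name and the extra check for char(65279) are assumptions added for illustration: depending on the MATLAB release, the BOM may still come back as the single character U+FEFF, so the sketch drops it if it is present.

```matlab
% Create a sample file that begins with the UTF-8 BOM (bytes EF BB BF).
% The temporary file stands in for My_Text.txt from the question.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fwrite(fid, [239 187 191 double('1.0.2')], 'uint8');
fclose(fid);

% Open for reading, native byte order, decoding the bytes as UTF-8.
reader = fopen(fname, 'r', 'n', 'UTF-8');
Out = textscan(reader, '%s');
fclose(reader);
delete(fname);

Out2 = Out{1}{1};
% Assumption: the BOM may survive as the single character U+FEFF.
% Remove it when it is the first character of the string.
if ~isempty(Out2) && Out2(1) == char(65279)
    Out2(1) = [];
end
disp(Out2)
```

Unlike skipping a fixed number of bytes, this keeps any other UTF-8 encoded characters in the file readable.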

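Alternatively, inserting fseek(reader, 3, 'bof') before textscan skips the first three bytes of the file, which also avoids the problem, provided the file really does start with a BOM. ï»¿ is how the UTF-8 BOM (bytes EF BB BF) looks when it is decoded with a single-byte encoding such as Windows-1252.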
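A slightly more defensive variant of the byte-skipping idea, along the lines suggested in the comments, is to read the first three bytes, compare them with the BOM, and rewind if they do not match. This is only a sketch: the temporary sample file is an assumption made so the snippet runs on its own, and it treats the rest of the file as plain single-byte text.

```matlab
% Create a sample file that begins with the UTF-8 BOM (bytes EF BB BF).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fwrite(fid, [239 187 191 double('1.0.2')], 'uint8');
fclose(fid);

reader = fopen(fname, 'r');
head = fread(reader, 3, 'uint8')';      % first three bytes as a row vector
if ~isequal(head, [239 187 191])
    frewind(reader);                    % no BOM: go back to the start of the file
end
% If the BOM was found, the file position is already just past it.
Out = textscan(reader, '%s');
fclose(reader);
delete(fname);

Out2 = Out{1}{1};
disp(Out2)
```

Because the check falls back to frewind, the same code also works for files that were created without a BOM, such as the manually typed file mentioned in the question.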

edited Aug 03 '14 at 02:30

answered Aug 03 '14 at 00:41

Thank you for precise answer. – Eghbal Aug 03 '14 at 07:17
