How to use non-ASCII characters in Matlab figures (for use in LaTeX doc)?

Question

I am including Matlab-drawn figures in LaTeX. My usual workflow is as follows:

Script in matlab creates figure(s),
I tweak what I find needs to be tweaked in visual figure editor,
Figure is saved as .fig (for future modification) and .eps (for including in LaTeX),
I convert .eps files to .pdf,
PDF files are referenced in LaTeX source code.

To the point: when I use non-ASCII characters in axis labels, legends, titles, etc. (to be exact, Polish national characters such as 'ą', 'ę', 'ś', 'ć'), the encoding in the Matlab figure editor is fine and the characters display properly. After exporting to .eps, they are all wrong (example: "Głębokość" turns into "G³êbokoœæ").

Does there exist a way to do this properly, either by tuning Matlab options or changing my workflow?

Note: I found that export to .png or other non-vector formats handles character encoding properly, but I would like to avoid having to do that -- I'm asking for a way to "keep it vector". Exporting directly to .pdf has the same effect as .eps, i.e. it produces wrong results.

PS. Matlab is R2008a, LaTeX files are compiled with pdflatex, and .eps files are converted with epstopdf from MiKTeX 2.9 (all under Win7).

groovingandi · Accepted Answer · 2011-02-10T14:25:01.143

8

You could have a look at psfrag, that's what I usually use when I try to use Matlab figures in LaTeX. You basically put just tags into the figure in Matlab and replace those tags with LaTeX text afterwards. The biggest benefit is that this allows you to have identical symbols in text and figures.

Edit: when looking for the psfrag-URL, I found a Matlab script to simplify this: LaPrint.

edited Feb 10 '11 at 14:25

answered Feb 10 '11 at 14:18

groovingandi


I somehow omitted the info that I use pdflatex. Do you have any ideas for a not-so-ugly workaround? – triazotan Feb 10 '11 at 16:06
Matlab can interpret latex directly in the text fields. Have you tried putting latex construct like "\'c" in the matlab text field and setting the interpreter to "latex" instead of entering "ć" directly? – groovingandi Feb 10 '11 at 18:05
Yes, I have. The example you gave works, but it is a solution for only three Polish diacritics: the acute accent (\'), the dot above (\.) and L with stroke (\L). The ogonek (i.e. the diacritic found in 'ą' or 'ę') is denoted by \k{} and is not interpreted properly by Matlab. One workaround is to use the cedilla (\c{}), but it is not exactly the same and looks kinda strange. – triazotan Feb 10 '11 at 18:54
I thought the Windows version of Matlab would be better in terms of LaTeX interpretation, I know there are some issues on the Linux version. If this doesn't work, psfrag is the only solution I see. This doesn't mean you can't use pdflatex for your complete document anymore, but of course it's cumbersome to create extra documents for your figures, compile them with latex and insert the output in your main documents. – groovingandi Feb 11 '11 at 09:32
I used the cedilla workaround this time, because the document I'm making now is urgent and too big to devise a general solution for (at least) semi-automating figure compilation. Nevertheless, psfrag is a great tip for future workflows, thanks! – triazotan Feb 11 '11 at 15:30

score 4 · Answer 2 · answered Feb 10 '11 at 16:30

Another possible solution would be to use matlab2tikz. It creates a tikz/pgfplot source file that may be included directly by your latex source. This means that it uses LaTeX's facilities for font rendering. You may directly edit the generated file to tweak the labels and such. Unfortunately, it doesn't work for all MATLAB figures.

score 1 · Answer 3 · edited May 23 '17 at 12:18

For exporting a Matlab figure with non-ASCII ISO-8859-1 characters, there is no problem on Windows, but on Linux with a UTF-8 locale there is a Matlab bug and a workaround. The question here targets characters that are not in ISO-8859-1, which is more tricky. Here is a solution that I posted on a related question.
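
The garbling reported in the question ("Głębokość" becoming "G³êbokoœæ") is what one would see if the label were stored as Windows-1250 bytes (the usual Polish Windows code page) and then drawn with a Western encoding. Whether Matlab does exactly this internally is an assumption; the Python sketch below only reproduces the symptom, and the function name `garble` is made up for illustration.

```python
# Sketch: reproduce the garbling from the question.
# Assumption: the label is stored as Windows-1250 bytes and drawn with a
# Western (Windows-1252) encoding; this is not confirmed Matlab behaviour.
def garble(text, stored="cp1250", shown="cp1252"):
    """Encode text in one 8-bit code page and read the bytes back in another."""
    return text.encode(stored).decode(shown)


if __name__ == "__main__":
    print(garble("Głębokość"))  # G³êbokoœæ
```

Plain ASCII text passes through unchanged, which matches the observation that only the national characters break.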

If the number of characters needed is less than 256 (8-bit format) and ideally in a standard encoding set, then one solution is to:

Convert the octal code into the Unicode character;
Save the file in the target encoding standard (in an 8-bit format);
Add the encoding vector for the target encoding set.

For example, if you want to export Polish text, you need to convert the file into ISO-8859-2. Here is an implementation with Python (multi-platform):

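#!/usr/bin/python
# -*- coding: utf-8 -*-
# Python 2 only: the 'string_escape' codec does not exist in Python 3.
import sys, codecs
src = sys.argv[1]
# Output is written as ISO-8859-2 (Latin-2) next to the input file.
fo = codecs.open(src[:-4]+'_latin2.eps','w','latin2')
# 'string_escape' turns octal escapes such as \305\202 into raw bytes.
with codecs.open(src,'r','string_escape') as fi:
    data = fi.readlines()
with open('ISOLatin2Encoding.ps') as fenc:
    for line in data:
        # The raw bytes are read as UTF-8; also switch to the custom encoding vector.
        fo.write(line.decode('utf-8').replace('ISOLatin1Encoding','MyEncoding'))
        if line.startswith('%%EndPageSetup'):
            # Insert the Latin-2 encoding vector right after the page setup.
            fo.write(fenc.read())
fo.close()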

saved as eps_lat2.py; then running the command python eps_lat2.py file.eps, where file.eps is the eps created by Matlab, creates file_latin2.eps with Latin-2 encoding. The file ISOLatin2Encoding.ps contains the encoding vector:

/MyEncoding
% The first 144 entries are the same as the ISO Latin-1 encoding.
ISOLatin1Encoding 0 144 getinterval aload pop
% \22x
    /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
    /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
% \24x
    /nbspace /Aogonek /breve /Lslash /currency /Lcaron /Sacute /section
    /dieresis /Scaron /Scedilla /Tcaron /Zacute /hyphen /Zcaron /Zdotaccent
    /degree /aogonek /ogonek /lslash /acute /lcaron /sacute /caron
    /cedilla /scaron /scedilla /tcaron /zacute /hungarumlaut /zcaron /zdotaccent
% \30x
    /Racute /Aacute /Acircumflex /Abreve /Adieresis /Lacute /Cacute /Ccedilla
    /Ccaron /Eacute /Eogonek /Edieresis /Ecaron /Iacute /Icircumflex /Dcaron
    /Dcroat /Nacute /Ncaron /Oacute /Ocircumflex /Ohungarumlaut /Odieresis /multiply
    /Rcaron /Uring /Uacute /Uhungarumlaut /Udieresis /Yacute /Tcedilla /germandbls
% \34x
    /racute /aacute /acircumflex /abreve /adieresis /lacute /cacute /ccedilla
    /ccaron /eacute /eogonek /edieresis /ecaron /iacute /icircumflex /dcaron
    /dcroat /nacute /ncaron /oacute /ocircumflex /ohungarumlaut /odieresis /divide
    /rcaron /uring /uacute /uhungarumlaut /udieresis /yacute /tcedilla /dotaccent
256 packedarray def

Here is another implementation on Linux with Bash:

#!/bin/bash
name=$(basename "$1" .eps)
ascii2uni -a K "$1" > /tmp/eps_uni.eps
iconv -t ISO-8859-2 /tmp/eps_uni.eps -o "$name"_latin2.eps
sed -i -e '/%EndPageSetup/ r ISOLatin2Encoding.ps' -e 's/ISOLatin1Encoding/MyEncoding/' "$name"_latin2.eps

saved as eps_lat2; then running the command sh eps_lat2 file.eps creates file_latin2.eps with Latin-2 encoding.

It can easily be adapted to other 8-bit encoding standards by changing the encoding vector and the iconv (or codecs.open) parameter in the script.
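
Before adapting the scripts, it may help to check which 8-bit encoding actually covers every character used in the figure labels. The following Python 3 sketch does that with the standard codecs; the candidate list and the name `usable_encodings` are illustrative assumptions, not part of the scripts above.

```python
# Sketch: find which standard 8-bit encodings cover every character in the labels.
# The candidate list and function name are illustrative assumptions.
CANDIDATES = ["iso8859_1", "iso8859_2", "iso8859_5", "iso8859_7", "iso8859_15"]


def usable_encodings(labels, candidates=CANDIDATES):
    """Return the candidate codecs that can encode all labels without error."""
    usable = []
    for name in candidates:
        try:
            for label in labels:
                label.encode(name)
        except UnicodeEncodeError:
            continue  # at least one character is missing from this encoding
        usable.append(name)
    return usable


if __name__ == "__main__":
    print(usable_encodings(["Głębokość", "Czas"]))  # ['iso8859_2']
```

Python's `iso8859_2` codec corresponds to the `ISO-8859-2` name passed to iconv and to the `latin2` alias used in the Python script.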

score 1 · Answer 4 · edited Dec 08 '11 at 21:30

char(2048) will be shown by `print -depsc` as 'à ',
char(5064) as 'á',
char(28808) as 'ç',
char(37000) as 'é',
char(32904) as 'è', ...

For other characters in the Latin-1 charset, look at the output of:

for j = 0:4*64
    clf; subplot(1,1,1); plot(eye(2)); leg = '';
    % Each row: a 5-character code label, then the 64 characters starting at that code.
    for i = 4*(j+1)-1:-1:max(1,4*j)
        str = ['     ', num2str(i*64)];
        leg(i,:) = [str(end-4:end), ':', char(64*i+(0:63))];
    end
    title(leg, 'interpreter', 'none');
    print('-depsc', ['ascii', num2str(j), '.ps']);
end
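
One reading consistent with the values listed above is that the character is written to the file as its UTF-8 bytes, and only the lead byte has a visible Latin-1 glyph. This is a guess, not documented Matlab behaviour. A small Python 3 sketch to check it (the function name is hypothetical):

```python
# Sketch: test the guess that the first UTF-8 byte of each character is what
# shows up in the EPS, rendered as a Latin-1 glyph. This is an assumption.
def first_utf8_byte_as_latin1(code_point):
    """Return the Latin-1 character for the first UTF-8 byte of chr(code_point)."""
    return chr(code_point).encode("utf-8")[:1].decode("latin-1")


if __name__ == "__main__":
    for code in (2048, 5064, 28808, 37000, 32904):
        print(code, first_utf8_byte_as_latin1(code))
```

For the five codes above this gives à, á, ç, é and è, matching the list. For char(2048) the second UTF-8 byte is 0xA0, a no-break space in Latin-1, which would also explain the trailing blank in 'à '.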

I am using pdflatex, so psfrag is not an option, and pdfrack seems to be broken.
