
Problem

VerbatimOut from the “fancyvrb” package doesn’t play nicely with UTF-8 characters.

Minimal working example:

\documentclass{minimal}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{fancyvrb}

\begin{document}
\begin{VerbatimOut}{\jobname.test}
é
\end{VerbatimOut}

\input{\jobname.test}
\end{document}

Error message

When compiled using pdflatex mini, this gives the error

File ended while scanning use of \UTFviii@three@octets.

A different error occurs when the sole occurrence of é above is replaced by something else, e.g. é */:

Package inputenc Error: Unicode char \u8:### not set up for use with LaTeX.

This indicates that, in this case, LaTeX succeeds in reading a multi-byte UTF-8 character but does not know what to do with it (i.e. it’s the wrong character).

In fact, when I open the produced .test file manually, it contains the character é, but in Latin-1 encoding!

Proof: when I open the files in a hex editor, I get the following:

  • Original file: C3 A9 (corresponds to LATIN SMALL LETTER E WITH ACUTE in UTF-8)
  • Written file: E9 (corresponds to é in Latin-1)

Question

How do I set up VerbatimOut correctly?

filecontents* (from “filecontents”) shows that it can work. Unfortunately, I don’t understand either code so I cannot fix fancyvrb’s code by replicating the logic from filecontents manually.

I also cannot use filecontents* instead of VerbatimOut because the former doesn’t work within a \newenvironment, while the latter does.
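For reference, the kind of wrapper meant here can be sketched roughly as follows. This is a minimal, untested sketch: the environment name SaveSnippet is made up for illustration, and the body deliberately contains only ASCII so that it sidesteps the encoding problem described above. \VerbatimEnvironment is the hook that the fancyvrb documentation describes for defining environments in terms of its verbatim environments.

```latex
\documentclass{minimal}
\usepackage{fancyvrb}

% Hypothetical wrapper environment; the name SaveSnippet is an assumption.
% \VerbatimEnvironment tells fancyvrb to look for \end{SaveSnippet}
% instead of \end{VerbatimOut} when scanning the verbatim body.
\newenvironment{SaveSnippet}
  {\VerbatimEnvironment
   \begin{VerbatimOut}{\jobname.test}}
  {\end{VerbatimOut}}

\begin{document}
\begin{SaveSnippet}
plain ASCII only
\end{SaveSnippet}

\input{\jobname.test}
\end{document}
```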

(Oh, by the way: vanilla Verbatim instead of VerbatimOut also works as expected. The error seems to occur when writing the file, not when reading the verbatim input)

Konrad Rudolph

5 Answers

4

Is your end goal to write symbols and accents in Verbatim? Because you can do that like this:

\documentclass{article}
\usepackage{fancyvrb}
\begin{document}
\begin{Verbatim}[commandchars=\\\{\}]
\'{e} \~{e} \`{e} \^{e}
\end{Verbatim}
\end{document}

The commandchars option allows the \ { } characters to work as they normally would.

Source: http://ctan.mirror.garr.it/mirrors/CTAN/macros/latex/contrib/fancyvrb/fancyvrb.pdf

Steve Tjoa
  • Thanks for the hint but that solution isn’t usable because the saved verbatim code will be further processed by another program that doesn’t know about LaTeX – so I really need to be able to use Unicode characters directly. – Konrad Rudolph Jan 25 '10 at 15:13
  • Ah, okay. Then I am not quite sure. Good luck. – Steve Tjoa Jan 25 '10 at 15:19
  • Updated hyperlink: http://ctan.mirror.garr.it/mirrors/CTAN/macros/latex/contrib/fancyvrb/fancyvrb.pdf – MattAllegro Sep 03 '15 at 18:51
3

This is still unfixed? I'll take another look. What exactly do you want: your package to use VerbatimOut, or for it not to interfere with it?

Tests

TeX Live 2009's xelatex compiles fine. With pdflatex, version

This is pdfTeX, Version 3.1415926-1.40.10 (TeX Live 2009)

I get an error message that is rather more useful than the one you got:


! Argument of \UTFviii@three@octets has an extra }.
 
                \par 
l.8 é

? i \makeatletter\show\UTFviii@three@octets
! Undefined control sequence.
\GenericError  ...                                
                                                    #4  \errhelp \@err@     ...
l.8 é

If I were to make a wild guess, I'd say that inputenc with pdftex uses the pdftex primitives to do some hairy storing and restoring of character tables, and some table somewhere has a rarely triggered mistake in it.

Possibly related

I saw a post by Vladimir Volovich in the pdf-tex mailing list archives, all the way back from 2003, that discusses a conflict between inputenc & fancyvrb, and posts a patch to "solve the problem". Who knows, maybe he faced the same problem? It might be worth emailing him.

Charles Stewart
  • (Yes, this is still unfixed.) That’s indeed a completely different error – although I’d suspect that a `}` is missing solely because the UTF-8 parser has already read one char too many. But why are you getting “undefined control sequence” when trying to show the definition of the macro? – Konrad Rudolph Jun 13 '10 at 08:34
  • @Konrad: I'm afraid debugging problems throwing up \GenericError is something that I have had bad experiences with. I plan on trying again sometime, but it won't be in the next few days. – Charles Stewart Jun 13 '10 at 09:54
  • No worries. It’s a pretty big problem but unfortunately I don’t really have time to spend on it either at the moment. The easiest course would probably be to contact the maintainers of the involved packages (i.e. fancyvrb and inputenc), so I’ll try that once I get the leisure to spend more time on this bug. – Konrad Rudolph Jun 13 '10 at 13:41
  • Still unfixed in TeXLive2016. – ScumCoder Jun 28 '16 at 22:39
2

XeTeX has much better Unicode support. The following run through xelatex produces “é” both in \jobname.test and the output PDF.

\documentclass{minimal}
\usepackage{fontspec}
\tracingonline=1
\usepackage{fancyvrb}

\begin{document}
\begin{VerbatimOut}{\jobname.test}
é
\end{VerbatimOut}

\input{\jobname.test}
\end{document}

fontspec loads the Latin Modern fonts, which have Unicode support. The standard TeX Computer Modern fonts don’t have the right tables for Unicode support.

If you use a character that does not have a glyph in the current font, by default XeTeX writes a blank space to the PDF and prints a warning in the log but not on the terminal. \tracingonline=1 prints the warning to the terminal.

andrewdotn
  • Yes, I know about XeTeX and I use it exclusively. But I need this for a general-purpose package and since accented characters **do** work in normal LaTeX I don’t really want to break what little Unicode support works. This *isn’t* a Computer Modern font problem. – Konrad Rudolph Jan 26 '10 at 17:52
2

On http://wiki.portal.chalmers.se/agda/pmwiki.php?n=Main.LiterateAgda, they suggest that you should use

\usepackage{ucs}
\usepackage[utf8x]{inputenc}

in the preamble. I successfully used this to insert Unicode into a verbatim environment.

Alex
  • Not all Unicode works, though. In particular, `utf8x` is pretty much deprecated in favour of plain `utf8`, and so is the package `ucs`. There might be solitary cases where your code works while mine doesn’t – but these will be the exception. Ultimately, the real solution is to bin pdflatex and use xelatex instead. I’ve made the switch two years ago, and never looked back. – Konrad Rudolph Jun 30 '11 at 15:27
1
\documentclass{article}

\usepackage{fancyvrb}

\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
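\newenvironment{MonVerbatim}{%
% Give bytes 128--255 catcode 11 (letter) so that they are no longer
% active characters handled by inputenc. The changes stay local to
% the environment's group.
\count0=128\relax %
\loop
   \catcode\count0=11\relax
   \advance\count0 by 1\relax 
   \ifnum\count0<256
   \repeat
   \VerbatimOut[commandchars=\\\{\}]{VerbatimText.tex}%
}{\endVerbatimOut}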

\newcommand\test{A command producing accented characters éà}

\begin{document}
\begin{MonVerbatim}
     A little bit text in verbatim mode éà_].
     \test
\end{MonVerbatim}
Followed by some accented character éà.
\end{document}

This code works for me with TeX Live 2018 and pdflatex. You should probably avoid changing the catcodes if you are using a Unicode-aware TeX engine (lualatex or xelatex).

You can use the package "iftex" to check the tex engine used.
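As a rough, untested sketch of that idea, the catcode change can be guarded with the \ifPDFTeX conditional from iftex so that it only runs under pdflatex. The counter \HighByte and the helper \MakeHighBytesLetters are hypothetical names introduced here for illustration; the loop itself is the same as in the code above, just moved into its own macro and given a dedicated counter instead of \count0. The commandchars option is left out to keep the example short.

```latex
\documentclass{article}
\usepackage{iftex}      % provides \ifPDFTeX, \ifXeTeX, \ifLuaTeX
\usepackage{fancyvrb}

\ifPDFTeX
  % 8-bit engine: set up input and font encodings explicitly
  \usepackage[utf8]{inputenc}
  \usepackage[T1]{fontenc}
\else
  % XeTeX or LuaTeX: native UTF-8 input, Unicode fonts via fontspec
  \usepackage{fontspec}
\fi

% Hypothetical helper (name is an assumption): give bytes 128--255
% catcode 11, using a dedicated counter rather than \count0.
\newcount\HighByte
\newcommand*\MakeHighBytesLetters{%
  \HighByte=128\relax
  \loop
    \catcode\HighByte=11\relax
    \advance\HighByte by 1\relax
  \ifnum\HighByte<256
  \repeat
}

\newenvironment{MonVerbatim}{%
  \ifPDFTeX\MakeHighBytesLetters\fi  % no catcode change on Unicode engines
  \VerbatimOut{VerbatimText.tex}%
}{\endVerbatimOut}

\begin{document}
\begin{MonVerbatim}
A little bit of text with accents: éà
\end{MonVerbatim}
\end{document}
```

Keeping the loop in a separate macro means the \ifnum inside \loop ... \repeat never sits directly inside the \ifPDFTeX branch, which avoids having to reason about conditional nesting when that branch is skipped.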

Alan