
I am looking for a way to generate a single-page PDF file from text of arbitrary length, with the font size automatically fitted to the page, reasonable margins, and the text centered horizontally and vertically.

command --text="Text of arbitrary length" --output=one-page-file.pdf

That is, I want to re-create

magick -gravity center -background white -fill black -size 1728x972 -font /Users/marekkowalczyk/Library/Fonts/RobotoMono-Medium.ttf caption:"Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat." -background white -extent 1920x1080 long.pdf

where the output file is a "true" PDF, not an image file embedded in a PDF. Obviously this means substituting ImageMagick with a tool that generates PDF (PostScript? TeX?).

Sample output

  • **ImageMagick** is a *"raster image processor"*. It always converts everything to a raster (gridded bitmap) the moment you open it and it always produces a rasterised, bitmap output image, so you are out of luck in that respect. – Mark Setchell Mar 25 '21 at 11:50
  • The only assistance I can offer is that, if you like the way **ImageMagick** does it, you could produce a page that way and ask **ImageMagick** how many lines and what font size it came up with, using a command like this: `magick -gravity center -size 1728x972 caption:"Lorem ipsum dolor sit amet" -format "%[caption:lines] %[caption:pointsize]\n" info:`. You would need to specify the font-style, and then work out what you want to do with the information in regard to writing a PDF as text rather than image using some other package. – Mark Setchell Mar 25 '21 at 19:20
  • @MarkSetchell I know ImageMagick is of no direct help here. I'm using it just as an illustration of what I want. However, your idea of using IM as a source of typesetting information is a brilliant hack :) – Marek Kowalczyk Mar 25 '21 at 23:45
  • Maybe you could pass your text into `fmt` with successively wider widths until you get the same number of lines of output as **ImageMagick**. Then, using the resulting lines, iterate up through the font sizes with `enscript` till you get 2 pages of output, then back out your last font size increase. Pretty ugly, but maybe doable. Maybe `unoconv` or `libreoffice` to the rescue... – Mark Setchell Mar 26 '21 at 07:06
  • Another idea might be the `reportlab` module in **Python**. – Mark Setchell Mar 26 '21 at 07:19
  • Or if you can find a way in **HTML/CSS** to get text to fill its container element, you could pass that into `pandoc`. Likewise **LaTeX** into `pandoc`. – Mark Setchell Mar 26 '21 at 07:30

3 Answers


Update 2022-03-03:
In onepg.sh a - (dash) has been added after paste -s -d ' ' to specify stdin as input. If ghostscript says "Could not open the file /dev/stdout" I suggest editing onepg.sh as follows: change /dev/stdout to %stdout, /dev/stderr to %stderr, and -f /dev/stdin to -f - (but leave : ${infile='/dev/stdin'} as is). (end update)


It's been a while since this question was asked. Nevertheless...

Here's a PostScript program (onepg.ps) and a POSIX shell script (onepg.sh) using ghostscript 9.50 to create a one-page PDF adjusting font size to fill the page. Run as:

echo 'BZZZT ~ Train leaving in 45 minutes' | ./onepg.sh > bzzzt.pdf

or, to convert a Central or Eastern European plaintext file to a PDF with blue text,

tocode=latin2 rgbtext='0 0 255' ./onepg.sh < some.txt > some.pdf

or, for a compact standalone PostScript file and a trace log file,

TRACE=x logfile=file.log psoutfile=file.ps outfile=file.null ./onepg.sh < file.txt

or, for an A5-size PNG file in landscape orientation,

PAPERSIZE=a5 landscape=x outfile=file.png ./onepg.sh < file.txt

The driver shell script

  • supplies default values for files, encoding, page size and margins, font etc.
  • converts and formats the input text - which may contain ~ (tilde blank) as section delimiter - using the standard tools iconv and sed
  • emits PostScript startup code to call the vert-centr procedure in onepg.ps
  • invokes ghostscript to produce the output file; default format is PDF
  • uses shell parameter expansion, e.g. : ${outfile='/dev/stdout'}, to set defaults that can be overridden from the environment
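
As an illustration of the margin defaults, here is a minimal Python sketch of what the mediabox2artbox helper in onepg.sh appears to compute: the artbox is the page minus a fractional margin on each side, rounded by adding 0.5 and truncating. The function name mediabox_to_artbox and its signature are made up for this sketch; Decimal is used only to imitate bc's decimal arithmetic, and the landscape swap done by the script is left out.

```python
from decimal import Decimal, ROUND_FLOOR


def mediabox_to_artbox(width, height, margin_x="0.12", margin_y="0.08"):
    """Return (x, y, w, h) of the text area inside a page of width x height points.

    Hypothetical helper mirroring the arithmetic of mediabox2artbox in onepg.sh;
    margins are fractions of the page size (0.08 = 8%).
    """
    w, h = Decimal(width), Decimal(height)
    mx, my = Decimal(margin_x), Decimal(margin_y)

    def rounded(value):
        # Same idea as bc's "(value+0.5)/1" with scale 0, for positive values.
        return int((value + Decimal("0.5")).to_integral_value(rounding=ROUND_FLOOR))

    return (
        rounded(w * mx),          # left margin
        rounded(h * my),          # bottom margin
        rounded(w - 2 * w * mx),  # usable width
        rounded(h - 2 * h * my),  # usable height
    )


if __name__ == "__main__":
    print(mediabox_to_artbox(595, 842))  # A4 portrait with the default margins
```

For A4 with the default margins this gives the same box as the first line of the sample trace log further down (x=71 y=67 w=452 h=707).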

Caution: Long words in a very short text may be truncated due to enlarged font size.


I should mention that I do PostScript once in a purple moon.

For a one-page output efficiency isn't a big concern and the algorithm used is quite simple. The vert-centr proc invokes adjustfont, which computes the font size at which the text fills the artbox (the extent of the page's meaningful content) by repeatedly calling linebreakr and bisecting the font size range (6 to 144 points by default). It stops when the line count equals floor(artbox height / font size) or when the computed font size no longer changes. Finally, vert-centr displays the page, distributing excess vertical whitespace evenly between lines and centring each line horizontally; no other formatting is done.
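
The search can be sketched in Python as follows. This is only a rough model of adjustfont and linebreakr, not a port: text width is approximated with an assumed fixed advance of 0.6 × font size per character (PostScript uses real glyph metrics via stringwidth), section breaks are ignored, and the names count_lines and fit_font_size are invented for the sketch.

```python
def count_lines(text, font_size, box_width, em=0.6):
    """Greedy word wrap; returns how many lines the text needs.

    Assumption: every character (including the space) is em * font_size wide.
    """
    glyph = em * font_size
    lines, used = 1, 0.0
    for word in text.split():
        width = len(word) * glyph
        if used > 0 and used + glyph + width > box_width:
            lines += 1          # word does not fit: start a new line
            used = width
        else:
            used = width if used == 0 else used + glyph + width
    return lines


def fit_font_size(text, box_width, box_height, fs_min=6, fs_max=144):
    """Bisect the font size until the line count matches the box capacity."""
    size = None
    while True:
        last = size
        size = (fs_min + fs_max) // 2       # integer division prefers the smaller size
        lines = count_lines(text, size, box_width)
        capacity = box_height // size       # floor(box height / font size)
        if capacity == lines or size == last:
            return size, lines              # exact fit, or no further progress
        if capacity < lines:
            fs_max = size                   # too many lines: try smaller fonts
        else:
            fs_min = size                   # room to spare: try larger fonts


if __name__ == "__main__":
    print(fit_font_size("lorem " * 40, 452, 707))
```

The second exit condition (size unchanged since the previous round) plays the role of the oscillation guard described in the comments of onepg.ps; as there, the returned size is the best the search found, not a guarantee that every long word fits.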

The encodefont proc supports ASCII (StandardEncoding), Latin-1, and Latin-2. Input text is converted by iconv's --to-code="...//TRANSLIT" and so may not be represented accurately. //TRANSLIT is convenient for UTF-8 input but leaves ? in the output if transliteration cannot be done.
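
To make the conversion step concrete, here is a small Python sketch of turning text into an 8-bit PostScript string literal. It is not what onepg.sh does internally (the script uses iconv and sed): Python has no //TRANSLIT, so errors="replace" stands in for it and simply puts ? wherever a character has no counterpart, without attempting a look-alike letter. The function name to_ps_string is invented here, and newlines are replaced with spaces as the notes in onepg.ps recommend.

```python
def to_ps_string(text, encoding="ascii"):
    """Encode text to 8 bits and wrap it as a PostScript string literal.

    Assumption: unencodable characters become "?" (cruder than iconv's
    //TRANSLIT, which may pick a similar-looking replacement instead).
    """
    raw = text.replace("\n", " ").encode(encoding, errors="replace")
    # Escape the backslash first, then the parentheses that delimit PS strings.
    for special in (b"\\", b"(", b")"):
        raw = raw.replace(special, b"\\" + special)
    return b"(" + raw + b")"


if __name__ == "__main__":
    print(to_ps_string("Tåg (45 min)", "latin-1"))
    print(to_ps_string("Łódź", "latin2"))
    print(to_ps_string("Łódź"))
```

The Latin-2 bytes produced for Ł, ó and ź (octal 243, 363 and 274) land on /Lslash, /oacute and /zacute in the encoding vector listed in onepg.ps below, which is why the target encoding and the encid passed to encodefont have to agree.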

If the onepg.sh script is invoked with a non-empty TRACE shell variable, the artbox is outlined in the output file and the following is written to a trace log (stderr, by default):

  • artbox dimensions
  • table of the font size computation:
    1. font size min:max
    2. current font size
    3. line count
    4. floor(artbox height / font size)
    5. stringwidth of text in current font
  • Y coordinates of each line

Sample trace log:

artbox: x=71 y=67 w=452 h=707 x+w=523 y+h=774
szrg    ftsz    lnct    h/ftsz  textw
6:144   75      69      9       23870
6:75    40      34      17      12730
6:40    23      19      30      7320
23:40   31      26      22      9866
23:31   27      23      26      8593
27:31   29      24      24      9229
lnypos: 749.0 719.542 690.083 660.625 631.167 601.708 572.25 542.792 513.333 483.875 454.417 424.958 395.5 366.042 336.583 307.125 277.667 248.208 218.75 189.292 159.833 130.375 100.917 71.4584
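
The lnypos row can be reproduced from the other numbers in the log. The Python sketch below follows the arithmetic in vert-centr as I read it (excess height split evenly across the lines, plus the 4-point nudge); the function name line_y_positions is invented, and the values are the unrounded ones the trace prints, whereas the moveto in onepg.ps truncates them to integers.

```python
def line_y_positions(box_y, box_height, font_size, line_count, nudge=4):
    """Return the Y coordinate of each line, top line first."""
    # Leftover height per line once every line has taken font_size points.
    extra = (box_height - font_size * line_count) / line_count
    y = box_y + box_height + extra + nudge
    positions = []
    for _ in range(line_count):
        y -= font_size + extra      # step down by one line plus its share of slack
        positions.append(y)
    return positions


if __name__ == "__main__":
    # Values from the sample trace: artbox y=67 h=707, final size 29, 24 lines.
    ys = line_y_positions(67, 707, 29, 24)
    print(" ".join(f"{y:.3f}" for y in ys))
```

With 24 lines at 29 points, only 11 of the 707 points are left over, so each line is spaced 29 + 11/24 points apart.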

File: onepg.ps

% onepg.ps -- convert text to fit one page, adapting font
%
% Notes:
% - invoke with accompanying POSIX shell script onepg.sh
% - intended for one-page texts, not for extreme-size texts or words
% - supports section breaks, e.g. (end para.~ Next para), see /SECT
%   NB: section delimiter must be followed by a word delimiter (blank)
% - /fsMin, /fsMax font sizes are defined in /adjustfont
% - for StandardEncoding /encodefont is not needed
% - use Latin-2 encoding vector for ISO 8859-2 compatibility
% - tested with ghostscript 9.50, evince 3.36.7, okular 1.9.3

/TRACE false def        % trace info flag
/SECT (~) 0 get def     % section delimiter char (use 7bit ascii)


/Trace { % (string) --> ...
  TRACE { print flush } if
} bind def

/strN { % any --> (string)
  32 string cvs
} bind def


% Concatenate N strings.
% (s1) (s2) (s3) ... (sN) n  -->  (s1s2s3...sN)
% origin: https://stackoverflow.com/a/12472783 (with comments)
/ncat {
    dup 1 add
    copy
    0 exch { exch length add } repeat
    string exch
    0 exch
    -1 1 {
        2 add -1 roll
        3 copy putinterval
        length add
    } for
    pop
} def 


% Split text into lines, call back for each, return line count.
% NB: newlines get no special treatment (so replace with word delimiter)
% stack: text word-delimiter maxwidth eolproc(lntext,lnwidth) --> lnct
/linebreakr {
  0 begin
    /eolproc exch def
    /maxlinewidth exch cvr def
    /delim exch def
    /qtxt exch def                  % queued text

    /qtxtlen qtxt length def
    /qtxtlnct 0 def
    /delimlen delim length def
    /delimwd delim stringwidth pop def
    {
        qtxtlen 0 le { exit } if
        /qtxtlnct qtxtlnct 1 add def
        /lntxt qtxt def             % rest of current line
        /lnlen 0 def
        /lnwidth 0.0 def
        { % process current line
            % string seek <search> post match pre true
            % string seek <search> string false
            lntxt delim search      % look for next delimiter
            /inq exch def           % queue not empty if found
            /nextword exch def
            /nextwordlen nextword length def
            inq { pop /lntxt exch def } if

            /atsect 0 def          % SECT at end of nextword?
            nextwordlen 0 ne { % if
              nextword nextwordlen 1 sub get SECT eq { % if
                /atsect 1 def
                /qtxtlnct qtxtlnct 1 add def
                /nextword nextword 0 nextwordlen 1 sub getinterval def
              } if
            } if
            % at end of line if passing max unless no words 
            % seen, in which case truncating a rather long word, 
            % cf. https://en.wikipedia.org/wiki/Longest_words
            /wordwidth nextword stringwidth pop def
            lnwidth wordwidth add maxlinewidth gt lnlen 0 gt and {
              exit      % FIXME: better to add delimwd before exit
            } if
            /lnwidth lnwidth wordwidth add delimwd add def
            inq not atsect 0 ne or {
              /lnlen lnlen nextwordlen add def
              exit
            } if
            /lnlen lnlen nextwordlen add delimlen add def
        } loop  % line
        % call back line+width
        qtxt 0 lnlen atsect sub getinterval lnwidth delimwd sub eolproc
        atsect 0 ne { () 0.0 eolproc } if   % call back linefeed
        % skip to next line
        /qtxtlen qtxtlen lnlen sub def
        /qtxt qtxt lnlen qtxtlen getinterval def
    } loop      % text
    qtxtlnct    % return line count
  end           % dict
} def
/linebreakr load 0 16 dict put


% Adjust font size to fill artbox by repeatedly calling linebreakr.
% stack: fontname artbox text word-delimiter --> fontsize linect
%
% Returns when linect == floor(artbox-height / fontsize)
% or when fontsize no longer changes after call to linebreakr.
%
% Detects and avoids oscillation as in:
%     height fontsize quotient linect
%       708     26     27.2     26
%       708     27     26.2     27
/adjustfont {
  0 begin
    /worddelim exch def
    /pgtext exch def
    /artbox exch def
    /fontname exch def

    /fsMin 6 def
    /fsMax 144 def
    /fontsize 1 def
    /ABX artbox 0 get def
    /ABY artbox 1 get def
    /ABW artbox 2 get def
    /ABH artbox 3 get def

    TRACE { % if
        % outline rectangle where text goes
        gsave
        .82 setgray  artbox rectstroke
        grestore

        % artbox coords
        % ... N ncat
        (artbox:)
        ( x=) ABX strN
        ( y=) ABY strN 
        ( w=) ABW strN 
        ( h=) ABH strN
        ( x+w=) ABX ABW add strN 
        ( y+h=) ABY ABH add strN 
        (\n)
        14 ncat Trace
        % fontsize computation table header
        (szrg\tftsz\tlnct\th/ftsz\ttextw\n) Trace
    } if

    { % loop
        /lastfs fontsize def
        % prefer smaller font size (using idiv)
        /fontsize fsMin fsMax add 2 idiv def
        fontname fontsize selectfont
        % count lines by splitting text using current font
        pgtext worddelim ABW { pop pop } linebreakr
        /linect exch def
        /lineqt ABH fontsize idiv def          % floor(ABH / fontsize)

        TRACE { % if
          % fontsize computation table row
          /textwd pgtext stringwidth pop def   % width in current font
          % ... N ncat
          fsMin strN (:) fsMax strN
          (\t) fontsize strN
          (\t) linect strN
          (\t) lineqt strN
          (\t) textwd cvi strN
          (\n)
          12 ncat Trace
        } if

        lineqt linect sub
        dup 0 eq                    % success
        fontsize lastfs eq or       % guard against infinite loop
        { pop exit } if
        0 lt { /fsMax fontsize def
        }{     /fsMin fontsize def
        } ifelse
    } loop
    fontsize linect     % return values
  end   % dict
} def
/adjustfont load 0 16 dict put


% Encode named font: fontname encid --> encfontname
% where encid is 0 StandardEncoding, 1 Latin-1, or 2 Latin-2
% e.g. /Helvetica 1 --> /encft1Helvetica
% origin of /encvec table: https://stackoverflow.com/a/14866794
/encodefont {
  0 begin
        /encid exch def
        /fontnm exch def

        /myfontnm {
            (encft)
            encid strN
            fontnm 64 string cvs
            3 ncat
        } def
        /encvec encid 1 eq 
            { ISOLatin1Encoding }
            { StandardEncoding } ifelse
        def
        encid 2 eq { % if
        /encvec
            % Latin-2: first 144 entries same as in ISO Latin-1
            ISOLatin1Encoding 0 144 getinterval aload pop
            % \22x
                /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
                /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
            % \24x
                /nbspace /Aogonek /breve /Lslash /currency /Lcaron /Sacute /section
                /dieresis /Scaron /Scedilla /Tcaron /Zacute /hyphen /Zcaron /Zdotaccent
                /degree /aogonek /ogonek /lslash /acute /lcaron /sacute /caron
                /cedilla /scaron /scedilla /tcaron /zacute /hungarumlaut /zcaron /zdotaccent
            % \30x
                /Racute /Aacute /Acircumflex /Abreve /Adieresis /Lacute /Cacute /Ccedilla
                /Ccaron /Eacute /Eogonek /Edieresis /Ecaron /Iacute /Icircumflex /Dcaron
                /Dcroat /Nacute /Ncaron /Oacute /Ocircumflex /Ohungarumlaut /Odieresis /multiply
                /Rcaron /Uring /Uacute /Uhungarumlaut /Udieresis /Yacute /Tcedilla /germandbls
            % \34x
                /racute /aacute /acircumflex /abreve /adieresis /lacute /cacute /ccedilla
                /ccaron /eacute /eogonek /edieresis /ecaron /iacute /icircumflex /dcaron
                /dcroat /nacute /ncaron /oacute /ocircumflex /ohungarumlaut /odieresis /divide
                /rcaron /uring /uacute /uhungarumlaut /udieresis /yacute /tcedilla /dotaccent
            256 packedarray def
        } if
        fontnm findfont           % load the font
        0 dict copy begin         % copy it to a new dictionary
        /Encoding encvec def      % replace encoding vector
        myfontnm /FontName def    % replace font name
        currentdict end
        dup /FID undef            % remove internal data
        myfontnm exch definefont pop  % define the new font 

        myfontnm        % return value
  end % dict
} def
/encodefont load 0 4 dict put


% Justify text vertically by adjusting font size, centre horizontally, 
% and display in artbox.
% stack: pagetext fontname mediabox artbox rgbtext rgbbkg --> ...
/vert-centr {
  15 dict begin
    /rgbbkg exch def
    /rgbtext exch def
    /artbox exch def
    /mediabox exch def
    /fontname exch def
    /pgtext exch def

    /worddelim ( ) def
    /ABX artbox 0 get def
    /ABY artbox 1 get def
    /ABW artbox 2 get def
    /ABH artbox 3 get def

    rgbbkg {255 div} forall setrgbcolor mediabox rectfill
    rgbtext {255 div} forall setrgbcolor 

    % adjust font size, select font, centre text vertically
    fontname artbox pgtext worddelim adjustfont
    /lnct exch def
    /fontsize exch def

    /lnyadj ABH fontsize lnct mul sub lnct div def  % even out excess
    /lnypos ABH ABY add lnyadj add 4 add cvr def    % +4 looks better
    (lnypos:) Trace
    % split text into lines and display
    % args: pagetext delimiter maxlinewidth eolproc
    pgtext worddelim ABW { 
        % eolproc: linetext linewidth --> ...
        /lnypos lnypos fontsize sub lnyadj sub def
        ABX lnypos cvi moveto 
        % centre text horizontally
        ABW sub -2 div 0 rmoveto show
        ( ) Trace lnypos strN Trace 
    } linebreakr pop
    (\n) Trace
    showpage
  end   % dict
} def

% ---- startup code here ----

Sample startup code:

%%Page: 1 1
/TRACE true def
(OBS! ~ Tåg till Göteborg avgår inom fyrtiofem minuter)
/Helvetica 1 encodefont
[0 0 595 842] [71 67 453 708 ] [0 0 0] [252 250 243]
vert-centr
%%Trailer

File: onepg.sh

#! /bin/sh
# Use ghostscript 9.50 to run onepg.ps
# e.g.
#   echo 'Train in 45 min' | ./onepg.sh > msg.pdf
#   tocode=latin2 rgbtext='0 0 255' ./onepg.sh < some.txt > some.pdf
#   TRACE=x logfile=file.log outfile=file.pdf ./onepg.sh < file.txt
#   infile=the.txt outfile=the.png devWpts=1600 devHpts=900 ./onepg.sh

# shellcheck disable=SC2223,SC2046,SC2086

## Set default values

: ${progps='./onepg.ps'}        ## PostScript program file
: ${TRACE=}                     ## non-empty to trace to ${logfile}
: ${psoutfile=}                 ## non-empty to emit raw PostScript
: ${infile='/dev/stdin'}        ## source text
: ${outfile='/dev/stdout'}      ## destination, e.g. my.pdf or my.ps
: ${logfile='/dev/stderr'}      ## e.g. my.trace.log or %stderr
: ${fromcode='UTF-8'}           ## encoding of ${infile}
: ${tocode='ASCII'}             ## ASCII | LATIN1 | LATIN2
: ${PAPERSIZE='a4'}             ## see `man paperconf`
: ${marginx=.12} ${marginy=.08} ## page margins (.08 = 8%)
: ${landscape=}                 ## non-empty for landscape orientation
: ${fontname='Helvetica'}       ## font name
: ${rgbtext='0 0 0'}            ## text colour RGB
: ${rgbbkg='252 250 243'}       ## background colour RGB

## Set up arguments

case ${tocode} in
  (LATIN2|latin2) encid=2 tocode='LATIN2//TRANSLIT' ;;
  (LATIN1|latin1) encid=1 tocode='LATIN1//TRANSLIT' ;;
  (ASCII|ascii|*) encid=0 tocode='ASCII//TRANSLIT' ;;
esac
case ${outfile} in
  (*.jpeg) gsdevice='jpeg' ;;
  (*.null) gsdevice='nullpage' ;;
  (*.pdf)  gsdevice='pdfwrite' ;;
  (*.png)  gsdevice='png16m' ;;
  (*.ps)   gsdevice='ps2write' ;;
  (*.txt)  gsdevice='txtwrite' ;;
  (*)      gsdevice='pdfwrite' ;;
esac
case ${PAPERSIZE} in
  ## portrait mode width and height dimensions in points
  (letter)
       : ${devWpts=612}  ${devHpts=792} ;;
  (a5) : ${devWpts=420}  ${devHpts=595} ;;
  (a4) : ${devWpts=595}  ${devHpts=842} ;;
  (a3) : ${devWpts=842}  ${devHpts=1191} ;;
  (*)  if test -z "${devWpts}"; then
           set -- $(LC_NUMERIC=C printf '%.0f ' \
                    $(paperconf -p "${PAPERSIZE}" -w -h))
           devWpts="$1"  devHpts="$2"
       fi ;;
esac
if test "${landscape}"
then _tmp="${devWpts}"  devWpts="${devHpts}"  devHpts="${_tmp}"
     _tmp="${marginx}"  marginx="${marginy}"  marginy="${_tmp}"
fi
mediabox2artbox() { ## x=$1 y=$2 w=$3 h=$4
  set -- "$3*$marginx" "$4*$marginy" "$3-$3*$marginx*2" "$4-$4*$marginy*2"
  printf '(%s+0.5)/1\n' "$@" | bc | paste -s -d ' ' -
}
: ${mediabox="0 0 ${devWpts} ${devHpts}"}       ## x y width height
: ${artbox="$(mediabox2artbox ${mediabox})"}    ## same, within margins


## Emit PostScript, run ghostscript

{   cat << ENDCMT
%!PS-Adobe-2.0
%%BoundingBox: ${mediabox}
%%Creator: ${0##*/}
%%Pages: 1
%%Title: ${infile%.*}
%%EndComments
ENDCMT
    ## copy program stripping non-DSC comments and indentation
    sed -e '/^%%/! s/[[:blank:]]*%[^%]*$//' \
        -e 's/^[[:blank:]]*//' -e '/./!d' "${progps}"
    ## startup code
    cat << HERE
%%Page: 1 1
${TRACE:+/TRACE true def}
HERE
    ## convert text to 8-bit PostScript string, 
    ##  escape backslashes, paren:s, and newlines, enclose in paren:s
    iconv -f "${fromcode}" -t "${tocode}" < "${infile}" |
    sed -e 's/[\\()]/\\&/g' -e '$!s/$/\\/' -e '1s/^/(/' -e '$s/$/)/'
    cat << ENDPS
/${fontname} ${encid} encodefont
[${mediabox}] [${artbox}] [${rgbtext}] [${rgbbkg}]
vert-centr
%%Trailer
ENDPS
} |
tee ${psoutfile:+"${psoutfile}"} |
gs -q -dBATCH -dNOPAUSE \
    -dDEVICEWIDTHPOINTS="${devWpts}" \
    -dDEVICEHEIGHTPOINTS="${devHpts}" \
    -sDEVICE="${gsdevice}" \
    -sOutputFile="${outfile}" \
    ${logfile:+-sstdout="${logfile}"} \
    -f /dev/stdin
urznow
  • Thank you! Looks like it's been a lot of work. I can't get the solution to work, though. ````$ echo 'Train in 45 min' | ./onepg.sh > msg.pdf usage: paste [-s] [-d delimiters] file ... GPL Ghostscript 9.55.0: **** Could not open the file /dev/stdout . **** Unable to open the initial device, quitting.```` – Marek Kowalczyk Mar 02 '22 at 15:39
  • @MarekKowalczyk: Well, it's been fun :) Looks like you're on a BSD system so add a `-` (dash) after `paste -s -d ' ' ` (in `onepg.sh`) to specify stdin as input. And make sure [paperconf](https://www.freebsd.org/cgi/man.cgi?paperconf) is installed if you want to use `$PAPERSIZE` other than `a3`, `a4`, `a5`, or `letter`. – urznow Mar 03 '22 at 10:10
  • Thanks! Actually, it's MacOS Mojave 10.14.6. After the modification, I'm still getting ````$ echo 'Train in 45 min' | ./onepg.sh > msg.pdf GPL Ghostscript 9.55.0: **** Could not open the file /dev/stdout . **** Unable to open the initial device, quitting.```` – Marek Kowalczyk Mar 04 '22 at 16:20
  • @MarekKowalczyk: So it's OpenBSD then; I don't respond to postings tagged macOS or BSD (as I cannot test) but yours wasn't. If the error isn't due to lack of write access or `$TMPDIR` I have only one suggestion left: apply the update I added to my answer on 2022-03-03 (near the top) to do away with problems predicated on `/dev/std*` being devices rather than files. – urznow Mar 05 '22 at 09:50

One possible solution using the fitting library from tcolorbox:

\documentclass{article}

\usepackage[
  margin=0.5in,
  papersize={8.5in,11in} 
]{geometry}

\pagestyle{empty}

\usepackage{lmodern}

\usepackage[fitting]{tcolorbox}

\newtcolorbox{mybox}{
  colback=white,
  colframe=white,
  width=\textwidth,
  fit to height=\textheight,
  halign=center,
  valign=center,
  fontupper=\sffamily, 
  fit basedim=150pt
}

\begin{document}

\begin{mybox}
  Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Ut purus elit, vestibulum ut, placerat ac, adipiscing vitae, felis. Curabitur dictum gravida mauris. Nam arcu libero, nonummy eget, consectetuer id, vulputate a, magna. Donec vehicula augue eu neque. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Mauris ut leo. Cras viverra metus rhoncus sem.
\end{mybox}%

\end{document}



I came up with the following hack.

  1. Create an SVG vector image.
convert\
    -gravity\
        center\
    -background\
        white\
    -fill\
        black\
    -size\
        1728x972\
    -extent\
        1920x1080\
    -font ~/Library/Fonts/RobotoMono-Medium.ttf\
    caption:"Lorem ipsum"\
    lorem.svg
  2. Convert it to a vector PDF. Note that generating a PDF directly with convert doesn't work because the file would just be an embedded bitmap image.

svg2pdf lorem.svg lorem.pdf

  3. Use ocrmypdf to add a layer of text. This step is necessary because the PDF from the previous step is just a vector image of letter shapes, unlike a PDF rendered by LaTeX etc.
ocrmypdf -l pol+eng --output-type pdfa --clean lorem.pdf lorem-ocr.pdf

Hacky as hell but gets the job done.

The proper solution would involve somehow accessing the ImageMagick internal layout engine and capturing its output before it's converted into a bitmap.