Converting .txt to .pdf

Question

I have code to convert a .txt file to a .pdf. I'm 99% sure that it converts the file to .pdf, but it won't output the PDF file.

Below is my code. I got it from an online website, btw.

from fpdf import FPDF

pdf = FPDF()  
pdf.add_page()
pdf.set_font("Arial", size=15)

f = open("text-file-name.txt", "r")

for x in f:
    pdf.cell(200, 10, txt=x, ln=1, align='C')

pdf.output("completed.pdf")

If I run this code as it is, it comes up with this error:

Traceback (most recent call last):
  File "main.py", line 27, in <module>
    pdf.output("blh.pdf") 
  File "/home/runner/RoyalblueTimelyProperties/venv/lib/python3.8/site-packages/fpdf/fpdf.py", line 1065, in output
    self.close()
  File "/home/runner/RoyalblueTimelyProperties/venv/lib/python3.8/site-packages/fpdf/fpdf.py", line 246, in close
    self._enddoc()
  File "/home/runner/RoyalblueTimelyProperties/venv/lib/python3.8/site-packages/fpdf/fpdf.py", line 1636, in _enddoc
    self._putpages()
  File "/home/runner/RoyalblueTimelyProperties/venv/lib/python3.8/site-packages/fpdf/fpdf.py", line 1170, in _putpages
    p = self.pages[n].encode("latin1") if PY3K else self.pages[n] 
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 228-233: ordinal not in range(256)

I have already uploaded the file to Replit. Does anyone know what to do?

If you use `f = open("text-file-name.txt", "r", encoding="latin-1")` you'll probably get rid of the error, but you might end up with a severe case of Mojibake. — Mark Ransom, Nov 24 '22 at 04:00
FPDF doesn't handle non-ASCII characters. Perhaps you should look at `reportlab` instead. — Tim Roberts, Nov 24 '22 at 04:00

score 0 · Answer 1 · answered Nov 30 '22 at 12:54

There are two things to be aware of:

1. Not every encoding (in your case latin-1) can represent every possible character. An encoding maps bit patterns to characters, so its width caps how many characters it can hold. An encoding that uses 7 bits (with the 8th sometimes used as a check bit) can represent only 2^7 = 128 characters, and latin-1, which uses all 8 bits, stops at 256. The designers of an encoding therefore have to decide which characters make the cut.
2. Not every font has glyphs for every character an encoding can represent. Font files take up room, since each character needs rendering instructions (and thus bytes), so there is a drive to limit the set of characters. On top of that, someone has to design each glyph, which costs money.
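To make the first point concrete, here is a small sketch using only the standard library. The helper name `outside_latin1` and the sample string are made up for illustration; the idea is just that latin-1 covers code points 0 to 255 and nothing else.

```python
def outside_latin1(text):
    """Return the characters in text that latin-1 cannot encode."""
    # latin-1 covers exactly the code points 0..255
    return [ch for ch in text if ord(ch) > 255]


sample = "café €5"
# 'é' (code point 233) fits in latin-1, '€' (code point 8364) does not
print(outside_latin1(sample))

try:
    sample.encode("latin-1")
except UnicodeEncodeError as exc:
    # Same kind of failure as the one at the bottom of the traceback
    print("cannot encode:", exc)
```

Running something like this over the contents of the text file should show which characters are tripping up the latin-1 step.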

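You are currently running into an error because of the first point: as the traceback shows, FPDF encodes each page as latin-1 when it writes the file, and your text contains characters outside latin-1. You could simply read the file with another encoding, for example `encoding="latin-1"`, which makes the exception go away, but any character that was not really latin-1 will then come out garbled.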
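If losing a few characters is acceptable, one possible workaround is to keep the default font and swap out whatever latin-1 cannot hold before handing each line to `pdf.cell`. This is only a sketch: `to_latin1` is a made-up helper, and it assumes the text has already been read correctly (for instance with `open(..., encoding="utf-8")`, which is a guess about how the file was saved).

```python
def to_latin1(text):
    """Replace every character latin-1 cannot encode with '?'."""
    # errors="replace" substitutes '?' instead of raising UnicodeEncodeError
    return text.encode("latin-1", errors="replace").decode("latin-1")


print(to_latin1("café €5"))

# Mojibake: UTF-8 bytes decoded as latin-1 never raise, but they read wrongly.
garbled = "€".encode("utf-8").decode("latin-1")
print(repr(garbled))
```

In the loop from the question, that would look like `pdf.cell(200, 10, txt=to_latin1(x), ln=1, align='C')`. The output is lossy by design, so it only makes sense when the odd `?` in the PDF is tolerable.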

Keep in mind though that when you are creating a PDF, you are (implicitly perhaps) using a font.

PDF (ISO 32000) defines 14 standard Type 1 fonts. These fonts are special: a conforming reader (e.g. Adobe Reader) is required to have them on hand. That means when you are creating a PDF (or writing a software library to create a PDF) it is a lot easier to use one of the standard 14 than any other font, because you don't have to do any tricky "insert the font bytes in the PDF". Loads of PDF-creation libraries will default to one of the 14 if no font is specified.

These fonts all have an encoding, and those encodings may not support the characters that are currently in your text file. This can also be solved by using a non-default font, i.e. one that you embed in the PDF and that has glyphs for the characters you need.
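A rough sketch of the non-default-font route is below. It assumes the older PyFPDF shown in the traceback, where `add_font(family, style, fname, uni=True)` embeds a TrueType font as Unicode (newer fpdf2 releases may warn that `uni`, `txt` and `ln` are deprecated). The font path, the family name "DejaVu", the UTF-8 file encoding and both function names are assumptions for illustration, not something taken from the question.

```python
def iter_lines(txt_path):
    """Yield the lines of a UTF-8 text file without trailing newlines."""
    with open(txt_path, "r", encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")


def txt_to_pdf(txt_path, pdf_path, ttf_path):
    """Write each line of txt_path into pdf_path using an embedded TTF font."""
    # Imported here so the rest of the module loads even without fpdf installed
    from fpdf import FPDF

    pdf = FPDF()
    pdf.add_page()
    # Register a TrueType font that has glyphs for the characters in the file
    pdf.add_font("DejaVu", "", ttf_path, uni=True)
    pdf.set_font("DejaVu", size=15)
    for line in iter_lines(txt_path):
        pdf.cell(200, 10, txt=line, ln=1, align="C")
    pdf.output(pdf_path)
```

It would be called as something like `txt_to_pdf("text-file-name.txt", "completed.pdf", "DejaVuSans.ttf")`, with the `.ttf` file uploaded next to the script. Any TrueType font works in principle, as long as it actually contains the characters in the text file.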
