
This is mostly an HTML question, but I am interested in extracting information from the HTML using Python 3.


My question is:

Given a font-family, font-weight, and font-style, as well as the text itself and the text-size, how can I determine the height and width of the text?

Namely, I have the coordinates of the upper left corner of the text, and I would like to find its lower right corner. I am ready to manually input the sizes of single letters in my script, if that's what it takes (hopefully not!).


As a base example, I have a font

#f { font-family:sans-serif; font-weight:normal; font-style:normal; }

and a tag

<span id="f" style="font-size:20px;vertical-align:baseline;color:rgba(0,0,0,1);">Hello, world!</span>

I would like to calculate the width and height of the text, in pixels.


I am aware of related questions on the site, but I couldn't find any that would answer my specific question. Feel free to link such a question (if answered), if it exists.

1 Answer

I have a workaround for this (not implemented yet). Fonts are often defined by a .ttf (TrueType font) file. On Windows you can find such files in C:/Windows/Fonts (I'm not sure about other operating systems).

Using PIL/Pillow in Python, you can draw the bitmap of any string in a given font, see e.g. this answer. To be exact, load the appropriate .ttf file with PIL.ImageFont.truetype, which returns a FreeTypeFont instance, and call its getmask method on the string. The size of the resulting mask then gives the width and height of the rendered text. You can either pass the desired font size to truetype directly, or render at some other size and rescale to match the font size.
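A minimal sketch of that idea, not a tested solution. The helper names (load_font, measure_text, lower_right) and the font path in the usage comment are my own assumptions, and which .ttf file a browser actually picks for sans-serif depends on the system. Note also that the mask covers only the drawn glyph pixels, so the result approximates the box a browser would lay out rather than reproducing it exactly.

```python
def load_font(ttf_path, size_px):
    """Load a TrueType font with Pillow at the given size.

    Pillow is imported here so the two helpers below can also be used
    with any other object that offers a compatible getmask method.
    """
    from PIL import ImageFont
    return ImageFont.truetype(ttf_path, size_px)


def measure_text(font, text):
    """Return (width, height) in pixels of the bitmap Pillow renders for text."""
    width, height = font.getmask(text).size
    return width, height


def lower_right(upper_left, size):
    """Return the lower-right corner, given the upper-left corner and a size."""
    x, y = upper_left
    width, height = size
    return x + width, y + height


# Example usage (the font path is hypothetical, adjust it for your system):
#   font = load_font("C:/Windows/Fonts/arial.ttf", 20)
#   size = measure_text(font, "Hello, world!")
#   corner = lower_right((100, 50), size)
```

If the numbers need to match a browser closely, it is probably worth comparing a few strings against the actual rendered page before relying on this.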