I'm in a bit of a rush. I'm using tkFont to get the width of a string that I scrape from a web page. With the font 'Consolas' I get 630, when the result should be approximately 654. With other fonts such as 'Arial' the result is correct.

This is the code:

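import tkinter
import tkinter.font as tkFont

def text_width(text):
    tkinter.Frame().destroy()  # creates the default Tk root that Font needs
    txt = tkFont.Font(family='Consolas', size=-17)  # negative size = pixels
    return txt.measure(text)

width = text_width('If you have installed Selenium Python bindings, you can start using it')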

The page I am retrieving this data from is on localhost, so I'm afraid I can't share it, but the same problem appears when I change font-family to 'Consolas' on the 'p' tag in this website

EDIT

I tried PIL as an alternative approach, with no success.

This is a screenshot of the string on the web page, modified as described, and below it the PIL output. It's not the same sentence, but as you can see the two renderings differ even though the font is the same. I don't know whether this is an HTML formatting problem or a PIL problem. (Images: web screenshot; PIL output)

CarlosCRO
  • How do you know what the actual value should be? What tool are you using? Also, what version of python are you using? – Bryan Oakley Feb 16 '22 at 19:29
  • On Linux I get `700`, but with `pillow.ImageFont` (and the full path to `Consolas`) I get `654` – furas Feb 16 '22 at 20:58
  • I'm using Lightshot to measure the area of the string on the web page. I'm actually using python 3.6 – CarlosCRO Feb 16 '22 at 21:48
  • I got the same result, 630, using `tkinter.font.Font` and pillow's `ImageFont` on my Windows PC. The character width (using `txt.measure('W')`) is 9 and the message is 70 characters, so 630 is the expected result for a monospaced font like `Consolas`. I wonder why 654 is the expected answer. – acw1668 Feb 17 '22 at 04:04
  • The expected answer is the number of pixels the string occupies on the web page it comes from. I don't know if I should take anything else into account in this case; the letter-spacing is normal (it uses the one from the font) – CarlosCRO Feb 17 '22 at 07:34
  • Maybe you should generate an image with the text to see if the `Consolas` used in the Python code looks the same as the `Consolas` used on the web page. Maybe they use different font files, or the web browser renders the font in a different way (using a different library). Or maybe the page uses a size in `pixels` instead of `points`. – furas Feb 17 '22 at 07:36
  • The font size on the web page is in pixels, but as you can see in the edit, the screenshot from the web doesn't have the same width. I can't quite tell whether the problem lies in the spacing between words, but I don't think that's it. The browser I'm using is Chrome, so I don't think that's the problem either; I tested on Firefox with the same result. – CarlosCRO Feb 17 '22 at 07:52
  • Your screenshot has a width of 653px. You could save the PIL image to see if it uses the correct font. Maybe the local `Consolas` font is causing the problem. – furas Feb 17 '22 at 12:30
  • I edited the post again; I can't see a real difference between the two fonts – CarlosCRO Feb 17 '22 at 14:10
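
The arithmetic in the comments above can be reproduced without any font library. This is a minimal sketch, assuming the font is strictly monospaced so that the width is just the character count times a fixed advance. The function name `monospace_width` is made up for illustration, and the 9 px advance is the value acw1668 reported from `txt.measure('W')`:

```python
def monospace_width(text, advance_px):
    """Width of text in a monospaced font: one fixed advance per character."""
    return len(text) * advance_px

text = 'If you have installed Selenium Python bindings, you can start using it'

tk_width = monospace_width(text, 9)   # whole-pixel advance reported by Tk
needed_advance = 654 / len(text)      # advance that the width seen on the web would imply

print(len(text), tk_width, round(needed_advance, 3))  # 70 630 9.343
```

If that reading is right, the 24 px gap would come from roughly a third of a pixel per character that a whole-pixel advance drops, rather than from letter-spacing.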

1 Answer

On Linux your code gives me 700, but I don't see Consolas in the list returned by tkFont.families(), so Tk is probably falling back to a different font.


But I get your expected result with pillow.ImageFont (it needs the full path to Consolas, though):

from PIL import ImageFont

text = 'If you have installed Selenium Python bindings, you can start using it'

font = ImageFont.truetype('/full/path/to/Consolas.ttf', 17)
print('getbbox  :', font.getbbox(text))
print('getlength:', font.getlength(text))
print('getsize  :', font.getsize(text))  # getsize() was removed in Pillow 10; older versions only

Result:

getbbox  : (0, 1, 654, 16)    # (left, top, right, bottom)
getlength: 654.0625
getsize  : (654, 16)

Doc: ImageFont


EDIT:

You can use pillow to generate an image with the text and compare it with the text in the screenshot.

from PIL import Image, ImageFont, ImageDraw

data = [
    ('If you have installed Selenium Python bindings, you can start using it', 17),
    ('This sample analysis model is expected to be contributing to students', 17),
    ('conduct academic research and studies, for teachers of English while selecting', 17),
]    
     
for number, (text, size) in enumerate(data, 1):

    font = ImageFont.truetype('/full/path/to/Consolas.ttf', size)

    print('text     :', text)
    print('font size:', size)    
    print('getbbox  :', font.getbbox(text))
    print('getlength:', font.getlength(text))
    print('getsize  :', font.getsize(text))
    print('----')

    image = Image.new('RGB', font.getsize(text))
    draw = ImageDraw.Draw(image)
    draw.text((0,0), text, font=font)
    image.save(f'text-{number}-size-{size}.png')

Result:

text     : If you have installed Selenium Python bindings, you can start using it
font size: 17
getbbox  : (0, 1, 654, 16)
getlength: 654.0625
getsize  : (654, 16)
----
text     : This sample analysis model is expected to be contributing to students
font size: 17
getbbox  : (0, 1, 645, 16)
getlength: 644.71875
getsize  : (645, 16)
----
text     : conduct academic research and studies, for teachers of English while selecting
font size: 17
getbbox  : (0, 1, 729, 16)
getlength: 728.8125
getsize  : (729, 16)
----
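
The three `getlength` values above are consistent with a single fixed advance per character. This is a quick check that uses only the numbers printed in the results, so it needs no font file; the variable names are illustrative:

```python
# getlength values copied from the results above, keyed by the measured text
results = {
    'If you have installed Selenium Python bindings, you can start using it': 654.0625,
    'This sample analysis model is expected to be contributing to students': 644.71875,
    'conduct academic research and studies, for teachers of English while selecting': 728.8125,
}

# Divide each measured length by its character count to get the per-character advance
advances = [length / len(text) for text, length in results.items()]

for (text, length), advance in zip(results.items(), advances):
    print(len(text), length, advance)
```

Every line reports an advance of 9.34375 px, whereas the 630 from the question corresponds to a whole-pixel advance of 9 px over 70 characters. That difference alone would account for 630 versus 654 on the first sentence, assuming both measurements use the same font file.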

[Images: text-1-size-17.png, text-2-size-17.png, text-3-size-17.png]

furas
  • Even with that, I get the correct result for that particular sentence, but when I use, for example, the text "This sample analysis model is expected to be contributing to students", the output is 483, while on the web it occupies 456. – CarlosCRO Feb 16 '22 at 22:06
  • Another example: the sentence 'conduct academic research and studies, for teachers of English while selecting' gives me 15px more than it should – CarlosCRO Feb 16 '22 at 22:08
  • What font size do you use? The original may have extra margins on both sides and you may be measuring the image without these margins. You can use `pillow` to generate an image with the text to see whether you are comparing exactly the same image. – furas Feb 17 '22 at 01:14
  • You could add images of the strings to the question or, even better, links to pages with this text, so we could check what settings it has. Maybe it has some CSS settings which change the size. – furas Feb 17 '22 at 01:24
  • I edited the question with some other tests I made on this problem – CarlosCRO Feb 17 '22 at 07:09
  • I'm still doing more tests and I can't reproduce your results, even with the same code and after downloading the font's ttf file. I'm starting to think it could be because I'm on Windows. – CarlosCRO Feb 17 '22 at 08:30
  • Did you find any solution? – Arnav Mehta Sep 23 '22 at 07:36
  • @ArnavMehta I use Linux and it works for me. I can't check it on Windows. You may use `@CarlosCRO` to write to the OP (`Original Poster`) – furas Sep 23 '22 at 10:19
  • Actually my issue is that, for different fonts, if I set size = 50 and draw the text on an image, I'm not really getting 50px as the size – Arnav Mehta Sep 23 '22 at 10:22
  • @ArnavMehta What do you get? It doesn't have to be exactly 50px; it may be a little more, because 50px may cover an uppercase letter like `M`, but together with descenders such as `g`, `y`, `j` (like `Mg`) the text may be taller. Besides, I'm not sure whether it uses `px` or `pt` (points), which can mean points on paper (in a printer), similar to the setting in other programs like Word or Excel. – furas Sep 23 '22 at 10:46
  • @ArnavMehta Another problem can be that the font size may depend on the screen resolution in `dpi` (`dots per inch`), and different systems may use different values. See [python - Why does the calculated width and height in pixel of a string in Tkinter differ between platforms? - Stack Overflow](https://stackoverflow.com/questions/2922295/why-does-the-calculated-width-and-height-in-pixel-of-a-string-in-tkinter-differ) – furas Sep 23 '22 at 10:53
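
The px-versus-pt point in the comments above matters for Tk as well: a positive `size` in `tkinter.font.Font` is in points, and a negative one is in pixels. This is a small sketch of the usual conversion, assuming 72 points per inch; `points_to_pixels` is a hypothetical helper, and 96 dpi is only an example value that may not match a given system:

```python
def points_to_pixels(points, dpi=96):
    """Convert a font size in points to pixels, assuming 72 points per inch."""
    return points * dpi / 72

print(points_to_pixels(17))       # size 17 read as points on an assumed 96 dpi screen
print(points_to_pixels(17, 72))   # at 72 dpi, points and pixels coincide
```

So a size of 17 read as points would render noticeably larger than 17 pixels on a 96 dpi display, which is one way the same nominal size can give different widths on different systems.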