I am trying to read the data from a PNG image and then prepend the image length to it, padded with spaces to a fixed width given by my HEADER variable. However, once I do that, the length of the data increases drastically, and I don't understand why. Can someone explain what is happening? I must be missing something, since I am still fairly new to this field.

import os

HEADER = 10
PATH = os.path.abspath("penguin.png")
print(PATH)
with open(PATH, "rb") as f:
    imgbin = f.read()
    print(len(imgbin))
    imgbin = f"{len(imgbin):<{HEADER}}" + str(imgbin)
    print(len(imgbin))

When I first print the length of the data I get 163287, and on the second print I get 463797.

Glooc

1 Answer

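This happens because str(imgbin) does not decode the bytes. It returns the printable representation of the bytes object (the b'...' literal), in which every non-printable byte is written out as an escape sequence such as \x89. Each of those bytes takes up to four characters, so the resulting string is much longer than the original data: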

len(imgbin), len(str(imgbin))
>>> (189255, 545639)

(Note that I used a different image, so the numbers differ.) You can fix this by encoding the header to bytes and prepending it to the image data, like so:

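with open(PATH, "rb") as f:
    imgbin = f.read()
    print(len(imgbin))  # size of the raw image data
    # Encode the header to bytes so it can be concatenated with the image bytes
    imgbin = f"{len(imgbin):<{HEADER}}".encode('utf-8') + imgbin
    print(len(imgbin))  # raw size plus HEADER bytes
>>> 189245
>>> 189255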
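If the header is meant to be read back later (for example on the receiving end of a socket, which is an assumption on my part), a minimal sketch of the round trip could look like the following. The helper names add_header and split_header are made up for illustration, and the payload is a small stand-in rather than a real image.

```python
HEADER = 10  # header width in bytes, as in the question


def add_header(payload: bytes) -> bytes:
    """Prefix payload with its length, left-aligned and space-padded to HEADER bytes."""
    header = f"{len(payload):<{HEADER}}".encode("utf-8")
    return header + payload


def split_header(message: bytes) -> bytes:
    """Read the length from the first HEADER bytes and return that many payload bytes."""
    length = int(message[:HEADER].decode("utf-8"))  # int() ignores the padding spaces
    return message[HEADER:HEADER + length]


# Stand-in for the image data: the 8-byte PNG signature plus some filler bytes
payload = b"\x89PNG\r\n\x1a\n" + bytes(range(256))
message = add_header(payload)

print(len(payload), len(message))        # the lengths differ by exactly HEADER
print(split_header(message) == payload)  # the payload survives unchanged
```

Because everything stays as bytes, the total length only ever grows by HEADER, and the payload comes back byte for byte.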

You can find out more about binary strings here.

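For reference, the contents of a PNG file are just a sequence of bytes, each an unsigned 8-bit value (i.e. 0-255), which is exactly what a Python bytes object holds. The header can safely be encoded as UTF-8 because it contains only ASCII digits and spaces, which encode to one byte each, so it always occupies exactly HEADER bytes. The image data itself is not UTF-8 text and should never be converted to a str. If you want to work with the values numerically, it might be worth using something like numpy, which has uint8 as a data type.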
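To make the size difference concrete, here is a small sketch that uses the 8-byte signature found at the start of every PNG file instead of a whole image. It shows the byte values as integers and how str() expands them into escape sequences.

```python
data = b"\x89PNG\r\n\x1a\n"  # the 8-byte signature that starts every PNG file

# Iterating over a bytes object yields integers in the range 0-255
print(list(data))  # [137, 80, 78, 71, 13, 10, 26, 10]
print(len(data))   # 8

# str() gives the b'...' literal: non-printable bytes become escapes like \x89
text = str(data)
print(text)        # b'\x89PNG\r\n\x1a\n'
print(len(text))   # 20
```

Eight bytes turn into twenty characters here. The same expansion, applied across a whole image, accounts for the jump in length seen in the question.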

Richard Boyne