
What type of codec is best for encoding binary files that are uploaded to an app written in Python 3?

hidura
  • 2
    What sort of files? Why do they need to be encoded? – Mark Byers Oct 09 '10 at 18:25
  • Any. I am testing with an image, and I used 'raw_unicode_escape' because utf-8 doubles the size of the file. With 'raw_unicode_escape' the file keeps its original size, but I get this when I open it: (Quantization table 0x01 was not defined) – hidura Oct 09 '10 at 18:29
  • 3
    In Python 3 images should be nowhere near the string/unicode encoding. They should be uploaded, processed and saved as bytes. Why are you having to specify `raw_unicode_escape`? What is the type of the image when it was uploaded? – Muhammad Alkarouri Oct 09 '10 at 18:54
  • A normal image, but I can't use utf-8 because it doubles the size of the file, and when I try to open the result I get this error: Error interpreting JPEG image file (Not a JPEG file: starts with 0x3f 0x3f). So I tried another encoding; this one gives the closest size, but it produces the error that I posted before. – hidura Oct 09 '10 at 21:07
  • This is part of the data that I want to save: `ÿØÿà\x00\x10JFIF\x00\x01\x01\x00\x00\x01\x00\x01\x00\x00ÿÛ\x00C\x00\x08\x06\x06\x07\x06\x05\x08\x07\x07\x07\t\t\x08\n\x0c\x14\r\x0c\x0b\x0b\x0c\x19\x12\x13\x0f\x14\x1d\x1a\x1f\x1e\x1d\x1a\x1c\x1c $.\' ",#\x1c\x1c(7),01444\x1f\'9=82<.342ÿÛ\x00C\x01\t\t\t\x0c\x0b\x0c` and this is the error it gives me: Error interpreting JPEG image file (Not a JPEG file: starts with 0x22 0x22). Any ideas? – hidura Oct 10 '10 at 04:49
  • What do you mean by *upload*? Is it a web application, and are you uploading a file using a web form? – mykhal Oct 10 '10 at 23:41

1 Answer


I know it's a late answer, and the question isn't entirely clear, but if you need to encode a sequence of bytes to be saved somewhere, and for whatever reason you can't simply save the binary data, you should encode it using the base64 module. For example, say you have a long sequence of bytes (in this case, generated with os.urandom(100)):

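import base64
import os

# 100 random bytes stand in for the contents of a binary file
some_bytes = os.urandom(100)
print(repr(some_bytes))

# Encode to Base64 (bytes), then decode those ASCII bytes into a str
some_bytes_base64 = base64.standard_b64encode(some_bytes).decode()
print(some_bytes_base64)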

This prints something like the following:

b'\xd3\xa3\x0eT/L\xc6\x88\x98\xd0i\xbaa\xb2\x18\x1bx\xde\xf6Zq\xbe\xd9\xb4D\x19\x14\x91Z}wg?\x90\xd7f\xca\x1bxX\xa8\x99\xe6\xbb\xe0\x80\xa3\xc4\xb8\x1f\xfcp\xbc\x8d:/\xcfk\x01\xee\xc1\xde\xdc\xc3\xfa\x0e|7\x8eYt\xd1\x0b\x8a\x89\x9c^\xcf\xbc\\\x00_\x89dyb\x13\xa8\xdb\xba\xe3\x85X\x8d\x1a\x8bz\xe5\xfb\r'

06MOVC9MxoiY0Gm6YbIYG3je9lpxvtm0RBkUkVp9d2c/kNdmyht4WKiZ5rvggKPEuB/8cLyNOi/PawHuwd7cw/oOfDeOWXTRC4qJnF7PvFwAX4lkeWITqNu644VYjRqLeuX7DQ==

As you can see, the bytes have been properly encoded. The decode() method converts the bytes object that base64.standard_b64encode returns into an ordinary string (str). This answer describes byte literals well, in my opinion.
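To make the types concrete, here is a small sketch using the first four bytes of a JPEG header as input. The variable names are only illustrative; the point is that the encoder returns bytes and that calling decode() on its result yields a str.

```python
import base64

# First four bytes of a typical JPEG/JFIF file, used here as sample input
raw = bytes([0xFF, 0xD8, 0xFF, 0xE0])

encoded_bytes = base64.standard_b64encode(raw)  # a bytes object
encoded_text = encoded_bytes.decode("ascii")    # a str, safe to store as text

print(type(encoded_bytes).__name__)  # bytes
print(type(encoded_text).__name__)   # str
print(encoded_text)                  # /9j/4A==
```

Base64 output contains only ASCII characters, so decoding it with "ascii" cannot fail.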

This should allow you to transmit a file, sequence of bytes, etc. without problems. (After all, isn't that what Base64 encoding is for?)
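As a rough sketch of the full round trip, the example below reads a file in binary mode, encodes it to Base64 text, decodes the text, and writes the bytes back out. The helper names `encode_file` and `decode_to_file`, the file names, and the use of a temporary directory are assumptions made for illustration; they are not part of the question's application.

```python
import base64
import os
import tempfile


def encode_file(path):
    """Read a file in binary mode and return its contents as Base64 text."""
    with open(path, "rb") as f:
        return base64.standard_b64encode(f.read()).decode("ascii")


def decode_to_file(text, path):
    """Decode Base64 text and write the resulting bytes in binary mode."""
    with open(path, "wb") as f:
        f.write(base64.standard_b64decode(text))


# Random bytes stand in for an uploaded image
original = os.urandom(300)

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "upload.bin")      # hypothetical file names
    dst = os.path.join(tmp, "restored.bin")
    with open(src, "wb") as f:
        f.write(original)

    text = encode_file(src)
    decode_to_file(text, dst)

    with open(dst, "rb") as f:
        restored = f.read()

print(restored == original)        # True
print(len(original), len(text))    # 300 400
```

Note that Base64 text is about one third larger than the original data. If the bytes can be stored directly, opening the file with "rb" and "wb" and skipping any text encoding avoids that overhead.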

Ricardo Altamirano