Convert string of base64 back to base64 bytes

Question

I've loaded an image using OpenCV and then encoded it with Base64 using base64's b64encode.

>>> import cv2
>>> import base64
>>> image = cv2.cvtColor(cv2.imread("some_image.jpg"), cv2.COLOR_BGR2RGB)
>>> image_64 = base64.b64encode(image)
>>> image_64
b'//////////////////...
>>> type(image_64)
<class 'bytes'>

Then I convert it into a string using the str() method. This creates a string of the encoded image.

>>> image_64str = str(image_64)
>>> image_64str
b'//////////////////...
>>> type(image_64str)
<class 'str'>

Both of them (the <class 'bytes'> value and the <class 'str'> value) look similar. I attempted to decode them using base64's b64decode and the decode() method. However, an error occurred when I decoded image_64str.

>>> image_64str.decode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'decode'
>>> base64.b64decode(image_64str)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/base64.py", line 87, in b64decode
    return binascii.a2b_base64(s)
binascii.Error: Incorrect padding

I fully understand what the errors are trying to tell me. But my question is, how can I convert the string of the encoded image (image_64str) back to bytes?

I've tried to use base64's `b64encode` again on the string. However, it returns an error.

>>> str_to_b64 = base64.b64encode(image_64str)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/base64.py", line 58, in b64encode
    encoded = binascii.b2a_base64(s, newline=False)
TypeError: a bytes-like object is required, not 'str'

Please do tell me if anybody notices what I am missing. I am using Python 3.6. Thanks in advance.

EDIT: Adding more description to my question.

I was able to enable AWS API Gateway binary support. My purpose is to pass an image as binary data through a POST request to the API and convert it to a PIL object so that I can process it in the backend using AWS Lambda. With API Gateway, the binary data was Base64-encoded.

I opened the images as binary data using Python's open function (there were two images that I wanted to pass through the API). Then I used a dictionary to hold the binary data of both images, like

data = {"data1": img_binary_data_1, "data2": img_binary_data_2}

I sent the POST request using Python's requests library. One of the arguments that I can pass to the post function is data, so I passed the image data using that.

I was able to send the request. In the Lambda backend, I wanted to convert the binary data to a PIL object for further processing. However, it seems that the data was packed into JSON format and the Base64-encoded binary image had been turned into a Python string. I confirmed this by printing the data in the AWS CloudWatch log.

I tried to use .decode(), but a Python 3 str has no decode() method.

I was able to decode the string using b64decode(), which returned a bytes object. However, when I tried to convert it to a PIL object like

img = imread(io.BytesIO(base64.b64decode(b64_string)))

I received an error saying

OSError: cannot identify image file <_io.BytesIO object at 0x1101dadb0>

I tried some of the solutions from this link, but apparently you cannot do this with a bytes object.

I have tried to use PIL.frombuffer and PIL.frombytes. However, they returned a "not enough data" error, even though I am very sure about the size of the image (in this case (256, 256)).

So my question is, how can I convert the Base64 image into a PIL object? I hope this helps to understand my question better. Thanks in advance.

`image_64str.encode()` should return a byte representation correct? — Lane Terry, Jun 24 '18 at 22:51
@LaneTerry Running `image_64str.encode()` encodes the string of the encoded image. If I try to decode it, it returns the string of the encoded image, not the encoded image itself. — Imperator123, Jun 25 '18 at 06:06
@IgnacioVazquez-Abrams I am trying to pass it to AWS API Gateway. I am still trying to figure out AWS binary support, so I thought I might be able to pass the binary data as a string using URL query string parameters. — Imperator123, Jun 25 '18 at 06:11
@Imperator123 You aren't calling .encode() on your string in your example. Apologies for my English being unclear, but I meant to say that the `.encode()` method WILL return a byte representation of the string. Given your response above, however, could you not pass the data through API Gateway as an octet stream and handle it on your implementation behind API Gateway? — Lane Terry, Jun 25 '18 at 15:55

score 0 · Answer 1 · answered Jun 26 '18 at 02:03

Base64 is a binary-to-text encoding, so encoding an image makes sense: you get ASCII bytes in which each character stands for a group of 6 bits of the input.

Even though those bytes contain only ASCII characters, they are still a bytes object, not a Python str (a str is Unicode text).

Calling str() on a bytes object does not decode it. It returns the object's repr, so the resulting string literally starts with b' and ends with '. Those extra characters are not part of the Base64 data.

That is why you get the error when you decode it: the string is no longer clean Base64, and the decoder typically reports incorrect padding. You also cannot b64encode the string, because b64encode expects a bytes-like object and a str is not bytes.
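
A small standard-library sketch of what happens when Base64 bytes go through str() versus decode(). The `raw` buffer is a made-up stand-in for the image data (an assumption, not the asker's actual pixels), and the failure shown is for this particular input:

```python
import base64
import binascii

# Hypothetical stand-in for the image buffer.
# 768 bytes is a multiple of 3, so the Base64 output has no '=' padding.
raw = bytes(range(256)) * 3

encoded = base64.b64encode(raw)        # bytes containing only ASCII characters
as_repr = str(encoded)                 # the repr of the bytes object: "b'...'"
as_text = encoded.decode("ascii")      # the Base64 characters as real text

# The repr carries a b' prefix and a trailing quote that are not Base64 data.
repr_has_wrapper = as_repr.startswith("b'") and as_repr.endswith("'")

# Decoding the repr fails for this input because of the extra characters.
try:
    base64.b64decode(as_repr)
    repr_decodes = True
except binascii.Error:
    repr_decodes = False

# Decoding the properly decoded text gives the original bytes back.
round_trip = base64.b64decode(as_text)
```

The takeaway is to use `.decode()` rather than `str()` when the Base64 data has to travel as text.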

Why are you converting the encoded bytes to a string anyway? A little more description of your use case would help.

Dave · Answer 2 · 2023-03-02T00:21:11.390

0

Following the small demo found here, if you call decode() to change from bytes to str instead of casting with str(), you can then properly encode the string back into bytes.

>>> image_64str = image_64.decode()
>>> image_64str
'//////////////////...'
>>> type(image_64str)
<class 'str'>
>>> image_64_2 = image_64str.encode()
>>> image_2 = base64.b64decode(image_64_2)
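
If you are already stuck with a string that was produced by str() (one that starts with b'), one possible way back is ast.literal_eval, which parses the bytes literal without executing code. This is only a sketch: `recover_b64_bytes` is a hypothetical helper name, and `payload` is made-up sample data rather than a real image.

```python
import ast
import base64


def recover_b64_bytes(text: str) -> bytes:
    """Return Base64 bytes from a clean Base64 string or from a str(bytes) repr."""
    # Quote characters are not in the Base64 alphabet, so a leading b' or b"
    # can only come from the repr of a bytes object.
    if text.startswith(("b'", 'b"')):
        value = ast.literal_eval(text)   # safely parses the bytes literal
        if not isinstance(value, bytes):
            raise ValueError("expected a bytes literal")
        return value
    return text.encode("ascii")


# Hypothetical sample data standing in for the image bytes.
payload = b"\x00\x01\x02 raw image bytes stand-in \xff"
image_64 = base64.b64encode(payload)

from_repr = recover_b64_bytes(str(image_64))              # the "b'...'" case
from_text = recover_b64_bytes(image_64.decode("ascii"))   # the clean case
decoded = base64.b64decode(from_repr)
```

Note that if the original bytes were raw pixel data (as with the OpenCV array in the question) rather than an encoded JPEG or PNG file, the decoded bytes still carry no shape or file header, so the image dimensions have to be supplied separately.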

edited Mar 02 '23 at 00:21

answered Mar 02 '23 at 00:20

Dave