I was reading about Bytes libraries and Pandas and I think the answers in the following post should work for me:

How to convert bytes data into a python pandas dataframe?

However, neither of them works.

I have a small CSV file. When I upload it to Odoo, the file comes back as a bytes object. I want to read this bytes object with pandas and convert it into a dataframe so that I can use dataframe methods.

The Bytes object comes in the attribute file_to_import of my class (that is why you'll see self.file_to_import in my code). If I show its type it returns <class 'bytes'>. If I decode it, I get its respective string:

ZGVmYXVsdF9jb2RlO25hbWU7ZGVzY3JpcHRpb25fc2FsZTtjYXRlZ29yeV9pZC9pZDtzdGFuZGFyZF9wcmljZTtsaXN0X3ByaWNlOzs7dHlwZTtiYXJjb2RlO3NlbGxlcl9pZHMvbmFtZS9pZDtzZWxsZXJfaWRzL3Byb2R1Y3RfbmFtZTtzZWxsZXJfaWRzL3Byb2R1Y3RfY29kZQpXNS5GLTA2NjY2ODtOZXN0YSBDaHJvbWUgNjA7TmVzdGEgQ2hyb21lIDYwLiBDYWxkZXJhIGRlIGNvbmRlbnNhY2nDs24gbXVyYWwgZGUgZ2FzIGRlIDYwIGtXLCBjb24gcXVlbWFkb3IgZGUgcHJlbWV6Y2xhIGUgaW50ZXJjYW1iaWFkb3IgcGlyb3R1YnVsYXIgYXV0b2xpbXBpYWJsZSBkZSBhY2VybyBpbm94aWRhYmxlLCByYXRpbyBkZSBtb2R1bGFjacOzbiAxMDoxMDA7Q2FsZGVyYXMgeSBjYWxlbnRhZG9yZXMgZGUgYWd1YTsxMTI5OzM2NTA7U0k7U0k7cHJvZHVjdDs7QUlDIFNBOzYwIGtXIFdhbGwgaHVuZyBib2lsZXIgaW4gY2FydG9uIGJveCB3aXRoOiBib2lsZXIgQW5jbGFqZSBwYXJlZCwgcGxhc3RpYyBzaXBob24gd2l0aCBnYXNrZXQgYW5kIGNsaXAsIHVzZXIncyBtYW51YWwgaW4gRW5nbGlzaCBsYW5ndWFnZTtXNS5GLTA2NjY2OApXMS5GLTA2NjY2OTtTYWZhcmkgMTAwMDtTYWZhcmkgMTAwMCBwbHVzIDI4OTM0MDE7Q2FsZGVyYXMgeSBjYWxlbnRhZG9yZXMgZGUgYWd1YTsxMTI5OzM2NTA7U0k7U0k7cHJvZHVjdDs7QUlDIFNBOzYwIGtXIFdhbGwgaHVuZyBib2lsZXIgaW4gY2FydG9uIGJveCB3aXRoOiBib2lsZXIgQW5jbGFqZSBwYXJlZCwgcGxhc3RpYyBzaXBob24gd2l0aCBnYXNrZXQgYW5kIGNsaXAsIHVzZXIncyBtYW51YWwgaW4gRW5nbGlzaCBsYW5ndWFnZTtXMS5GLTA2NjY2OQo7OztTdXN0aXR1aXIgQ2FsZGVyYXMgeSBjYWxlbnRhZG9yZXMgZGUgYWd1YSBwb3IgY8OzZGlnbyByYXJvO8K/UHJlY2lvIGRlIGNvbXByYSBkZSB0b2RvcyBsb3MgcHJvdmVlZG9yZXMgbyBzw7NsbyBkZSBlc3RlIHByb3ZlZWRvciBjb25jcmV0bz87O8K/Pzs7OztTdXN0aXR1aXIgQUlDIFNBIHBvciBzdSBjw7NkaWdvIHJhcm87Owo=

It looks OK, so this should be enough:

from io import BytesIO
import pandas as pd

df = pd.read_csv(BytesIO(self.file_to_import))

However, df does not have any rows, and if I check df.empty, it returns True, so the dataframe does not have any info. If I check the size of the BytesIO object before trying to convert it into a dataframe, it returns 1376 bytes, which seems to be OK, since Dolphin shows a size of 1,0 KiB (1.031) for the file.

x = BytesIO(self.file_to_import)
_logger.critical(x.getbuffer().nbytes)
df = pd.read_csv(x)

Can anyone tell me why this is happening? Why is the dataframe empty?

forvas
  • It looks like a `base64` string. Maybe you need to decode it before using it: `base64.decodebytes(s)` – Corralien Apr 27 '21 at 15:51
  • @Corralien yes, that was the problem, I had not realized that. Thank you very much! Convert your comment into an answer so I can accept it. – forvas Apr 27 '21 at 15:58
  • Normally I would remove the odoo tags here, but Odoo converts files into base64, so I'm fine with the tags ;-) – CZoellner Apr 28 '21 at 08:54

1 Answer

Your string is base64 encoded. You need to decode it before using it:

import base64

s = b"ZGVmYXVsdF9jb2RlO2...Jhcm87Owo="
s = base64.decodebytes(s)
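
For completeness, here is a minimal sketch of the whole round trip, not necessarily how your Odoo model should be written. The sample CSV and the variable `file_to_import` are hypothetical stand-ins for `self.file_to_import`. The `sep=";"` argument is an assumption based on the decoded data in the question, whose header starts with `default_code;name;...`. The sketch also shows why the original frame was empty: a base64 payload with no line breaks is a single line, so pandas treats it as the header row and finds no data rows.

```python
import base64
from io import BytesIO

import pandas as pd

# Hypothetical stand-in for self.file_to_import: a small semicolon-separated
# CSV, base64-encoded the way the uploaded file arrives.
raw_csv = (
    "default_code;name;list_price\n"
    "W5.F-066668;Nesta Chrome 60;3650\n"
    "W1.F-066669;Safari 1000;3650\n"
)
file_to_import = base64.b64encode(raw_csv.encode("utf-8"))

# Without decoding, the payload is one long line, which pandas reads as the
# header row, so the resulting frame has no data rows.
undecoded = pd.read_csv(BytesIO(file_to_import))

# Decode first, then parse. sep=";" is assumed from the sample data.
decoded = base64.b64decode(file_to_import)
df = pd.read_csv(BytesIO(decoded), sep=";", encoding="utf-8")
```

If your real file uses a different delimiter or encoding, adjust `sep` and `encoding` accordingly.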
Corralien