
I am downloading files in a cloud environment, where each file is handed to me as an object of type bytes.

The format of the bytes object conflicts with my processing, so I would like it to match what Python returns from this basic call:

f = open(r"file.log", "r").readlines()

Is there a way to massage a bytes object so that it looks and behaves like f above?

Edit:

The bytes object is NOT saved to disk. It is held in memory. It looks like this:

type(downloaded_bytes) # bytes

This fails:

f = open(downloaded_bytes, "rb").readlines()

Edit 2:

These are not equivalent:

f = io.BytesIO(downloaded_bytes).readlines()
f2 = open(r"file.log", "r").readlines()
f == f2  # False
John Stud
  • use "rb" to read a file as bytes: `open("file.log", "rb")` – Blackgaurd Mar 15 '22 at 20:20
  • @Blackgaurd, the OP's data is coming from a `bytes` object that's already in-memory, not from a file. (They've edited to make that even more explicit, but I thought it was clear even in the original version of the question). – Charles Duffy Mar 15 '22 at 20:26
  • Do you know the encoding of `downloaded_bytes`? Otherwise you can't reliably convert it to text (strings). – wjandrea Mar 15 '22 at 20:35

1 Answer


This is what io.BytesIO is for:

import io

f = io.BytesIO(downloaded_bytes).readlines()
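As a rough illustration of what that call returns, here is a small sketch; the `downloaded_bytes` value is a made-up stand-in for the real download. Each element comes back as a `bytes` line, the same as from a file opened with 'rb':

```python
import io

# Hypothetical stand-in for the in-memory download.
downloaded_bytes = b"first line\nsecond line\n"

lines = io.BytesIO(downloaded_bytes).readlines()
print(lines)  # [b'first line\n', b'second line\n']
```

Because these are `bytes` rather than `str`, they will never compare equal to lines read from a file opened with 'r'.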

Alternatively, since your file example opens with 'r' rather than 'rb', you probably want text lines rather than bytes lines, which would look something like:

import io

f = io.StringIO(downloaded_bytes.decode('utf-8')).readlines()

...presuming that UTF-8 is indeed the correct encoding.
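If the goal is to match `open(path, "r").readlines()` as closely as possible, one more detail may matter: text-mode `open` translates `\r\n` line endings to `\n` by default, while `io.StringIO` with its default settings leaves them in place. A minimal sketch that keeps that translation, assuming UTF-8 again, wraps the `BytesIO` in an `io.TextIOWrapper`. The helper name `bytes_to_text_lines` and the sample payload are made up for illustration:

```python
import io


def bytes_to_text_lines(data, encoding="utf-8"):
    """Decode in-memory bytes and split them into text lines."""
    # TextIOWrapper defaults to newline=None (universal newlines),
    # so "\r\n" and "\r" are translated to "\n" while reading.
    with io.TextIOWrapper(io.BytesIO(data), encoding=encoding) as stream:
        return stream.readlines()


# Hypothetical payload with mixed line endings.
sample = b"first line\r\nsecond line\n"
print(bytes_to_text_lines(sample))  # ['first line\n', 'second line\n']
```

Passing the encoding explicitly also avoids depending on the platform default, which a bare `open(path, "r")` would use.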

Charles Duffy
  • The files are not equivalent, however. – John Stud Mar 15 '22 at 20:28
  • Right, a BytesIO is equivalent to opening with `'rb'` instead of just `'r'`. For equivalence to just `'r'` you need to decode the bytes into a `str` and use an `io.StringIO()`. – Charles Duffy Mar 15 '22 at 20:30
  • @JohnStud, err. Your "edit 2" came after the first edit to this question, which added a StringIO-based alternative. (Also, there's no reason to put edit markers in your question: The history is available for everyone to see, and it's more important to make the question easy-to-read for people who are seeing it for the first time than to make the text include a duplication of publicly-readable metadata for the benefit of people who already saw an older version: More people are going to be seeing the question for the first time in the future than have seen it in the past). – Charles Duffy Mar 15 '22 at 20:40
  • @JohnStud yes, they are. – juanpa.arrivillaga Mar 15 '22 at 20:53
  • @juanpa.arrivillaga, ...they're not equivalent in the code quoted in the question using `BytesIO`, but then, that's why this answer also demonstrates `StringIO` (which the OP seems to be ignoring for some reason). – Charles Duffy Mar 15 '22 at 21:18
  • @CharlesDuffy I understand that – juanpa.arrivillaga Mar 15 '22 at 21:42