I'm having difficulty with the following code (which is simplified from a larger application I'm working on in Python).

from io import StringIO
import gzip

jsonString = 'JSON encoded string here created by a previous process in the application'

out = StringIO()
with gzip.GzipFile(fileobj=out, mode="w") as f:
    f.write(str.encode(jsonString))

# Write the file once finished rather than streaming it - uncomment the next line to see file locally.
with open("out_" + currenttimestamp + ".json.gz", "a", encoding="utf-8") as f:
    f.write(out.getvalue())

When this runs I get the following error:

File "d:\Development\AWS\TwitterCompetitionsStreaming.py", line 61, in on_status
    with gzip.GzipFile(fileobj=out, mode="w") as f:
  File "C:\Python38\lib\gzip.py", line 204, in __init__
    self._write_gzip_header(compresslevel)
  File "C:\Python38\lib\gzip.py", line 232, in _write_gzip_header
    self.fileobj.write(b'\037\213')             # magic header
TypeError: string argument expected, got 'bytes'

PS ignore the rubbish indenting here...I know it doesn't look right.

What I want to do is create a JSON file and gzip it in memory before saving the gzipped file to the filesystem (Windows). I know I've gone about this the wrong way and could do with a pointer. Many thanks in advance.

Arty
Mat Richardson
  • Almost duplicate of [Python 3 In Memory Zipfile Error. string argument expected, got 'bytes' - Stack Overflow](https://stackoverflow.com/questions/32075135/python-3-in-memory-zipfile-error-string-argument-expected-got-bytes) (`ZipFile` versus `GzipFile`) -- but the answer is the same. Understand the distinction between str and bytes in Python 3 clearly, then use the correct one. – user202729 Jan 29 '21 at 16:15
  • There are also [python 3.x - TypeError: string argument expected, got 'bytes' - Stack Overflow](https://stackoverflow.com/questions/62903409/typeerror-string-argument-expected-got-bytes) and [python - string argument expected, got 'bytes' in buffer.write - Stack Overflow](https://stackoverflow.com/questions/50797043/string-argument-expected-got-bytes-in-buffer-write) – user202729 Jan 29 '21 at 16:17

1 Answer

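The gzip module works with bytes, not text, so you have to use bytes throughout. First, use BytesIO instead of StringIO: as the traceback shows, GzipFile writes its binary header to fileobj when it is constructed, and a StringIO buffer only accepts str. Second, open the output file in binary mode: 'wb' instead of 'w' (likewise 'ab' instead of 'a' when appending), and drop the encoding argument, which is only valid in text mode. The 'b' stands for "binary". For GzipFile itself, 'w' and 'wb' are equivalent because it always writes bytes, but 'wb' makes the intent explicit. Full corrected code below: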
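
Before the full listing, here is a minimal sketch of the str/bytes distinction on its own. It only assumes the standard io module; the variable names are made up for illustration. It writes the same two gzip magic bytes seen in the traceback to each kind of buffer:

```python
from io import BytesIO, StringIO

# A text buffer only accepts str, so writing bytes fails.
text_buffer = StringIO()
try:
    text_buffer.write(b"\037\213")  # the two magic bytes from the traceback
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print("StringIO:", error_message)

# A binary buffer accepts bytes and returns the number of bytes written.
byte_buffer = BytesIO()
written = byte_buffer.write(b"\037\213")
print("BytesIO wrote", written, "bytes")
```

The same rule applies to anything GzipFile writes into, which is why the buffer type has to change and not just the mode string.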

from io import BytesIO
import gzip

jsonString = 'JSON encoded string here created by a previous process in the application'

out = BytesIO()
with gzip.GzipFile(fileobj=out, mode='wb') as f:
    f.write(jsonString.encode('utf-8'))  # gzip needs bytes, so encode the str first

currenttimestamp = '2021-01-29'

# Write the compressed bytes to disk in one go once compression has finished, rather than streaming them.
with open("out_" + currenttimestamp + ".json.gz", "wb") as f:
    f.write(out.getvalue())
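
If the whole string is already in memory, the BytesIO buffer may not be needed at all. The sketch below is an alternative, not part of the original fix: it uses gzip.compress from the standard library, then decompresses again to check the round trip. The sample JSON, the file name and the temporary directory are placeholders chosen for illustration.

```python
import gzip
import json
import os
import tempfile

# Hypothetical stand-in for the JSON string built earlier in the application.
json_string = json.dumps({"id": 1, "text": "example"})

# Compress in memory in one call; the input must be bytes, not str.
compressed = gzip.compress(json_string.encode("utf-8"))

# Round-trip check: decompressing should give back the original text.
restored = gzip.decompress(compressed).decode("utf-8")

# Write the compressed bytes in binary mode (temporary directory used here
# only so the sketch cleans up after itself).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "out_example.json.gz")
    with open(path, "wb") as f:
        f.write(compressed)

    # gzip.open in text mode ("rt") decompresses and decodes on read.
    with gzip.open(path, "rt", encoding="utf-8") as f:
        read_back = f.read()

print(restored == json_string, read_back == json_string)
```

The GzipFile-plus-BytesIO version above is still the better fit when the data is produced in several chunks, since it can be written to incrementally.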
Arty