
I'm having quite a lot of trouble trying to decompress gzipped data in Django. I've tried a number of the solutions proposed in "Download and decompress gzipped file in memory?", but I think I'm running into difficulty with how they interact with Django.

I'd like to be able to upload data.csv.gz and, if it is a gzip file, extract the compressed data into a Django File so that it can continue along its routine (saving to a FileField).

What I have so far in my serializer:

    def create(self, validated_data):
            file: File = validated_data.get("file")
            ext = file.name.split(".")[-1].lower()

            if ext == "gz":
                compressedFile = io.BytesIO()
                compressedFile.write(file.read())

                decompressed_fname = file.name[:-3]
                decompressedFile = gzip.GzipFile(fileobj=compressedFile)
                with open(decompressed_fname, "wb") as outfile:
                    outfile.write(decompressedFile.read())

                with open(decompressed_fname, "rb") as outfile:
                    file = File(outfile)
                    ext = decompressed_fname.split(".")[-1].lower()
...

When I do this, outfile is empty when I check its contents on disk, and later routines throw an error:

        f.seek(0)
    ValueError: seek of closed file

I get a similar error if I use shutil instead:

            if ext == "gz":
                compressedFile = io.BytesIO()
                compressedFile.write(file.read())

                decompressed_fname = file.name[:-3]
                import shutil
                shutil.copyfileobj(gzip.GzipFile(fileobj=file), open(decompressed_fname, "wb"))

                with open(decompressed_fname, "rb") as outfile:
                    file = File(outfile)
                    ext = decompressed_fname.split(".")[-1].lower()

The curl command I'm using:

    curl http://0.0.0.0:8000/upload/ -X 'POST' -H "Content-Encoding: gzip" -F "input_type=data" -F "file=@data.csv.gz"

Dennis

1 Answer


I got it working with this:

            if ext == "gz":
                compressed_file = file.open()

                decompressed_fname = file.name[:-3]
                decompressedFile = gzip.GzipFile(fileobj=compressed_file)

                f = io.BytesIO()
                f.write(decompressedFile.read())

                file = File(f, name=decompressed_fname)
                ext = decompressed_fname.split(".")[-1].lower()
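
The same idea can be sketched as a small helper that depends only on the standard library. The name `gunzip_to_buffer` and its signature are hypothetical, not part of Django or of the serializer above; the only assumption is that the upload is a seekable binary file-like object.

```python
import gzip
import io


def gunzip_to_buffer(fileobj, name):
    """Return (buffer, new_name) holding the decompressed contents of fileobj.

    Hypothetical helper: fileobj is any seekable binary file-like object,
    name is its original file name.
    """
    # Rewind first: an upload that has already been read sits at EOF.
    fileobj.seek(0)
    # GzipFile does not close a fileobj that was passed in by the caller.
    with gzip.GzipFile(fileobj=fileobj) as gz:
        data = gz.read()
    # A BytesIO built from bytes starts at position 0, ready to be read.
    buffer = io.BytesIO(data)
    new_name = name[:-3] if name.lower().endswith(".gz") else name
    return buffer, new_name


# Usage with a stand-in for the uploaded file.
upload = io.BytesIO(gzip.compress(b"a,b\n1,2\n"))
upload.read()  # simulate an earlier read that left the position at EOF
buffer, new_name = gunzip_to_buffer(upload, "data.csv.gz")
```

In the serializer, the result could then be wrapped as `File(buffer, name=new_name)`, as in the snippet above. Keeping the data in memory avoids the temporary file on disk, at the cost of holding the whole decompressed payload in RAM.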

The issue was that file is an InMemoryUploadedFile and has to be opened first. I don't fully understand why, though.
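
A plausible explanation, which can be checked with the standard library alone, is that two separate things went wrong in the earlier attempts. First, a stream that has just been written to (or already read) is positioned at its end, so `GzipFile` finds nothing to decompress; calling `open()` on the uploaded file most likely helps because it rewinds the stream. Second, a handle created inside a `with` block is closed when the block exits, which would account for `seek of closed file`. The sample payload and the variable names below are made up for illustration.

```python
import gzip
import io
import os
import tempfile

payload = gzip.compress(b"a,b\n1,2\n")

# Pitfall 1: after write(), the position is at the end of the buffer,
# so GzipFile sees an empty stream and returns no data.
compressed = io.BytesIO()
compressed.write(payload)
empty = gzip.GzipFile(fileobj=compressed).read()

# Rewinding the same buffer makes the data readable.
compressed.seek(0)
full = gzip.GzipFile(fileobj=compressed).read()

# Pitfall 2: a handle opened in a with block is closed when the block
# exits, so any object still holding it cannot seek or read afterwards.
path = os.path.join(tempfile.mkdtemp(), "data.csv")
with open(path, "wb") as out:
    out.write(full)
with open(path, "rb") as handle:
    kept = handle  # stands in for File(outfile) in the question
try:
    kept.seek(0)
    error = None
except ValueError as exc:
    error = str(exc)
```

Here `empty` ends up with no bytes while `full` holds the CSV data, and `error` captures the `ValueError` raised by the closed handle.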

Dennis