
I want to read data from a gzip dataset file directly from the internet without downloading the complete file. Given the size of the dataset, is it possible in Python to stream the data from the server over HTTP and read it as it arrives? I took a look at the zlib and gzip packages. I'm a beginner in Python, and I'd like to know whether this is possible in Python or any other language and, if so, where I can find reference code. Thanks in advance!

  • That should be possible in any programming language. If you want to seek to a certain position, that needs to be supported by the server, but streaming the file in chunks is just a matter of reading from the socket/http stream in chunks. – C. Yduqoli Jan 16 '19 at 01:48
  • Thanks @C. Yduqoli. Can I do it without being able to modify the server side? I just want to read from a giant zip file on a third-party server over HTTP (I cannot change anything on the server side). If so, can anybody point me to a reference in Python? – Steve George Jan 23 '19 at 12:36
  • See here on how to read zip streams: https://stackoverflow.com/questions/29375201/streaming-decompression-of-zip-archives-in-python However, zip is not really made for streaming. Reading gzipped csv files, for example, would be easier and would not require any server support. Zip files probably require HTTP range request server support. – C. Yduqoli Jan 24 '19 at 01:16
  • Thanks @C.Yduqoli, it is in fact gzip data. I've been trying to do this for a while. I'm updating the question a bit. – Steve George Jan 26 '19 at 19:01
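
The chunked approach described in the comments could look roughly like the sketch below. It uses only the standard library (`urllib.request` and `zlib`) and needs no server-side support beyond a plain HTTP GET. The helper names (`iter_http_chunks`, `iter_gunzip`, `iter_lines`), the 64 KiB chunk size, and the assumption that the payload is newline-delimited UTF-8 text are all illustrative choices, not something stated in the question. It also assumes a single-member gzip file; concatenated gzip members would need extra handling.

```python
import urllib.request
import zlib


def iter_http_chunks(url, chunk_size=64 * 1024):
    """Yield raw bytes from an HTTP response, chunk_size bytes at a time."""
    with urllib.request.urlopen(url) as resp:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            yield chunk


def iter_gunzip(chunks):
    """Yield decompressed bytes from an iterable of gzip-compressed chunks."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        data = decomp.decompress(chunk)
        if data:
            yield data
    tail = decomp.flush()
    if tail:
        yield tail


def iter_lines(byte_chunks, encoding="utf-8"):
    """Reassemble decompressed byte chunks into decoded text lines."""
    buf = b""
    for data in byte_chunks:
        buf += data
        # Everything before the last newline is complete; keep the remainder.
        *lines, buf = buf.split(b"\n")
        for line in lines:
            yield line.decode(encoding)
    if buf:
        yield buf.decode(encoding)
```

With a placeholder URL, usage might be `for line in iter_lines(iter_gunzip(iter_http_chunks("https://example.com/data.csv.gz"))): ...`, and breaking out of the loop early stops the download. Because every stage is a generator, only about one chunk of compressed data and one partial line are held in memory at a time.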

0 Answers