Questions tagged [compressed-files]
18 questions
103
votes
4 answers
Read lines from compressed text files
Is it possible to read a line from a gzip-compressed text file using Python without extracting the file completely? I have a text.gz file which is around 200 MB. When I extract it, it becomes 7.4 GB. And this is not the only file I have to read. For…
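One way to approach this, sketched with Python's standard `gzip` module: opening the file in text mode decompresses it as a stream, so a single line can be fetched without writing the extracted data to disk. The helper names `read_line` and `make_sample` are hypothetical, and UTF-8 encoding is an assumption about the file.

```python
import gzip
import os
import tempfile
from itertools import islice


def read_line(path, line_number):
    """Return the 0-based line `line_number` of a gzip text file, or None.

    The file is decompressed as a stream; lines before the target are
    read and discarded, so memory use stays small.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return next(islice(handle, line_number, line_number + 1), None)


def make_sample(lines):
    """Write `lines` to a temporary .gz file and return its path (demo only)."""
    fd, path = tempfile.mkstemp(suffix=".gz")
    os.close(fd)
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        handle.writelines(line + "\n" for line in lines)
    return path
```

Gzip streams are not randomly seekable, so reaching line N still means decompressing everything before it. The saving is in disk space and memory, not in CPU time.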

delete_this_account
- 2,376
- 7
- 23
- 31
6
votes
2 answers
How to validate that multipart compressed (e.g. zip) files have all their parts in C#?
I want to validate multipart compressed files such as Zip archives, because extraction raises an error when any part is missing. I want to check this before extraction, and different software uses different naming schemes for the parts.
I also refer…

Hiren Jasani
- 258
- 2
- 13
5
votes
1 answer
julia: how to read a bz2 compressed text file
In R, I can read a whole compressed text file into a character vector as
readLines("file.txt.bz2")
readLines transparently decompresses .gz and .bz2 files but also works with non-compressed files. Is there something analogous available in julia? …

Ott Toomet
- 1,894
- 15
- 25
4
votes
1 answer
How does git perform on compressed files?
I have some svg files that I want to be tracked by git.
However, most software can transparently deal with svgz (which is basically svg.gz).
Therefore, I was considering switching to svgz to save disk space.
What are the pros and cons of having them…

norok2
- 25,683
- 4
- 73
- 99
3
votes
2 answers
Regex pattern that recognises file extensions in a Bash script does not accurately capture compressed files
I created this little Bash script that takes one argument (a filename) and is supposed to respond according to the extension of the file:
#!/bin/bash
fileFormat=${1}
if [[ ${fileFormat} =~ [Ff][Aa]?[Ss]?[Tt]?[Qq]\.?[[:alnum:]]+$ ]]; then
…

msimmer92
- 397
- 3
- 16
2
votes
1 answer
Does Mosaic support ingesting compressed data?
We have a scenario where we upload compressed files into a Blob container in Microsoft Azure and then read them.
Is this possible in Mosaic and, if so, how can it be achieved?
We have files in .gz format.

Pooja Bist
- 257
- 1
- 5
2
votes
3 answers
Is there a Java equivalent of GetCompressedFileSize?
I am looking to get accurate (i.e. the real size on disk and not the normal size that includes all the 0's) measurements of sparse files in Java.
In C++ on Windows one would use GetCompressedFileSize. I have yet to come across how one would go about…

J C
- 73
- 10
1
vote
0 answers
How to update lines in a compressed file
I have a large compressed .gz file (gigabyte level) that contains one large .txt file. Currently, I can add new lines to the end of the file without any problem. But I wonder if there's any way to update or modify lines as we iterate through the…
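A gzip stream cannot be edited in place, so one possible approach is to stream the old file through a transformation into a new compressed file and then swap the two. The sketch below uses only the standard `gzip` module. The function name `rewrite_lines` and the `transform` callback are hypothetical, and UTF-8 text is assumed.

```python
import gzip


def rewrite_lines(src, dst, transform):
    """Stream `src` (gzip text) into `dst` (gzip text), line by line.

    `transform` receives each line, including its newline, and returns
    the text to write in its place. Only one line is held in memory.
    """
    with gzip.open(src, "rt", encoding="utf-8") as reader, \
         gzip.open(dst, "wt", encoding="utf-8") as writer:
        for line in reader:
            writer.write(transform(line))
```

After a successful run, `os.replace(dst, src)` would swap the new file in. The cost is a full decompress and recompress pass over the data.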

Pedram
- 2,421
- 4
- 31
- 49
1
vote
0 answers
using zcat to pass a gzipped file to a tool with options
A question like this was asked here, but never answered.
I need to pass fastq files as options to a tool that does not accept gzipped inputs. Is there really no option other than to unzip every one of them?
It fails when I pass the gzipped…

Jess
- 186
- 3
- 13
1
vote
1 answer
Files compressed through the "pv" command differ from ordinarily compressed files
Here is my script:
tar cf - testdir | pv -s $(du -sb testdir | awk '{print $1}') | pigz -1 > pv.tar.gz
tar cf - testdir | pigz -1 > nopv.tar.gz
diff pv.tar.gz nopv.tar.gz
and then the output is "Binary files pv.tar.gz and nopv.tar.gz differ".
I…
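A byte-level `diff` of two gzip files can report a difference even when the compressed payloads are identical, because the gzip header carries metadata such as a modification time. Whether that is what happens in this script is an assumption, not something the excerpt confirms. The sketch below shows the general effect and a comparison of decompressed content instead. `payload_digest` is a hypothetical helper.

```python
import gzip
import hashlib


def payload_digest(gz_bytes):
    """Return the SHA-256 hex digest of the decompressed gzip payload."""
    return hashlib.sha256(gzip.decompress(gz_bytes)).hexdigest()


data = b"same content\n" * 1000

# Identical input, but a different timestamp stored in the gzip header.
first = gzip.compress(data, mtime=1)
second = gzip.compress(data, mtime=2)

# The compressed bytes differ while the decompressed payload matches.
headers_differ = first != second
payloads_match = payload_digest(first) == payload_digest(second)
```

For the files in the question, the equivalent check would be to compare the decompressed streams rather than the `.tar.gz` files themselves.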

CJD
- 185
- 1
- 2
- 8
1
vote
2 answers
Are files in a .zip file always compressed?
At work I'm implementing a new webservice that works with files. The specifications say that we should not accept .zip files if they are compressed.
Is there such a thing as an uncompressed .zip file? If yes, what do you think would be the best way…
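The ZIP format does allow members to be stored without compression (the "stored" method). A minimal sketch of how such a check might look with Python's standard `zipfile` module follows. The names `compressed_members` and `build_archive` are hypothetical, and the service's actual acceptance rule is an assumption.

```python
import io
import zipfile


def compressed_members(zip_source):
    """Return names of members using any method other than ZIP_STORED."""
    with zipfile.ZipFile(zip_source) as archive:
        return [
            info.filename
            for info in archive.infolist()
            if info.compress_type != zipfile.ZIP_STORED
        ]


def build_archive(compression):
    """Build a one-file in-memory archive with the given method (demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=compression) as archive:
        archive.writestr("a.txt", "hello " * 100)
    buffer.seek(0)
    return buffer
```

An archive would pass an "uncompressed only" rule when `compressed_members` returns an empty list.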

Pistacchio
- 445
- 4
- 15
1
vote
1 answer
Find out MIME Type of compressed files downloaded from S3 for Java
A client is supposed to upload a compressed file into an S3 folder. Then the compressed file is downloaded and decompressed to perform various operations on its contained files. Originally we told our client to compress its files into a ZIP file,…

cavpollo
- 4,071
- 2
- 40
- 64
0
votes
2 answers
How to read_csv a zstd-compressed file using python-polars
In contrast to pandas, polars doesn't natively support reading zstd compressed csv files.
How can I get polars to read a compressed csv file, for example using xopen?
I've tried this:
from xopen import xopen
import polars as pl
with…

Cornelius Roemer
- 3,772
- 1
- 24
- 55
0
votes
1 answer
R: Download Compressed File from Github ".tsv.gz" on a Mac
I have been unable to download a compressed ".tsv.gz" file from GitHub using R at the following URL on a Mac.
https://github.com/hadley/mastering-shiny/blob/main/neiss/injuries.tsv.gz
This should be a routine command, but download.file() and nothing…

Michael Lachanski
- 57
- 7
0
votes
0 answers
How can I read a 3 GB gzip-compressed file that grows to more than 40 GB after extraction?
I have been trying to read a 3 GB gzip file. I extracted it with gzip, but the extracted data does not even fit in 60 GB of storage. The content is JSON, so if I can't extract it, I can't read it in bytes. I have found many questions, but all those were…

Adarsh Raj
- 325
- 4
- 17