This time my working environment is the Mac terminal, which I am not very familiar with. I have a large GWAS results file (>5 GB) in compressed gzip format, say gwas.gz. What I want is to quickly check the top rows (like head() in R) after sorting the file in ascending order by its 22nd column/field. I'm wondering what the easiest way to do that is without first decompressing the file to disk.
-
There are some common commands with a `z` in front that are the equivalents for gzip-compressed files. You have `zcat`, `zgrep`... and `zmore`! This can be the answer. – fedorqui Jan 30 '15 at 14:37
-
@fedorqui, these tools would still need to decompress the file. – Super-intelligent Shade Jan 30 '15 at 14:39
-
Yep, obviously some decompression has to be done to read the file. However, I suppose they have been implemented in a way that optimizes it. I would say that `zmore` keeps decompressing only as it needs more data, not all in one go. – fedorqui Jan 30 '15 at 14:42
-
@fedorqui, could you give me an example command line using 'gwas.gz'? – David Z Jan 30 '15 at 14:42
-
`zcat`, `zgrep` and `zmore` are typically shell scripts that internally call `gzip` and pipe its output to their respective command. So in a sense it's better than plain decompression :) – Super-intelligent Shade Jan 30 '15 at 14:44
-
@fedorqui, I need to sort by the 22nd field/column of the file. – David Z Jan 30 '15 at 14:48
-
@David Z To learn about any of the commands mentioned here, type e.g. `man zmore` in a terminal. – user3439894 Jan 30 '15 at 14:48
-
@David Z if you need to sort, then you definitely have to use `zcat` to process the whole file and then pipe the output to `sort -k22,22g` (a numeric sort on field 22 only) or something like that. – fedorqui Jan 30 '15 at 15:25
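
A minimal sketch of the decompress-and-sort pipeline suggested in the comments, under a few assumptions that are not in the question: the file is whitespace-delimited, its first line is a header, and column 22 holds numbers (possibly in scientific notation, such as p-values). The helper name `top_rows_gz` is hypothetical. It uses `gzip -dc` rather than `zcat` because that form behaves the same on macOS and Linux, and it streams the data through the pipe so no decompressed copy is written by `gzip` itself (`sort` may still use temporary files for input this large).

```shell
# Hypothetical helper: print the header line, then the n rows with the
# smallest values in the given column of a gzip-compressed text file.
# Assumes whitespace-delimited columns and a one-line header.
top_rows_gz() {
    file=$1        # e.g. gwas.gz
    col=$2         # 1-based column number, e.g. 22
    n=${3:-10}     # number of data rows to show (default 10)

    gzip -dc "$file" | {
        # Pass the header through untouched so it is not sorted with the data.
        IFS= read -r header
        printf '%s\n' "$header"
        # -k COL,COLg: general-numeric sort on that one field only
        # (handles values such as 1e-8); LC_ALL=C keeps '.' as the decimal point.
        LC_ALL=C sort -k"$col","$col"g | head -n "$n"
    }
}
```

For the file in the question this would be called as `top_rows_gz gwas.gz 22 10`. If the file is tab-delimited and may contain empty fields, `sort` would also need an explicit field separator (for example `-t "$(printf '\t')"`); if there is no header, the `read`/`printf` pair can be dropped.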