remove the columns with the same value

Question

I have a huge data set, and I want to remove the columns which have the same value of -9. I have 20 rows and million columns.

I want to remove column 1,9,11 (which are all same with the value of -9) this is just a small portion of my data set. I cannot count which columns have all -9 with my full data set.

Appreciate you help,

Thanks!

sample1 -9 3 -9 0 -9 2 -9 -9 -9 1 -9

sample2 -9 -9 0 -9 0 2 -9 -9 -9 -9 -9

sample3 -9 -9 -9 -9 -9 2 3 1 -9 -9 -9

sample4 -9 -9 -9 -9 -9 2 -9 1 -9 -9 -9

Output;

sample1 3 -9 0 -9 2 -9 -9 1

sample2 -9 0 -9 0 2 -9 -9 -9

sample3 -9 -9 -9 -9 2 3 1 -9

sample4 -9 -9 -9 -9 2 -9 1 -9

I tried to remove it with perl.

possible duplicate of [delete a column with awk or sed](http://stackoverflow.com/questions/15361632/delete-a-column-with-awk-or-sed) — fedorqui, Feb 10 '15 at 11:07
I don't want to delete specific columns, I want to remove the columns which are all -9, and I have a lot of columns. — mgg, Feb 10 '15 at 11:23
Then clarify your question by: 1) using the tools to print in a proper format 2) showing your desired output 3) indicating what you tried so far. See [How do I ask a good question](http://stackoverflow.com/help/how-to-ask) for further references. — fedorqui, Feb 10 '15 at 11:26
see http://stackoverflow.com/questions/28446556/how-to-print-columns-containing-value and adpat to your specific need — NeronLeVelu, Feb 12 '15 at 07:38

score 0 · Answer 1 · edited May 23 '17 at 12:20

You can try below awk script.

cat a.txt | awk '{
                  for (Col = 1; Col <= NF; Col++) { a[NR, Col] = $Col } 
                 }         
                 END {    
                   for (Col = 1; Col <= NF ; Col++) { 
                      j=0;  
                      for( Row = 1 ; Row<= NR; Row++) 
                      { 
                         if (a[Row,Col] != -9 ) 
                          { j=1 } 
                      } 
                      if( j == 0 ) { continue;} 
                      for(Row=1; Row<= NR; Row++){ 
                         printf a[Row,Col] (Row==NR ? RS : FS)  
                      } 
                    } 
                }' | awk '{
                  for (Col = 1; Col <= NF; Col++) { a[NR, Col] = $Col } 
                 }         
                 END {    
                   for (Col = 1; Col <= NF ; Col++) { 
                      for(Row=1; Row<= NR; Row++){ 
                         printf a[Row,Col] (Row==NR ? RS : FS)  
                      } 
                    } 
                }'

Test:

INPUT:
-9 1 2 2 -9 0
-9 -9 2 5 -9 3
-9 2 3 3 -9 -9
-9 7 4 6 -9 4
OUTPUT:
1 2 2 0
-9 2 5 3
2 3 3 -9
7 4 6 4

Steps:

Transpose matrix (then each column will become row)
Then exclude row which has got all -9 values
Again transpose matrix

Reference : link

Great thanks. It works for the small data, but for my full data set it's not working!! — mgg, Feb 11 '15 at 11:22
My file is too big, so I cannot use cat. What can I use instead of cat? — mgg, Feb 11 '15 at 11:34
I agree, above approach is not efficient for very large file. I think we need to scan file twice in script/program for handling big data. — Shubhangi, Feb 12 '15 at 14:57