26

How can I delete some columns from a tab-separated file with awk?

c1 c2 c3 ..... c60

For example, delete columns 3 through 29.

Brian Tompsett - 汤莱恩
user951487
  • This answer on stackoverflow may help you: http://stackoverflow.com/questions/2626274/awk-print-all-other-columns-but-not-1-2-and-3 – iwg Sep 26 '11 at 06:24

4 Answers

45

This is what the cut command is for:

cut -f1,2,30- inputfile

The default delimiter is tab; you can change it with the -d switch. Note that the field list names the columns to keep, not the ones to remove.
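As a quick check, the sketch below builds a throwaway 60-column tab-separated line and keeps fields 1, 2 and 30 onward. The scratch file and the `c1`..`c60` labels are made up for illustration; they are not from the question's real data.

```shell
# Build one tab-separated line c1..c60 in a scratch file (hypothetical sample data).
sample=$(mktemp)
seq 1 60 | sed 's/^/c/' | paste -s - > "$sample"

# Keep fields 1, 2 and 30 through the end; columns 3-29 are dropped.
kept=$(cut -f1,2,30- "$sample")

# Count the fields that remain and show the first three of them.
nkept=$(printf '%s\n' "$kept" | awk -F '\t' '{ print NF }')
printf '%s fields remain, starting with: %s\n' "$nkept" "$(printf '%s\n' "$kept" | cut -f1-3)"
rm -f "$sample"
```

Of the 60 columns, 27 are removed, so 33 should remain.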

Stephen Darlington
  • I had to remove the last `-` in order to make it work in Ubuntu. If I leave it, `cut` would print all the columns. Anyone had this problem too? – Adri C.S. Jul 05 '13 at 08:48
  • It should print columns one, two and thirty to the last one (60 in the question). If it doesn't that's a bug in Ubuntu! – Stephen Darlington Jul 05 '13 at 10:17
  • Aaaah, ok. I made a mistake. My bad. – Adri C.S. Jul 05 '13 at 11:23
  • how to remove a specific column, for example the 3rd? – a06e Mar 08 '16 at 11:04
  • @becko Exactly the same way as removing a range of columns. What exactly are you having difficulty with? – Stephen Darlington Mar 08 '16 at 11:25
  • I had trouble understanding your answer at first, but now I realize that the argument to `cut` is the column numbers that you *keep* (I thought it was what you cut away). Thanks anyway – a06e Mar 08 '16 at 11:33
  • @user1436187 What does that mean? Which version of `cut` were you using? What was the problem that seemed to occur, and at what number of columns did it manifest? – underscore_d Oct 08 '16 at 14:16
  • @becko There's a common extension for that, `--complement`, which does what it says with the input field numbers, e.g.: `cut --complement -f3`. – underscore_d Oct 08 '16 at 14:16
  • How about `cut -f 3-29 --complement inputfile`? It's more intuitive and humane, since it states exactly what is to be removed rather than what should be left after processing. – kvaibhav Jan 25 '17 at 09:09
  • @kvaibhav Yes, that's what @underscore_d was suggesting. My guess is that `--complement` is a GNU extension. It's not present on the BSD version on my Mac for example. – Stephen Darlington Jan 25 '17 at 09:33
13

You can loop over all columns and filter out the ones you don't want:

awk '{for (i=1; i<=NF; i++) if (i<3 || i>29) printf "%s ", $i; print ""}' input.txt

Here NF is the total number of fields in the record.
For each column that meets the condition, we print the column followed by a space " ".


EDIT: updated after remark from johnny:

awk 'BEGIN{FS="\t"}{for (i=1; i<=NF-1; i++) if(i<3 || i>29) {printf "%s%s", $i, FS};{print $NF}}' input.txt

This is improved in two ways:

  • keeps the original separators
  • does not append a separator at the end
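One case the loop above does not cover is a range that reaches the last column, because `$NF` is always printed. Below is a minimal sketch of a variant that tracks the separator itself, assuming tab-separated input. The variables `lo`, `hi` and `sep` and the sample line are illustrative, not part of the original answer.

```shell
# Sample line with 8 tab-separated columns (made-up data).
line=$(printf 'c1\tc2\tc3\tc4\tc5\tc6\tc7\tc8')

# Drop columns lo..hi. "sep" is empty before the first kept field and a tab
# afterwards, so no separator is emitted at the start or end of the line.
trimmed=$(printf '%s\n' "$line" | awk -v lo=3 -v hi=8 'BEGIN { FS = OFS = "\t" }
{
    sep = ""
    for (i = 1; i <= NF; i++) {
        if (i < lo || i > hi) {
            printf "%s%s", sep, $i
            sep = OFS
        }
    }
    print ""
}')
printf '%s\n' "$trimmed"
```

Here the range runs through the final column, and only `c1` and `c2` are printed, with no trailing tab.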
oliver
  • Shouldn't you print a tab (\t) instead of a space? He wants to remove fields, perhaps not remove tabs at the same time (if I understand you correctly). – johnny Sep 26 '11 at 10:31
  • @johnny: you are right. I updated the code so it should consider the separator correctly. – oliver Sep 26 '11 at 11:52
  • Let's assume, based on your edited answer, that we want to delete columns 2, 5, 7, 8, 23, 45, 67, 254, 554 and 488. What would the condition be? I have a file with almost 4000 columns. @oliver – Mayur Mahajan May 16 '19 at 08:53
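For a scattered set of columns like this, one option is to pass the column numbers as a comma-separated string and load them into an awk array for lookup. This is only a sketch: the variable name `drop`, the `skip` array and the ten-column sample line are assumptions for illustration, and it expects tab-separated input.

```shell
# Sample line with 10 tab-separated columns (made-up data).
line=$(printf 'c1\tc2\tc3\tc4\tc5\tc6\tc7\tc8\tc9\tc10')

picked=$(printf '%s\n' "$line" | awk -v drop='2,5,7,8' 'BEGIN {
    FS = OFS = "\t"
    # Turn the comma-separated list into a lookup table: skip[2], skip[5], ...
    n = split(drop, cols, ",")
    for (k = 1; k <= n; k++) skip[cols[k]] = 1
}
{
    sep = ""
    for (i = 1; i <= NF; i++) {
        if (!(i in skip)) {
            printf "%s%s", sep, $i
            sep = OFS
        }
    }
    print ""
}')
printf '%s\n' "$picked"
```

The list can be as long as needed, and column numbers beyond the end of a record are simply never matched.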
1
awk '{for(z=3;z<=15;z++)$z="";$0=$0;$1=$1}1'

Input

c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13 c14 c15 c16 c17 c18 c19 c20 c21

Output

c1 c2 c16 c17 c18 c19 c20 c21
Zombo
  • This doesn't delete columns. It blanks them and reprints... with OP's specified `O*FS` of `\t` replaced by a single space, which they didn't ask for. The _apparent_ deletion is coincidental and needs `FS` and `OFS` to be the default `\s+`. A pretty useless separator and incompatible with OP's `\t`, unless their file coincidentally can't have empty fields, as it'd squash them into jagged rows. Any other separator, e.g. OP's `\t`, gives output that still has the unwanted columns, but now empty. And `$0 = $0` is redundant and may be wasteful. The documented method to rebuild a record is `$1 = $1` – underscore_d Oct 08 '16 at 14:38
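The blanking behaviour is easy to see on tab-separated input. In this sketch (sample data invented for illustration), `FS` and `OFS` are set to tab and fields 3 to 5 are blanked; the fields stay in place as empty strings rather than disappearing.

```shell
# Sample line with 6 tab-separated columns (made-up data).
line=$(printf 'c1\tc2\tc3\tc4\tc5\tc6')

# Blank fields 3..5; assigning to a field makes awk rebuild the record with OFS.
blanked=$(printf '%s\n' "$line" | awk 'BEGIN { FS = OFS = "\t" } { for (z = 3; z <= 5; z++) $z = "" } 1')

# The field count is unchanged: the columns are empty, not gone.
nfields=$(printf '%s\n' "$blanked" | awk -F '\t' '{ print NF }')

# Show the tabs as pipes so the empty fields are visible.
printf '%s\n' "$blanked" | tr '\t' '|'
```

The record still has six fields, three of them empty.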
0

Perl 'splice' solution which does not add leading or trailing whitespace:

perl -lane 'splice @F,2,27; print join " ",@F' file

Produces output:

c1 c2 c30 c31
Chris Koknat