0

I have a file like the following:

Header1:value1|value2|value3|
Header2:value4|value5|value6|

The column number is unknown and I have a function which can return the column number.

I want to write a script that removes one column from the file. For example, after removing column 1, I would get:

Header1:value2|value3|
Header2:value5|value6|

I use cut for this, and so far I can get the values after removing one column, but without the headers. For example:

value2|value3|
value5|value6|

Could anyone tell me how I can add the headers back? Or is there a command that can do this directly? Thanks.

HoldOffHunger
John
  • Did a quick search. This http://stackoverflow.com/questions/2626274/awk-print-all-other-columns-but-not-1-2-and-3 should help you. – mamboking Jul 06 '12 at 14:22

8 Answers

2

Replace the colon with a pipe, do your cut command, then replace the first pipe with a colon again:

sed 's/:/|/' input.txt | cut ... | sed 's/|/:/'

You may need to adjust the column number for the cut command, to ensure you don't count the header.
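To make the column adjustment concrete, here is a minimal sketch of the same pipeline with the `cut` field list filled in. The function name `remove_col_cut` and its arguments are my own for illustration, not part of the answer above. It assumes each line has exactly one header followed by `:`, and that no value contains `:` or `|`. Because the header becomes field 1, value column n is field n+1, so the fields to keep are 1 through n and n+2 onward.

```shell
# Hypothetical helper: remove value column $1 (1-based) from file $2.
remove_col_cut() {
    n=$1
    file=$2
    # 1. Turn the first ':' into '|' so the header becomes field 1.
    # 2. Keep fields 1..n and n+2.., which drops field n+1 (value column n).
    # 3. Turn the first '|' back into ':' to restore the header separator.
    sed 's/:/|/' "$file" | cut -d'|' -f"1-$n,$((n + 2))-" | sed 's/|/:/'
}
```

For the sample file, `remove_col_cut 1 input.txt` should give the output asked for in the question.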

chepner
  • Hi, thanks for your help. Really good idea, I hadn't thought of it that way; I was struggling to add the headers back to what I had. I tried your method and it works, but it seems to need a small correction: `sed 's/:/|/'`. I am new to shell scripting, so I am not sure if that is just me (I am using ksh). – John Jul 06 '12 at 15:01
  • Yeah, I don't use `sed` often and did not test this before posting. – chepner Jul 06 '12 at 15:04
1

Turn the ':' into '|', so that the header is another field, rather than part of the first field. You can do that either in whatever generates the data to begin with, or by passing the data through tr ':' '|' before cut. The rest of your fields will be offset by +1 then, but that should be easy enough to compensate for.

twalberg
  • Correct, but in the example given, there appears to be only one. If there's the possibility of more, `sed` or `awk` may be more appropriate. – twalberg Jul 06 '12 at 14:34
1

Your problem is that each HeaderX is followed by ':', which is not the '|' delimiter you use in cut.

You could first split each line into two parts at the ':', with something like "cut -f 1 --delimiter=: YOURFILE", then remove the column from the values part, and finally put the headers back.
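One way those three steps might look, as a sketch only: the function name `remove_col_split` and the use of temporary files are my assumptions, chosen so it also runs in shells without process substitution. It assumes `mktemp` is available and that headers contain no `:`.

```shell
# Hypothetical helper: remove value column $1 (1-based) from file $2.
remove_col_split() {
    n=$1
    file=$2
    # Build the list of value fields to keep (everything except field n).
    if [ "$n" -eq 1 ]; then
        keep="2-"
    else
        keep="1-$((n - 1)),$((n + 1))-"
    fi
    headers=$(mktemp)
    values=$(mktemp)
    # Part 1: the headers (everything before the first ':').
    cut -d: -f1 "$file" > "$headers"
    # Part 2: the values, with the unwanted column cut out.
    cut -d: -f2- "$file" | cut -d'|' -f"$keep" > "$values"
    # Glue the two parts back together with ':' between them.
    paste -d: "$headers" "$values"
    rm -f "$headers" "$values"
}
```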

MutoKenji
1

awk can handle multiple delimiters. So another alternative is...

jkern@ubuntu:~/scratch$ cat ./data188 
Header1:value1|value2|value3|
Header2:value4|value5|value6|
jkern@ubuntu:~/scratch$ awk -F"[:|]" '{ print $1 $3 $4 }' ./data188 
Header1value2value3
Header2value5value6
John
  • +1 - `awk -F"[:|]" 'BEGIN {OFS = "|"} { print $1 ":" $3, $4 }' ./data188` will restore the delimiters. Also, please shorten your prompts when posting. "$" is enough. It helps readability. – Dennis Williamson Jul 06 '12 at 16:30
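Since the column to remove is not fixed in the question, the awk approach could be generalized along these lines. This is a sketch, not a tested drop-in: `remove_col_awk` is a made-up name, and it assumes every line ends with a trailing `|` (as in the sample) and that values contain neither `:` nor `|`.

```shell
# Hypothetical helper: remove value column $1 (1-based) from file $2.
remove_col_awk() {
    awk -F'[:|]' -v n="$1" '{
        # $1 is the header; value column k is field k + 1.
        out = $1 ":"
        # The trailing "|" leaves an empty last field, so stop before NF.
        for (i = 2; i < NF; i++)
            if (i != n + 1)
                out = out $i "|"
        print out
    }' "$2"
}
```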
0

You can do it with sed alone, without cut:

sed 's/:[^|]*|/:/' input.txt
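That command removes the first value column. If the column number comes from elsewhere, the same idea might be extended as below. This is only a sketch: `remove_col_sed` is a hypothetical name, and it assumes a sed that accepts `-E` for extended regular expressions (GNU and BSD sed both do). The pattern captures the header plus the first n-1 values, then drops the next value together with its `|`.

```shell
# Hypothetical helper: remove value column $1 (1-based) from file $2.
remove_col_sed() {
    n=$1
    file=$2
    # Group 1 = "Header:" followed by n-1 "value|" chunks; the chunk
    # right after it is the one being removed.
    sed -E "s/^([^:]*:([^|]*[|]){$((n - 1))})[^|]*[|]/\1/" "$file"
}
```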
rush
0

My solution:

$ sed 's,:,|,' data | awk -F'|' 'BEGIN{OFS="|"}{$2=""; print}' | sed 's,||,:,'
Header1:value2|value3|
Header2:value5|value6|
  • the first sed replaces the first : with |
  • -F'|' tells awk to use the | symbol as the field separator
  • in each line we replace the 2nd field (the header is now the first) with an empty string and print the resulting line with | as the output field separator
  • the last sed restores the header by replacing the first || with :

Not perfect, but it should work.

Slava Semushin
0

$ grep 'Header1' file.txt | awk -F'[:|]' '{ print $1, $2, $3, $4 }'

This prints the header and the values of the matching line as separate space-delimited columns. List whichever fields you want to keep.

kukido
0

Just chiming in with a Perl solution:
(rearrange/remove fields as needed)

-l effectively adds a newline to every print statement
-a autosplit mode splits each line using the -F expression into array @F
-n adds a loop around the -e code
-e your 'one liner' follows this option

$ perl -F'[:|]' -lane 'print "$F[0]:$F[1]|$F[2]|$F[3]"' input.txt
AAAfarmclub