
Again, I have about 150 files containing the following data, with no header:

x1 y1 z1

x2 y2 z2

...

xn yn zn

The delimiter is a tab character. How could I use sed and batch processing on these 150 files to produce the following output:

x1

x2

x3

...

xn

y1

y2

y3

...

yn

z1

z2

z3

...

zn

Any ideas would be appreciated.

NOTE: I posted a similar question before, but this one is not a duplicate. Please see this link.

Regards,

ikel


2 Answers


I hope you are not allergic to perl...

This solution will work for files with any number of columns:

$ perl -ne 'BEGIN { @a = (); } $i = 0; foreach (split(/\s+/)) { $l = ($a[$i++] ||= []); push @$l, $_; }; END { print join("\n", @$_) . "\n" foreach (@a); }' << EOF
> x1 y1 z1
> x2 y2 z2
> x3 y3 z3
> x4 y4 z4
> EOF
x1
x2
x3
x4
y1
y2
y3
y4
z1
z2
z3
z4

I'll comment since this is not really obvious:

  • perl -n reads line by line (to be precise, it reads and splits against $/), and -e executes a scriptlet;
  • the BEGIN block is executed before the first input is read, the END block is executed last.

Anatomy:

BEGIN { @a = (); }         # Creates an array named "a"
# Main scriptlet
$i = 0;
foreach (split(/\s+/)) {   # Split an input line on runs of whitespace (spaces or tabs)
    $l =                   # Set $l to...
        ($a[$i++] ||= []); # what is at index i of @a (increment i), but if not set,
                           # set to an (empty) array ref and return that
    push @$l, $_;          # Push token to the end of the array ref
}
END {                      # End block...
    print join("\n", @$_)  # Print the contents of array references, joined with \n,
    . "\n"                 # then \n,
    foreach (@a);          # for each element of array a
}                          # DONE
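
For comparison, here is a minimal sketch of the same column-by-column accumulation written in awk and wrapped in a shell function. This is a swapped-in technique, not the Perl above; columns_to_rows is a hypothetical name, and the sketch assumes the input is a single file (or stdin) small enough to hold in memory.

```shell
# Hypothetical helper: print all of column 1, then all of column 2, and so on.
# Reads whitespace-separated rows from the file given as "$1", or from stdin.
columns_to_rows() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                # col[i] accumulates the newline-terminated values of column i
                col[i] = col[i] $i "\n"
            }
            # Remember the widest row seen so far
            if (NF > max) max = NF
        }
        END {
            # Emit the accumulated columns in order
            for (i = 1; i <= max; i++) printf "%s", col[i]
        }
    ' "$@"
}
```

It could be used as columns_to_rows file, or at the end of a pipeline. Like the Perl version, it does not depend on the number of columns.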
fge
  • Never occurred to me that anyone would give me a _pe[a]rly_ suggestion. :D No, I am not allergic to perl. – ikel Jan 28 '13 at 08:13

I don't think sed is the best tool for this job. The simplest solution that comes to mind is to use cut three times:

cut -f1 file && cut -f2 file && cut -f3 file

Contents of file:

x1  y1  z1
x2  y2  z2
x3  y3  z3
xn  yn  zn

Results:

x1
x2
x3
xn
y1
y2
y3
yn
z1
z2
z3
zn
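
If the number of columns is not known in advance, the repeated cut calls could be driven by a loop. The following is only a sketch: stack_columns is a hypothetical name, and it assumes every row has as many tab-separated fields as the first row.

```shell
# Hypothetical helper: emit each tab-separated column of "$1" in turn.
# The body runs in a subshell ( ... ) so its variables do not leak out.
stack_columns() (
    file=$1
    # Count the tab-separated fields on the first line only
    ncols=$(awk -F '\t' 'NR == 1 { print NF; exit }' "$file")
    n=1
    # An empty file yields no count, so default to 0 and print nothing
    while [ "$n" -le "${ncols:-0}" ]; do
        cut -f"$n" "$file"
        n=$((n + 1))
    done
)
```

Calling stack_columns file on the three-column sample above should give the same result as the three chained cut commands.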

To batch process your files, assuming only the files of interest are in your present working directory:

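for i in *; do
    # Use > for the first column so a stale "$i.bak" is overwritten, not appended to
    cut -f1 "$i" > "$i.bak"
    cut -f2 "$i" >> "$i.bak"
    cut -f3 "$i" >> "$i.bak"

    # Replace the original file with the stacked columns
    mv "$i.bak" "$i"
done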
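
A more defensive variant of the batch loop might look like the sketch below. It skips anything that is not a regular file, writes to a uniquely named temporary file, and replaces the original only if every cut succeeded. restack_dir is a hypothetical name, and the sketch assumes mktemp is installed (it is widely shipped but not part of POSIX).

```shell
# Hypothetical helper: rewrite every regular file in directory "$1" in place,
# stacking its three tab-separated columns. The directory defaults to ".".
restack_dir() (
    dir=${1:-.}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue            # skip directories and unmatched globs
        tmp=$(mktemp "$f.XXXXXX") || exit 1
        if cut -f1 "$f" > "$tmp" &&
           cut -f2 "$f" >> "$tmp" &&
           cut -f3 "$f" >> "$tmp"
        then
            mv "$tmp" "$f"                 # replace the original only on success
        else
            rm -f "$tmp"                   # leave the original untouched
            exit 1
        fi
    done
)
```

Running restack_dir /path/to/data would process that directory. One caveat: mktemp creates its file with owner-only permissions, so the rewritten files may end up with a different mode than the originals.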
Steve
  • To be honest, when I saw 'cut' I said to myself "What is 'cut'?" Ah! if I had known how to use these tools properly, it would've saved me a lot of time. Again, thank you for sharing your simplistic and minimalistic approach. – ikel Jan 28 '13 at 08:32