
I am running GnuWin32 under Windows 7 and have a file with this structure:

|<text_0>
<text_1>
<text_2>
  until
<text_16>
|<text_0>
<text_1>
<text_2>
  until
<text_12>
|<text_0>
<text_1>
<text_2>
  until
<text_31>

< more of the same > 

There is a variable number of lines between lines that begin with the pipe (the separator symbol).

Desired output:

|<text_0><text_1><text_2>  until <text_16>
|<text_0><text_1><text_2>  until <text_12>
|<text_0><text_1><text_2>  until <text_31>

On Windows (hence the double quotes) I have tried this, from aypal singh and Ed Morton:

awk "{ ORS = (NR%2 ? FS : RS) } 1" < in.txt > out.txt

But this joins lines in pairs regardless of their content: it does not "skip" appending a line to the previous line when that line begins with a pipe.

How can I amend the awk program to append all lines to the previous line until awk encounters the record separator pipe (and continue processing until the end of the file)?

Jay Gray

1 Answer


You can say:

$ awk -v RS="|" '{$1=RS$1} NF>1' a
|<text_0> <text_1> <text_2> until <text_16>
|<text_0> <text_1> <text_2> until <text_12>
|<text_0> <text_1> <text_2> until <text_31>

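This sets the record separator to the pipe |, so each record is everything between two pipes. Assigning to a field makes awk rebuild the record, just as the $1=$1 idiom does: every run of whitespace between fields, newlines included, is replaced with a single space (OFS). As you want a pipe in front of each line, we prepend RS in this assignment, giving $1=RS$1. Finally, NF>1 prints only records with more than one field, which skips the empty record that precedes the first pipe.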
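As a minimal sketch of the steps above, the following runs the same awk program on a shortened sample. It assumes a POSIX shell (not the Windows cmd prompt from the question), and the temporary file and the variable names `tmp` and `result` are made up for illustration.

```shell
# Build a small sample in a temporary file (hypothetical data modelled on the question).
tmp=$(mktemp)
printf '%s\n' '|<text_0>' '<text_1>' '<text_2>' '  until' '<text_16>' \
              '|<text_0>' '<text_1>' '  until' '<text_12>' > "$tmp"

# RS='|'      : a record is everything between two pipes
# $1 = RS $1  : prepend the pipe; the assignment also rebuilds $0 with OFS
# NF > 1      : skip the empty record in front of the first pipe
result=$(awk -v RS='|' '{ $1 = RS $1 } NF > 1' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

One thing to keep in mind with this approach: a record that holds a single word after the pipe also has NF equal to 1 and would be dropped by the NF>1 test.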

fedorqui
  • Perfect; TY. I did not know how to do the refactoring with $1=RS$1. – Jay Gray Feb 13 '15 at 13:26
  • I like the idea and only have the nitpick that whitespaces are folded into `OFS` this way. If that's a problem, `awk -v RS=\| 'NR > 1 { gsub(/\n/, ""); print RS $0 }'` preserves them. Still, one up. – Wintermute Feb 13 '15 at 13:28
  • I was going to raise the same concern as @Wintermute but my solution is just to set the FS appropriately: `awk -v RS='|' -F'\n[[:space:]]*' '{$1=RS$1} NF>1' file` since that's what you REALLY want this script to do - replace all newlines followed by white space (`FS`) within records with a blank char (`OFS`). – Ed Morton Feb 13 '15 at 20:10
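To make the whitespace point from the comments concrete, here is a small side-by-side sketch of the answer's program and Wintermute's variant on the same input. It assumes a POSIX shell; the sample data and the variable names `input`, `folded`, and `kept` are hypothetical.

```shell
# Shortened sample with leading spaces before "until" (hypothetical data).
input='|<text_0>
<text_1>
  until
<text_16>
|<text_0>
  until
<text_12>'

# Answer's version: rebuilding the record folds every whitespace run into one OFS.
folded=$(printf '%s\n' "$input" | awk -v RS='|' '{ $1 = RS $1 } NF > 1')

# Wintermute's version: only newlines are removed, other whitespace is left alone.
kept=$(printf '%s\n' "$input" | awk -v RS='|' 'NR > 1 { gsub(/\n/, ""); print RS $0 }')

printf 'folded:\n%s\nkept:\n%s\n' "$folded" "$kept"
```

The second form joins lines with nothing in between and keeps the two spaces in front of "until", which is closer to the layout shown in the question.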