I have to write a script that updates a file in the following way.

Raw file:

<?blah blah blah?>
<pen>
<?pineapple?>
<apple>
<pen>

Final file:

<?blah blah blah?><pen>
<?pineapple?><apple><pen>

Wherever in the file a newline character is not followed by

<?

we have to remove the newline, so that the following line is appended to the end of the previous line.

It would also be really helpful if you explained how your sed command works.

utkarsh tyagi
  • See http://stackoverflow.com/questions/1251999/how-can-i-replace-a-newline-n-using-sed. Adapt to your needs. – pringi Jan 03 '17 at 13:49

3 Answers

Perl solution:

perl -pe 'chomp; substr $_, 0, 0, "\n" if $. > 1 && /^<\?/'
  • -p reads the input line by line, printing each line after changes
  • chomp removes the final newline
  • substr with 4 arguments modifies the string in place; here it prepends a newline when the line is not the first one ($. is the input line number) and starts with <?.
choroba

Sed solution:

sed ':a;N;$!ba;s/\n\(<[^?]\)/\1/g' file > newfile

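The basic idea is to replace every \n that is followed by < and then a character other than ? with the matched text minus the \n.

The :a;N;$!ba part first collects the whole file in the pattern space: :a sets a label, N appends the next input line, and $!ba jumps back to the label on every line except the last. The s command then sees all newlines at once; \(<[^?]\) captures the two characters after the newline and \1 puts them back.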
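
Since the question asks how the sed command works, here is a sketch of the same program with one -e expression per command, which may be easier to follow. The wrapper name join_with_sed is made up for this example; the sed commands themselves are the ones from the answer.

```shell
# join_with_sed is a hypothetical wrapper name used only for this sketch.
# The four expressions, in order:
#   :a      define a label named "a"
#   N       append the next input line to the pattern space, separated by a newline
#   $!ba    if this is not the last line, branch back to label "a"
#   s/.../  with the whole file collected, delete each newline that is
#           followed by "<" and a character other than "?"
join_with_sed() {
  sed -e ':a' \
      -e 'N' \
      -e '$!ba' \
      -e 's/\n\(<[^?]\)/\1/g' "$@"
}
```

Usage would be join_with_sed file > newfile. Note that the pattern only joins lines that begin with < followed by something other than ?; a line that does not start with < keeps its newline, and the whole file is held in memory at once.
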

MaanooAk

If you are happy with a solution that puts every <? at the start of a line, you can combine tr with sed.

tr -d '\n' < inputfile| sed 's/<?/\n&/g;$s/$/\n/'

Explanation:
I use tr ... < inputfile rather than cat inputfile | tr ..., which avoids an extra call to cat.
The sed command has two parts.
s/<?/\n&/g inserts a newline before every <?; & stands for the matched string (here always <?, so it only saves one character). The first <? gets a newline too, so the output starts with an empty line when the file begins with <?.
$s/$/\n/ appends a newline at the end of the last line.

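EDIT: If you only want newlines before <? where they were already present, you can use awk:

awk 'NR > 1 && $1 ~ /^<\?/ {print ""} {printf("%s",$0)} END {print ""}'

Explanation:
Consider the newline as the start of a line, not the end. Your question then becomes "write a newline when the line starts with <?". You must escape the ? and use ^ to anchor the match at the start of the line. print "" writes only a newline (a bare print would also write the line itself, duplicating it), and NR > 1 skips the first line so that the output does not start with an empty line.

awk 'NR > 1 && $1 ~ /^<\?/ {print ""}'

Next, print the line you read without a newline character, using printf. Finally, the END block writes the newline that ends the last line.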
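
Because the question asks for a script that updates the file, the awk approach could be wrapped roughly as follows. This is a sketch under assumptions: update_in_place is a made-up helper name, mktemp is assumed to be available, and print "" is used so that only a newline is written before a <? line.

```shell
# update_in_place is a hypothetical helper name, not something from the answer.
update_in_place() {
  file=$1
  tmp=$(mktemp) || return 1
  awk '
    # A line starting with "<?" begins a new output line (skipped for the first input line).
    NR > 1 && $1 ~ /^<\?/ { print "" }
    # Write every input line without its trailing newline.
    { printf "%s", $0 }
    # End the last output line.
    END { print "" }
  ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Copy the result back over the original, then remove the temporary file.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

update_in_place inputfile rewrites inputfile itself. Copying the result back with cat instead of mv keeps the original file's permissions.
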

Walter A