3

I have a text file and I need to remove every newline that is immediately followed by the string "fox". For example:

the 
brown 
fox 
jumps 

will become

the 
brown fox 
jumps 

I would like to do it in sed, but an awk solution would be useful too.

BearCode
  • Possible duplicate of [Delete a line containing a specific string using sed](http://stackoverflow.com/questions/5410757/delete-a-line-containing-a-specific-string-using-sed) – alexander.polomodov Mar 26 '16 at 00:08
  • I think it's quite a distinct question. This is for deleting newlines (\n), not entire lines. And the decision is based on the content of the next line. – BearCode Mar 26 '16 at 22:00

3 Answers

4

This might work for you (GNU sed):

sed ':a;N;/\nfox/s/\n//;ta;P;D' file

Append the next line to the pattern space (N) so that it holds two lines. If the second line starts with "fox", remove the newline between them and branch back to :a to append another line. Otherwise print the first line of the pattern space (P), delete it (D), and restart the cycle with the remaining line without reading new input, so that line is in turn checked against the line that follows it. At the end of the file, GNU sed prints whatever is left in the pattern space when N finds no more input.
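To make the loop easier to follow, here is the same command written one instruction per line with comments, as a sketch that assumes GNU sed. The temporary file and the variable names `input` and `sed_result` are made up for the illustration; the sample data is the question's input (each line ends with a space) with the "fox" line doubled to exercise the `ta` branch.

```shell
# Hypothetical sample file: the question's input with the "fox" line doubled.
input=$(mktemp)
printf 'the \nbrown \nfox \nfox \njumps \n' > "$input"

sed_result=$(sed '
  :a
  # append the next input line to the pattern space
  N
  # if the appended line starts with "fox", drop the newline in front of it
  /\nfox/s/\n//
  # a substitution was made: loop back and append another line
  ta
  # otherwise print and delete the first line, then restart with what is left
  P
  D
' "$input")

printf '%s\n' "$sed_result"
rm -f "$input"
```

With GNU sed this should print three lines: `the `, `brown fox fox ` and `jumps `.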

potong
  • It works but not when having consecutive lines containing the string. Try doubling the line containing "fox" and it will only delete the first newline that it's supposed to delete – BearCode Mar 26 '16 at 07:37
  • `;P;D` can be removed - the result, as far as I can see, is identical. check it out: `sed ':a;N;/\nfox/s/\n//;ta' file` – aleksandr barakin Feb 05 '21 at 17:28
  • @aleksandrbarakin see comment above – potong Feb 06 '21 at 10:52
3

With Perl:

perl -0pe 's/\nfox/fox/g' file

Output:

the 
brown fox 
jumps 
Cyrus
2

This is not a job for sed, it is a job for awk:

$ awk 'NR>1{printf "%s", (/fox/ ? OFS : ORS)} {printf "%s", $0} END{print ""}' file
the
brown fox
jumps

The above replaces the newline (ORS) before any line that contains fox with a blank char (OFS). Massage to suit.
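As one way of "massaging" it, the sketch below only joins lines that start with fox (`/^fox/`) and emits an empty string instead of OFS, so the newline is removed rather than replaced by a blank. It should run in any POSIX awk; the temporary file and the names `input` and `awk_result` are invented for the example.

```shell
# Hypothetical sample file with the question's input (lines end with a space).
input=$(mktemp)
printf 'the \nbrown \nfox \njumps \n' > "$input"

# Before every line except the first, print either nothing (the line starts
# with "fox") or the usual newline; then print the line without a terminator.
awk_result=$(awk 'NR>1 { printf "%s", (/^fox/ ? "" : ORS) }
                  { printf "%s", $0 }
                  END { print "" }' "$input")

printf '%s\n' "$awk_result"
rm -f "$input"
```

Because the sample lines keep their trailing space, the joined line reads `brown fox ` with a single blank between the words. It still processes one line at a time, so memory use does not grow with the file.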

With GNU awk you can alternatively reduce it to:

$ awk -v RS='^$' -v ORS= '{gsub(/\nfox/," fox")} 1' file
the
brown fox
jumps

or:

$ awk -v RS='\nfox' '{ORS=gensub(/\n/," ",1,RT)} 1' file
the
brown fox
jumps

but the RS='^$' version reads the whole file into memory at one time, and the RS='\nfox' version reads everything up to the next "\nfox" as one record.

Ed Morton