
I have a file that looks something like this:

########## foo: foo
########## foo1: foo1
########## foo2: foo2
########## foo3: foo3
########## foo4: foo4
########## foo5: foo5
########## foo6: foo6
########## foo7: foo7
########## foo8: foo8
########## foo9: foo9
########## foo10: foo10
########## foo11: foo11
########## foo12: foo12
########## foo13: foo13

blah blah
blah blah 
... /repeats arbitrary number of times
blah blah

########## foo: foo
########## foo1: foo1
########## foo2: foo2
########## foo3: foo3
########## foo4: foo4
########## foo5: foo5
########## foo6: foo6
########## foo7: foo7
########## foo8: foo8
########## foo9: foo9
########## foo10: foo10
########## foo11: foo11
########## foo12: foo12
########## foo13: foo13
...

How can I remove all the blah lines between the sets of ########## lines, so that the file looks like this?

########## foo: foo
########## foo1: foo1
########## foo2: foo2
########## foo3: foo3
########## foo4: foo4
########## foo5: foo5
########## foo6: foo6
########## foo7: foo7
########## foo8: foo8
########## foo9: foo9
########## foo10: foo10
########## foo11: foo11
########## foo12: foo12
########## foo13: foo13

########## foo: foo
########## foo1: foo1
########## foo2: foo2
########## foo3: foo3
########## foo4: foo4
########## foo5: foo5
########## foo6: foo6
########## foo7: foo7
########## foo8: foo8
########## foo9: foo9
########## foo10: foo10
########## foo11: foo11
########## foo12: foo12
########## foo13: foo13

Is there a good way to do this using sed, awk, or some other Linux command? Or is it better to approach this with a scripting language like Python or Perl? Whatever works is fine by me.

Thanks!

dibery
skhan21
  • Start with this regex pattern: `^(?!\#).*$`. It will find all the lines that don't start with #. Not sure of the full sed command. – Josh Beauregard Dec 23 '19 at 16:44
  • Possible duplicate of [Using sed to delete all lines between two matching patterns](https://stackoverflow.com/q/6287755/608639), [SED delete lines between two pattern matches](https://stackoverflow.com/q/8085633/608639), [sed delete lines between two patterns, without the second pattern, including the first pattern](https://stackoverflow.com/q/42898905/608639), [SED delete specific lines between two patterns?](https://stackoverflow.com/q/19233578/608639), [Delete lines in a text file that contain a specific string](https://stackoverflow.com/q/5410757/608639) and friends. – jww Dec 23 '19 at 17:39

2 Answers


A simple grep will do the job. The pattern has to match the ten # characters that start each line you want to keep:

grep '^##########' in.txt > out.txt

You can give grep additional search patterns with the -e option if you want to keep other kinds of lines as well. To keep the empty lines too, use:

grep -e '^##########' -e '^$' in.txt > out.txt
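One caveat with keeping the empty lines: with the sample input, the blank line before the blah lines and the one after them both survive, so the blocks end up two blank lines apart. A minimal sketch of one way around that, assuming your cat supports -s (GNU coreutils and BSD cat do, but the option is not in POSIX); the function name keep_hash_blocks is only for illustration:

```shell
# Hypothetical helper (the name is an assumption, not from the answer):
# keep the ##########-lines and the empty lines of the file given as $1,
# then squeeze each run of empty lines down to a single one.
keep_hash_blocks() {
    grep -e '^##########' -e '^$' "$1" | cat -s
}
```

It could be called as keep_hash_blocks in.txt > out.txt. Lines that contain only spaces or tabs are not empty as far as '^$' is concerned, so they are dropped along with the blah lines.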
Klaus

The challenge is to print exactly one blank line between the blocks, even if the input contains several blank lines there. A proper solution needs a small state machine.

If a ##########-line is read, it is written to the output and the state is set to 0. If a non-empty line that does not start with # is read, the state changes to 1. If a ##########-line is read while the state is 1, a blank line is written before that line is written to the output.

awk '
# line starts with ###.
/^##########/ {
    # if the previous was not ### then print empty line
    if (state==1) print ""
    # output current line
    print $0
    # change to state 0
    state=0
}
# change to state 1, if line does not start with #
/^[^#]/ {
    state=1
}
' test.txt
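Two edge cases are worth checking against your data: the script above prints a leading blank line if the file starts with filler text, and it prints no separator when two blocks are divided only by empty lines, because /^[^#]/ does not match an empty line. A possible variant, offered as a sketch rather than a drop-in replacement (the function name strip_filler and the variables gap and seen are my own), treats every line that is not a ##########-line as filler:

```shell
# Hypothetical variant of the state machine (names are assumptions).
# gap  = 1 once any non-########## line has been read since the last block line
# seen = 1 once at least one ##########-line has been printed
strip_filler() {
    awk '
    /^##########/ {
        # separate blocks with one empty line, but never start the output with one
        if (gap && seen) print ""
        print
        gap = 0
        seen = 1
        next
    }
    # any other line, empty or not, counts as filler
    { gap = 1 }
    ' "$1"
}
```

It could be called as strip_filler test.txt > out.txt. The trade-off is that a filler line starting with # is now also treated as a separator, which may or may not be what you want.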
Aedvald Tseh