Is it possible to merge the lines of a block into a single line? Basically, if the next line starts with the same "#Msg" tag, append it to the previous line. (It is hard to explain, but the example below should make it clear. Blocks are separated by a blank line.)

My input file looks like this:

#Msg,00000

#Msg,00001
#Msg,00002

#Msg,00003
#Msg,00004

#Msg,00005

#Msg,00006
#Msg,00007
#Msg,00008

#Msg,00009

#Msg,00010
#Msg,00011

Output should be like this:

#Msg,00000

#Msg,00001 #Msg,00002

#Msg,00003 #Msg,00004

#Msg,00005

#Msg,00006 #Msg,00007 #Msg,00008

#Msg,00009

#Msg,00010 #Msg,00011

Any advice is very welcome.

aristotll
  • Are you specifically tied to `sed` here? Have you made any attempt to solve this yourself or done any research? – Mad Physicist Dec 29 '17 at 22:19
  • I don't understand how the `Msg##` is used to group... In the example I see the groups being created based on whether there's a new line between them or not. Care to clarify a bit? – Savir Dec 29 '17 at 22:24
  • I mostly use regex, but I failed here, so I did some research and found that most people use sed, perl or awk for this. So no, I'm NOT tied to sed. – vollschauer Dec 29 '17 at 22:24
  • Yes, the "groups" are separated by a blank line. – vollschauer Dec 29 '17 at 22:25
  • `awk -v RS="" '{for(i=1;i<=NF;i++){printf("%s ",$i)}print"\n"}' file` gets you part of the way there. I would add a pipe that deletes the blank lines. Good luck. – shellter Dec 30 '17 at 03:53
  • Solved! Thanks everybody! – vollschauer Dec 30 '17 at 11:03
  • Possible duplicate of [Sed to combine N text lines separated by blank lines?](https://stackoverflow.com/questions/39734125/sed-to-combine-n-text-lines-separated-by-blank-lines) – PesaThe Dec 31 '17 at 12:20

5 Answers

This would be pretty easy to do in Perl:

perl -00 -ple 'tr/\n/ /'

-e CODE specifies the program.

-p wraps a read/write line loop around it (by default it reads from STDIN, but you can also specify one or more filenames on the command line).

-00 specifies that the input "lines" are actually paragraphs.

-l has two effects: Incoming line terminators are automatically stripped from lines, and outgoing lines get line terminators added to them (and because we used -00 (paragraph mode), our line terminator is actually \n\n).

To recap:

We read the input one paragraph at a time. For each paragraph, we remove any trailing newlines. We then translate every newline to a space. Finally we output the transformed paragraph, followed by \n\n.
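For readers who do not use Perl, the recap above can be sketched in Python. This is only an illustration of the same paragraph-at-a-time idea, not a translation of what Perl does internally; the function name `join_paragraphs` and the sample data are made up for the example.

```python
import io
from itertools import groupby

def join_paragraphs(lines):
    """Yield each blank-line-separated paragraph as one space-joined line."""
    # Strip the line terminator so that empty lines become empty strings.
    stripped = (line.rstrip("\n") for line in lines)
    # groupby splits the stream into alternating runs of empty and
    # non-empty lines; only the non-empty runs are paragraphs.
    for is_text, run in groupby(stripped, key=bool):
        if is_text:
            yield " ".join(run)

# Hypothetical sample input standing in for STDIN or a file.
sample = io.StringIO("#Msg,00000\n\n#Msg,00001\n#Msg,00002\n\n#Msg,00003\n")
result = list(join_paragraphs(sample))
for line in result:
    # Like -l in paragraph mode, terminate each output record with "\n\n".
    print(line, end="\n\n")
```

Because it works on any iterable of lines, the same function can be fed an open file object or `sys.stdin`.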

melpomene

There is no point in trying to write shorter code than is possible with Perl!

Collect lines from the input file in the list group until a blank line appears. Then output the contents of group, empty it, and start again. When end-of-file is reached, output whatever is left in group, if it is non-empty.

group = []  # lines of the block currently being collected
with open('vollschauer.txt') as vollschauer:
    for line in vollschauer:
        line = line.rstrip()
        if line:
            group.append(line)
        elif group:
            # A blank line ends the block: print it joined, then a blank line.
            print(' '.join(group))
            print()
            group = []
# Flush the final block if the file does not end with a blank line.
if group:
    print(' '.join(group))

Bill Bell

$ awk -v RS= -v ORS='\n\n' '{$1=$1}1' file
#Msg,00000

#Msg,00001 #Msg,00002

#Msg,00003 #Msg,00004

#Msg,00005

#Msg,00006 #Msg,00007 #Msg,00008

#Msg,00009

#Msg,00010 #Msg,00011
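The answer gives no explanation, so here is my reading of the one-liner, with a rough Python sketch of it. An empty RS puts awk in paragraph mode, where blank lines separate records and newlines also separate fields; assigning $1=$1 makes awk rebuild the record with single spaces between fields; the trailing 1 prints the record followed by ORS. The function name `rebuild_records` and its `ors` parameter are hypothetical, chosen only for this illustration.

```python
import re

def rebuild_records(text, ors="\n\n"):
    """Join the whitespace-separated fields of each blank-line-separated record."""
    # RS="" analogue: records are separated by one or more blank lines,
    # and leading/trailing newlines in the input are ignored.
    records = [r for r in re.split(r"\n{2,}", text.strip("\n")) if r]
    # $1=$1 analogue: split on any whitespace, rejoin with a single space,
    # then terminate every record with the output record separator.
    return "".join(" ".join(r.split()) + ors for r in records)

# Hypothetical sample taken from the middle of the question's input.
text = "#Msg,00005\n\n#Msg,00006\n#Msg,00007\n#Msg,00008\n\n#Msg,00009\n"
output = rebuild_records(text)
print(output, end="")
```

As with the awk command, the output also ends with a blank line after the last record.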
Ed Morton

If you insist on using sed, this should do the trick:

sed -r ':a; N; /^(#[^,]+,).*\n\1/! { P; D }; s/\n/ /; ba' file

It takes different tags into account: lines with different tags are not merged (which is how I understood the desired behavior):

$ cat file
#Msg,00000
#Msg,00001
#Hello,00002

#Hello,00003
#What,00004
#What,00005
$ sed -r ':a; N; /^(#[^,]+,).*\n\1/! { P; D }; s/\n/ /; ba' file
#Msg,00000 #Msg,00001
#Hello,00002

#Hello,00003
#What,00004 #What,00005

Note that this solution uses GNU sed.
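If GNU sed is not available, the same tag-aware merging can be sketched in Python. This is a minimal sketch that assumes a tag has the shape matched by the sed pattern ('#', a name without commas, then a comma); the names `TAG` and `merge_same_tag` are made up for the example.

```python
import re

# Same tag shape as in the sed pattern: '#', one or more non-comma chars, ','.
TAG = re.compile(r"#[^,]+,")

def merge_same_tag(lines):
    """Append each line to the previous output line when both share a tag."""
    merged = []
    current_tag = None  # tag of the last output line, None if it has no tag
    for line in lines:
        match = TAG.match(line)
        tag = match.group(0) if match else None
        if tag is not None and tag == current_tag:
            merged[-1] += " " + line
        else:
            # A different tag, a blank line, or an untagged line starts anew.
            merged.append(line)
            current_tag = tag
    return merged

# The sample file from this answer, one list item per line.
lines = ["#Msg,00000", "#Msg,00001", "#Hello,00002", "",
         "#Hello,00003", "#What,00004", "#What,00005"]
result = merge_same_tag(lines)
print("\n".join(result))
```

Blank lines carry no tag, so they are kept as they are and always end the current group.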

PesaThe

This might work for you (GNU sed):

sed ':a;N;/^$/M!s/\n/ /;ta' file

Gather up lines, replacing each newline with a space, until an empty line is reached.

N.B. The M flag on the regexp /^$/ lets it match an empty line within a pattern space that contains multiple lines.

potong