I've read a couple of other questions about this, but none of the answers work for me. I'm trying to split a file such as A.txt on the delimiter "STOPHERE".

This is the code:

#!/bin/bash

awk 'BEGIN{
    RS = "STOPHERE"
    file = 0}
{
    file++
    print $0 > ("sepf" file)
}' A.txt

Input file A.txt:

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa      lwdjnuqqfqaaaaaaaaaa   qlknfqek fkgnl       efekfnwegelflfne
ldnwefne f STOPHEREsdfnkjnf nnnnnnnnnnnnnnnnnnnnnnnasd  fefffffffffffffflllo  

aldn3orn    STOPHERE

fknjke bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbowqff STOPHERE i
asfjfenf STOPHERE

Expected output files:

sepf1:

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa      lwdjnuqqfqaaaaaaaaaa   qlknfqek fkgnl       efekfnwegelflfne
ldnwefne f 

sepf2:

sdfnkjnf nnnnnnnnnnnnnnnnnnnnnnnasd  fefffffffffffffflllo  

aldn3orn  

sepf3:

    #line starts here
fknjke bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbowqff 

sepf4:

 i
asfjfenf 

So basically, the formatting between the STOPHERE markers has to stay exactly the same.

But for some reason, this is the kind of output I'm getting in some of the files:

Eg: sepf2

TOPHEREsdfnkjnf nnnnnnnnnnnnnnnnnnnnnnnasd  fefffffffffffffflllo  

aldn3orn

Any ideas as to why the "TOPHERE" remains?

Nematode7

1 Answer

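GNU awk allows RS to be a regular expression, so a multi-character string such as "STOPHERE" works as a record separator. Other awk implementations may use only the first character of RS (POSIX leaves the behavior of a multi-character RS unspecified), which would make "S" the separator and leave "TOPHERE" at the start of the next record, as in your output. Your code can also be simplified: an uninitialized awk variable evaluates to 0 in a numeric context, so file needs no explicit initialization. With GNU awk, this writes each record to its own file: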

awk -v RS="STOPHERE" '{print $0 > ("sepf" ++file)}' A.txt
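
If GNU awk is not available, the same split can be sketched without relying on a multi-character RS. The version below is only a rough sketch under a few assumptions: the function name split_on_stophere is made up for illustration, the whole file is assumed to fit in memory, and a final chunk that contains only whitespace (such as the newline after the last STOPHERE) is skipped because the expected output in the question shows four files. It uses printf "%s" instead of print so that no extra newline is appended to each piece.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for illustration): split the file given
# as $1 on the literal string STOPHERE into sepf1, sepf2, ... in the
# current directory.
split_on_stophere() {
    awk '
    # Collect the whole input, restoring the newline awk strips from each line.
    { buf = buf $0 "\n" }
    END {
        # A multi-character separator passed to split() is treated as a regex;
        # STOPHERE contains no metacharacters, so it matches literally.
        n = split(buf, part, "STOPHERE")
        for (i = 1; i <= n; i++) {
            # Assumption: a whitespace-only last chunk is not wanted as a file.
            if (i == n && part[i] ~ /^[ \t\n]*$/) continue
            out = "sepf" (++file)
            printf "%s", part[i] > out   # no added newline, text kept as-is
            close(out)                   # avoid running out of open files
        }
    }' "$1"
}
```

Calling split_on_stophere A.txt should then produce sepf1 through sepf4 for the sample input in the question. For very large inputs the GNU awk one-liner is the better fit, since it processes one record at a time.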
Jay Rajput