I have a file with the following lines:

string
string
string
MODEL 1
.
.
.
TER
string 
string
string
MODEL 2
.
.
.
TER

where there are 5000 such MODELs. I want to split this file such that each section beginning MODEL X and ending TER (shown with dots) is saved to its own file, and everything else is discarded. How can I do this? Possibly with awk or split?

I have checked a couple of other similar questions, but failed to apply the answers to my case.

Also note that I use Mac OS X.

Toby Speight
sodiumnitrate

2 Answers

You can use this awk for this:

awk '/^MODEL/{file="model" $2} file{print > file} /^TER/{close(file); file=""}' file

How it works:

/^MODEL/               # match lines starting with MODEL
file="model" $2        # make variable file as model + model_no from column 2
file{...}              # execute only if the file variable is set
{print>file}           # print each record to file
/^TER/                 # match lines starting with TER
{close(file); file=""} # close file and reset file to ""

Then verify as:

cat model1
MODEL 1
.
.
.
TER

cat model2
MODEL 2
.
.
.
TER
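
With 5000 models, printing each output file by hand is impractical. One rough sanity check, sketched here under the assumption that the input is named `file` and the outputs are named `model<N>` as in the command above, is to compare the number of `MODEL` lines in the input with the number of files produced. The scratch directory and the two-model sample input are made up for illustration only.

```shell
# Work in a scratch directory so stray model* files do not skew the count.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical two-model sample input, standing in for the real file.
printf '%s\n' string 'MODEL 1' a b TER string 'MODEL 2' c TER > file

# Same awk command as in the answer above.
awk '/^MODEL/{file="model" $2} file{print > file} /^TER/{close(file); file=""}' file

# Count the sections in the input and the files on disk; the two should agree.
expected=$(grep -c '^MODEL' file)
actual=$(ls model* | wc -l | tr -d ' ')   # tr strips the padding BSD wc adds
echo "expected=$expected actual=$actual"
```

If the two numbers differ on the real input, a likely cause is repeated model numbers, since a later section with the same number overwrites the earlier file.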
anubhava
This works even with dash:

# Reads the input on standard input, e.g.: sh split_models.sh < file
# (split_models.sh is only an example name for this script).
out=
while IFS= read -r line; do
    case $line in
        MODEL\ *) out="MODEL_${line#MODEL }"; : > "$out" ;;   # start a new output file
    esac
    # Copy the line only while inside a MODEL ... TER section.
    [ -n "$out" ] && printf '%s\n' "$line" >> "$out"
    case $line in
        TER*) out= ;;                                         # section finished
    esac
done
theoden8