I am trying to extract the To header from an email file using sed on Linux.
The problem is that the To header can span multiple lines, e.g.:
To: name1@mydomain.org, name2@mydomain.org,
    name3@mydomain.org, name4@mydomain.org,
    name5@mydomain.org
Message-ID: <46608700.369886.1549009227948@domain.org>
I tried the following:
sed -n -e '/^[Tt]o: / { N; p; }' _message_file_ |
awk '{$1=$1;printf("%s ",$0)};NR%2==0{print ""}'
The sed command prints the line starting with "To: " and the line after it. I pipe the output to awk to join everything onto a single line.
The full command prints a single line:
To: name1@mydomain.org, name2@mydomain.org, name3@mydomain.org, name4@mydomain.org
I don't know how to keep going from here: how to test whether each following line starts with whitespace and, if it does, append it to the result.
What I want is all the addresses:
To: name1@mydomain.org, name2@mydomain.org, name3@mydomain.org, name4@mydomain.org, name5@mydomain.org
Any help will be appreciated.
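One possible direction, offered as a sketch rather than a definitive answer: since the pipeline already uses awk, awk alone can collect the To line plus every following line that starts with whitespace, instead of sed's single N. The function name extract_to is hypothetical, and the sketch assumes LF line endings and that continuation lines begin with a space or tab.

```shell
# Hypothetical helper: print the To header of a message file on one line.
# Assumes LF line endings and that folded lines start with a space or tab.
extract_to() {
    awk '
        /^$/ { exit }                    # blank line: end of the header block
        in_to && /^[ \t]/ {              # continuation line of the To header
            sub(/^[ \t]+/, "")           # drop the leading whitespace
            hdr = hdr " " $0             # append to what was collected so far
            next
        }
        in_to { exit }                   # first non-continuation line ends it
        /^[Tt]o:/ { hdr = $0; in_to = 1 }  # start of the To header
        END { if (in_to) print hdr }     # runs after exit as well
    ' "$1"
}
```

It would be called as extract_to _message_file_. Stopping at the first blank line keeps a "To:" that happens to appear in the message body from being picked up.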