Total noob here with both bash and .eml files, so bear with me...
I have a folder with many saved .eml files, and I want a bash script (if this is not possible with bash, I'm willing to use python, or zsh, or maybe perl--never used perl before, but it may be good to learn) that will print the email content after a line containing a specific textual phrase, and before the next empty line.
I also want this script to combine consecutive lines ending in "=". (Lines that do not end with an "=" sign should continue printing on a new line.)
All of my testing with .txt files that I create manually works fine, but when I use an actual .eml file, things stop working.
Here is a portion of a sample .eml file:
(.eml file continues above)
Content-Type: text/plain; charset="Windows-1252"
Content-Transfer-Encoding: quoted-printable
testing
StartLine (This is where stuff begins)
This is a line that should be printed.
This is a long line that should be printed. Soooooooooooooooooooooooooooooo=
Loooooooooooooooooooooooonnnnnnnnnggggg. Soooooooooooooooooooooooooooooo L=
oooooooooooooooooooooooonnnnnnnnnggggg. Soooooooooooooooooooooooooooooo Loo=
oooooooooooooooooooooonnnnnnnnnggggg.
This is where things should stop (no more printing)
Don=92t print me please!
Don=92t print me please!
Don=92t print me please!
[This message is from an external sender.]
(.eml file continues below)
I want the script to output:
This is a line that should be printed.
This is a long line that should be printed. Soooooooooooooooooooooooooooooo Loooooooooooooooooooooooonnnnnnnnnggggg. Soooooooooooooooooooooooooooooo Loooooooooooooooooooooooonnnnnnnnnggggg. Soooooooooooooooooooooooooooooo Loooooooooooooooooooooooonnnnnnnnnggggg.
Here is my script so far:
#!/bin/bash
files="/Users/username/Desktop/emails/*"
specifictext="StartLine"
for f in $files
do
    begin=false
    previous=""
    while read -r line
    do
        if [[ -z "$line" ]] #this doesn't seem to be working right
        then
            begin=false
        fi
        if [[ "$begin" = true ]]
        then
            if [[ "${line:0-1}" = "=" ]] #this also doesn't appear to be working
            then
                previous=$previous"${line::${#line}-1}"
            else
                echo $previous$line
            fi
        fi
        if [[ $line = "$specifictext"* ]]
        then
            begin=true
        fi
    done < "$f"
done
This will successfully skip everything up to and including the line containing $specifictext, but then it will print off the entire remainder of each email instead of stopping at the next empty line. Like this:
$ ./printeml.sh
This is a line that should be printed.
This is a long line that should be printed. Soooooooooooooooooooooooooooooo=
Loooooooooooooooooooooooonnnnnnnnnggggg. Soooooooooooooooooooooooooooooo L=
oooooooooooooooooooooooonnnnnnnnnggggg. Soooooooooooooooooooooooooooooo Loo=
oooooooooooooooooooooonnnnnnnnnggggg.
This is where things should stop (no more printing)
Don=92t print me please!
Don=92t print me please!
Don=92t print me please!
[This message is from an external sender.]
(continues printing remainder of .eml)
As you can see above, the other issue I'm having is that I wanted to combine lines that end with an "=" sign, but that is not working either. Everything works fine with the test files I create by hand, and fails only when I use an actual .eml file. I think this is an issue with hidden characters in .eml files, but I'm not really sure how that works.
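One hidden character that would explain both symptoms is a carriage return: mail messages are commonly stored with CRLF line endings, so `read -r` would leave a trailing CR on every line. That is an assumption about these files, not something the sample confirms. The sketch below simulates such lines and shows why both tests would fail; `strip_cr` is a hypothetical helper name.

```shell
#!/bin/bash
# strip_cr is a hypothetical helper: it removes one trailing carriage
# return (CR, \r) from its argument, if there is one.
strip_cr() {
    printf '%s' "${1%$'\r'}"
}

# Simulate two lines as `read -r` would return them from a CRLF file
# (assumption: the real .eml files use CRLF line endings).
blank=$'\r'
soft=$'some text=\r'

# Raw: the "blank" line is not empty, and the last character is CR, not "=".
if [[ -z "$blank" ]]; then echo "blank, raw: empty"; else echo "blank, raw: not empty"; fi
if [[ "${soft:0-1}" = "=" ]]; then echo "soft, raw: ends with ="; else echo "soft, raw: does not end with ="; fi

# Stripped: both tests now behave the way the script expects.
blank=$(strip_cr "$blank")
soft=$(strip_cr "$soft")
if [[ -z "$blank" ]]; then echo "blank, stripped: empty"; fi
if [[ "${soft:0-1}" = "=" ]]; then echo "soft, stripped: ends with ="; fi
```

To check whether a real file has this problem, dumping it with `od -c` shows any carriage return as `\r` just before each `\n`.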
I'm using bash version 3.2.57(1) on macOS 12.4.
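A minimal sketch of one possible fix that sticks to features available in bash 3.2. It assumes the failures come from a trailing carriage return on each line (CRLF line endings), which the sample does not confirm. `extract_block` is a hypothetical function name; it reads one message on stdin, strips the CR, joins lines that end in "=" and resets the buffer after each printed line. It does not decode sequences such as `=92`.

```shell
#!/bin/bash
# Sketch: print the lines that follow a marker line, up to the next blank
# line, joining quoted-printable soft line breaks (a trailing "=").
# extract_block is a hypothetical name; the message is read from stdin.
extract_block() {
    local marker=$1 line begin=false previous=""
    # IFS= keeps leading/trailing spaces; the "||" part keeps a final
    # line that has no terminating newline.
    while IFS= read -r line || [[ -n "$line" ]]; do
        line=${line%$'\r'}                  # drop the CR left by CRLF endings
        if [[ "$begin" = true ]]; then
            if [[ -z "$line" ]]; then       # blank line ends the block
                if [[ -n "$previous" ]]; then
                    printf '%s\n' "$previous"
                    previous=""
                fi
                begin=false
            elif [[ "$line" == *"=" ]]; then
                previous=$previous${line%=} # soft break: hold text, keep reading
            else
                printf '%s\n' "$previous$line"
                previous=""                 # reset so old text is not repeated
            fi
        elif [[ "$line" == "$marker"* ]]; then
            begin=true                      # start printing from the next line
        fi
    done
    # Flush anything still held if the input ended in the middle of a block.
    if [[ -n "$previous" ]]; then
        printf '%s\n' "$previous"
    fi
}

# Demo on simulated CRLF input.
printf 'StartLine here\r\nfirst\r\nlong =\r\nline\r\n\r\nnot printed\r\n' |
    extract_block "StartLine"
```

The demo prints `first` and then `long line`. For real files, the function could be called once per file, for example `extract_block "StartLine" < "$f"` inside the existing `for` loop. The join is a plain concatenation, so a space appears between the joined parts only if it is present in the file before the "=".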