
After browsing and using the solutions on this great site for some time, it is finally time for me to participate.

I have a pretty clear concept of what I want, but I am searching for the nicest way to get there.

What do I want?

For some time now, I have been running an email server on a Raspberry Pi, and it has worked great so far. It consists of a Dovecot server and some Sieve filters that sort the mail for my many email addresses into separate inbox subdirectories. There is also a spam filter that is taught the difference between ham and spam every night by a script. (Basically, it is taught that the spam is in the Junk folder and that every other folder contains ham.)

I would like to replicate this behavior for a dedicated "Newsletter" folder. This folder contains no urgent messages that need to be viewed or reported immediately.

The plan is to manually move emails into the "news" folder and have a script scan this folder once per day. If it finds an email from an address that has no Sieve rule yet, it should create a rule that automatically files mail from this address into the "news" folder on arrival.
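
A minimal sketch of the daily scan, assuming Python 3 and a folder stored in Maildir format. It uses the standard-library mailbox and email.utils modules; the function name folder_senders is made up for illustration and is not part of the setup described here.

```python
import mailbox
from email.utils import parseaddr


def folder_senders(maildir_path):
    """Return the set of sender addresses found in one Maildir folder."""
    senders = set()
    # create=False: raise an error instead of silently creating a missing folder
    box = mailbox.Maildir(maildir_path, create=False)
    for message in box:
        # parseaddr copes with both "Name <address>" and a bare "address"
        _name, address = parseaddr(str(message.get("From", "")))
        if address:
            # lowercased so that later comparisons ignore case
            senders.add(address.lower())
    return senders
```

Calling folder_senders on the directory that holds the folder's cur, new and tmp subdirectories would return every sender address seen there. Lowercasing rests on the assumption that the Sieve address test compares case-insensitively, which is the behavior of its default comparator.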

Steps to realization:

  • For this, the script would need to scan the existing .dovecot.sieve file and extract the addresses from the "news" folder rule into a separate file or object for comparison.

    /*Example of a sieve filter:*/
    
    require "fileinto";
    
     /* Global Spam Filter */
    if anyof (header :contains "subject" "*SPAM*",
              header :contains "X-Spam-Flag" "YES" ) {
      fileinto "Junk";
      stop;
    }
    
    /* LAN Emails Filter */
      elsif address :is "to" "lan@docbrown.pi" {
      fileinto "INBOX.Lokal";
      stop;
    }
    
    /* Newsletter Filter */
      elsif anyof (address :is "from" "newsletter@example.com",
                   address :is "from" "news@yahoo.de",
                   address :is "from" "info@mailbox.de",
                   address :is "from" "something@somewhere.de") {
      fileinto "INBOX.Newsletter";
      stop;
    }
    
     /* gmail Account Filter */
      elsif address :is "to" "docbrown@gmail.com" {
      fileinto "INBOX.gmail";
      stop;
    }
    
     /* Yahoo Account Filter */
      elsif address :is "to" "docbrown@yahoo.de" {
      fileinto "INBOX.yahoo";
      stop;
    }
    
      else {
      # The rest goes into INBOX
      # default is "implicit keep", we do it explicitly here
      keep;
    }
    
  • Then it would need to process all emails in the Maildir directory of the "news" folder and search each one for the "From: " field and the email address enclosed in angle brackets:

    Date: Mon, 4 Nov 2013 16:38:30 +0100 (CET)
    From: Johannes Ebert - Redaktion c't <infoservice@heise.de> 
    To: docbrown@example.de
    
  • It would then compare them with the addresses extracted from the Sieve file and, if an address has no filter rule
    (i.e. is not found in the list), create one for it (or simply add it to the extracted addresses).

  • After all emails are processed, a new ruleset for the "news" folder would be created from the
    file of extracted email addresses, and the existing .dovecot.sieve would be replaced by the new one (the old
    one would be backed up first, just in case).
  • Maybe a Dovecot restart would also be needed afterwards to read in the new rules?
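
The first and third steps above could be sketched roughly as follows, assuming Python 3 and a rule laid out like the example filter (a "Newsletter Filter" comment, then address :is "from" tests, then the fileinto line). The names FROM_TEST, rule_addresses and missing_addresses are hypothetical.

```python
import re

# Matches one test of the form: address :is "from" "someone@example.com"
FROM_TEST = re.compile(r'address\s+:is\s+"from"\s+"([^"]+)"')


def rule_addresses(sieve_text, marker="Newsletter Filter",
                   folder="INBOX.Newsletter"):
    """Return the sender addresses listed between the rule's comment
    line and its fileinto line, in file order."""
    start = sieve_text.find(marker)
    if start == -1:
        return []
    end = sieve_text.find(folder, start)
    if end == -1:
        return []
    return FROM_TEST.findall(sieve_text[start:end])


def missing_addresses(known, seen):
    """Return the addresses in `seen` that have no rule yet, sorted.
    The comparison ignores case."""
    known_lower = {address.lower() for address in known}
    return sorted(a for a in set(seen) if a.lower() not in known_lower)
```

Feeding the content of .dovecot.sieve to rule_addresses and the senders found in the folder to missing_addresses would give the list of addresses that still need a rule. Using a set difference instead of a nested loop keeps the comparison short and avoids duplicates.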

Progress so far:

I tried to get this to work using only shell commands and utilities. This got me close to the point where I could extract the email addresses from the dovecot.sieve file, but it was too complicated for my taste and took some time.

#!/bin/sh

cp /home/mailman/.dovecot.sieve /home/mailman/autosieve/dovecot.sieve_`date +backup_%d%m%Y`
#echo "" > search.txt

# grep -n prints "NUMBER:text"; cut keeps only the line number
X=$(grep -n "Newsletter Filter" /home/mailman/.dovecot.sieve | cut -d: -f1) #get rule start line number
Y=$(grep -n "INBOX.Newsletter" /home/mailman/.dovecot.sieve | cut -d: -f1) #get rule end line number
X=$((X + 1))  #increment to go into the next line
Y=$((Y - 1))  #decrement to go into the previous line
sed -n "${X},${Y}p" /home/mailman/.dovecot.sieve > /home/mailman/search.txt  #copy lines into separate search_file
awk -F '"' '{ if ($2 != "") print $4 }' /home/mailman/search.txt > /home/mailman/adressen.txt # filter addresses and export to separate file

So I wondered whether I could get there more easily, maybe by using Python. I tinkered with it in another Raspberry Pi project but did not have the time to fully immerse myself in the Python universe.

So I would be happy about a bit of help, advice, or a pointer in the right direction.

So far I have found some solutions for a similar problem (for the first part), where an extraction was needed, but I could not fully adapt them, or I made some mistakes, as I could not execute the script.

#!/usr/bin/python

rule = {}
current_rule = None

with open("dovecot.sieve", "r") as sieve_file:
    for line in sieve_file:
        # the fileinto line marks the end of the Newsletter rule
        if "INBOX.Newsletter" in line:
            break
        # the comment line marks the start of the Newsletter rule
        if "Newsletter Filter" in line:
            current_rule = rule.setdefault('Newsletter', [])
            continue
        # split on double quotes: parts[1] is the header name, parts[3] the address
        parts = line.split('"')
        if current_rule is not None and len(parts) > 4 and parts[1] == "from":
            current_rule.append(parts[3])

# Now print out all the data
import pprint
print "whole array"
print "=============================="
pprint.pprint(rule)
print
print "addresses found"
print "=========================="
pprint.pprint(rule.get('Newsletter', []))

Could someone also recommend an IDE for Python, with a debugger and so on? Eclipse comes to mind, or is there anything else (maybe less resource-hungry)?

DocBrown

1 Answer


OK, so I got some spare time to tackle my own question. I did some digging around, read some code snippets, and tested things out in Eclipse with PyDev.

Now I run this script as a cron job at night.

What does it do?

It collects all the email addresses in the dovecot.sieve file (well, the ones in the "Newsletter" ruleset). Then it looks in the INBOX.Newsletter folder for any unregistered email addresses by comparing them with the collected addresses. If it finds a new address, it saves a copy of the old Sieve file and then rewrites the existing file. The new email addresses are inserted into the "Newsletter" ruleset, so these emails get filed into the designated Newsletter folder.

#!/usr/bin/python2.7

import os, sys
#Get the already configured email senders...
addresses = {}
current_addresses = None

with open("/home/postman/.dovecot.sieve", "r") as sieveconf:
    for line in sieveconf:
        if "INBOX.Newsletter" in line:
            break

        if "Newsletter Filter" in line:
            current_addresses = addresses.setdefault('found', [])
            continue

        if "from" in line and current_addresses is not None:
            line = line.split('"')

            if (len(line) > 4) and (line[1] == "from"):
                current_addresses.append(line[3])

                continue

#save the count for later; fall back to an empty list if no Newsletter rule was found
current_addresses = addresses.setdefault('found', [])
addr_num = len(current_addresses)

#iterate all files in all sub-directories of INBOX.Newsletter
for root, _,files in os.walk("/home/postman/Mails/.INBOX.Newsletter"):
    #for each file in the current directory
    for emaildir in files:
        #open the file
        with open(os.path.join(root, emaildir), "r") as mail:
            #scan line by line
            for line in mail:
                if line.startswith("From: "):
                    #skip senders that are not written as "Name <address>"
                    if "<" not in line or ">" not in line:
                        break
                    #extract the address between the angle brackets
                    found = line.split('<')[1].split('>')[0]
                    #add the address if it is not already in the list
                    if found not in addresses['found']:
                        addresses['found'].append(found)
                    break


# Now print out all the data
#import pprint
#print "addresses found:"
#print "=========================="
#pprint.pprint(addresses['found'])
#print
#print "orig_nmbr_of_addresses:" , addr_num
#print "found_nmbr_of_addresses:", len(addresses['found'])
#print "not_recorded_addresses:", (len(addresses['found']) - (addr_num))

#Compare if the address count has changed
if addr_num == len(addresses['found']):
    #exit the script since no new addresses have been found
    sys.exit()
else:
    #copy original sieve file for backup
    import datetime
    from shutil import copyfile
    #assumption: the file edited here is the same one that was read at the top of the script
    sievefile = '/home/postman/.dovecot.sieve'
    backupfilename = '.backup_%s.sieve'% datetime.date.today()
    copyfile(sievefile, backupfilename)

    #edit the existing sieve file and add the new entries
    import fileinput
    #open file for in place editing
    for line in fileinput.input(sievefile, inplace=1):
        #if the line before the last entry is reached
        #(this needs at least two addresses in the existing rule)
        if addresses['found'][(addr_num - 2)] in line:
            #print the line
            print line,
            #put new rules before the last line (just to avoid extra handling for last line, since the lines before are rather identical)
            for x in range (addr_num, (len(addresses['found']))):
                print '               address :is "from" "%s",'% addresses['found'][x]
        else:
            #print all other lines
            print line,
DocBrown