Any way to filter email dynamically by taking 'from' and matching it with database ? (Using procmail or Virtualmin or Webmin)

Question

I basically want to check the incoming 'From' in the email received and then either

Keep it and make it deliver to the intended mailbox if the email matches a Specified MySQL/PostgreSQL Database User (eg. select email from users where exists ('from email address') )
If the 'From' address is blank or it is not found in the database, the email should be discarded

Any way I can achieve this before the e-mail is delivered to the intended mailbox?

I am using Procmail + Virtualmin + Webmin + PostgreSQL

PS: I want to apply this filter not to the wole server but to some specified mailboxes/users (i'm assuming 1 user = 1 mailbox here)

Sometimes people ask a simple innocent question about `From:` and only then realize that the actual ground truth about who really sent a message can be quite a bit more complex in SMTP. For example, many mailing lists display the original `From:` header from when the message was originally submitted to the list, but the actual `Sender:` is the mailing list. It's not clear from your question what you are hoping to accomplish, so I'll just point out that the answer could be quite different if you need to distinguish a number of corner cases like this. — tripleee, May 01 '20 at 07:57
I notice that you edited to add tags for both MySQL and Postrgres. Does that mean you are satisfied with any solution for one, or the other? In other words, do you not have an existing database which this solution needs to interface to? Please [edit] the question itself to clarify your requirements. — tripleee, May 03 '20 at 06:10
Also, does "intended mailbox" mean whichever address Procmail is already in the process of delivering to, or somehow the output from the SQL query? The latter will be somewhat more complex, and probably needs to be spelled out in some more detail. — tripleee, May 03 '20 at 06:36
Actually what I want to do is check if the e-mail is currently in the existing Database (Portgres or MySQL have same queries) and then block all emails if the output of the search is null (i.e the email is not existing in the database. — gb.123, May 03 '20 at 17:58
I am currently trying to make your suggestion work and test. I believe you are in the right direction but I myself need to read further on how to configure and use procmail. — gb.123, May 03 '20 at 17:59
PS: I am using Postgres SQL btw, not MySQL, but both commands should be similar. — gb.123, May 03 '20 at 18:06
The SQL should be similar but the details of the two clients are different in what options they accept etc. The `psql` command in my answer uses variable interpolation in a way which is probably not portable to other SQL clients, for example. — tripleee, May 04 '20 at 04:42
you are correct. I have removed Mysql from my question since I need psql only. — gb.123, May 05 '20 at 22:48

tripleee · Answer 1 · 2020-05-11T08:44:27.507

Procmail can easily run an external command in a condition and react to its exit status. How exactly to make your particular SQL client set its exit code will depend on which one you are using; perhaps its man page will reveal an option to make it exit with an error when a query produces an empty result set, for example? Or else write a shell wrapper to look for empty output.

A complication is that Procmail (or rather, the companion utility formail) can easily extract a string from e.g. the From: header; but you want to reduce this to just the email terminus. This is a common enough task that it's easy to find a canned solution - generate a reply and then extract the To: address (sic!) from that.

FROM=`formail -rtzxTo:`

:0
* FROM ?? ^(one@example\.com|two@site\.example\.net|third@example\.org)$
{
  :0
  * ? yoursql --no-headers --fail-if-empty-result \
      --batch --query databasename \
      --eval "select yada yada where address = '$FROM'"
  { }
  :0E
  /dev/null
}

The first condition examines the variable and succeeds if it contains one of the addresses (my original answer simply had the regex . which matches if the string contains at least one character, any character; I'm not convinced this is actually necessary or useful; there should be no way for From: to be empty). If it is true, Procmail enters the braces; if not, they will be skipped.

The first recipe inside the braces runs an external command and examines its exit code. I'm imagining your SQL client is called yoursql and that it has options to turn off human-friendly formatting (table headers etc) and for running a query directly from the command line on a specific database. We use double quotes so that the shell will interpolate the variable FROM before running this command (maybe there is a safer way to pass string variables which might contain SQL injection attempts with something like --variable from="$FROM" and then use that variable in the query? See below.)

If there is no option to directly set the exit code, but you can make sure standard output is completely empty in the case of no result, piping the command to grep -q . will produce the correct exit code. In a more complex case, maybe write a simple Awk script to identify an empty result set and set its exit status accordingly.

Scraping together information from https://www.postgresql.org/docs/current/app-psql.html, How do you use script variables in psql?, Making an empty output from psql, and from your question, I end up with the following attempt to implement this in psql; but as I don't have a Postgres instance to test with, or any information about your database schema, this is still approximate at best.

* ? psql --no-align --tuples-only --quiet \
    --dbname=databasename --username=something --no-password \
    --variable=from="$FROM" \
    --command="select email from users where email = :'from'" \
    | grep -q .

(We still can't use single quotes around the SQL query, to completely protect it from the shell, because Postgres insists on single quotes around :'from', and the shell offers no facility for embedding literal single quotes inside single quotes.)

The surrounding Procmail code should be reasonably self-explanatory, but here goes anyway. In the first recipe inside the braces, if the condition is true, the empty braces in its action line are a no-op; the E flag on the next recipe is a condition which is true only if any of the conditions on the previous recipe failed. This is a common idiom to avoid having to use a lot of negations; perhaps look up "de Morgan's law". The net result is that we discard the message by delivering it to /dev/null if either condition in the first recipe failed; and otherwise, we simply pass it through, and Procmail will eventually deliver it to its default destination.

The recipe was refactored in response to updates to your question; perhaps now it would make more sense to just negate the exit code from psql with a ! in front:

FROM=`formail -rtzxTo:`

:0
* FROM ?? ^(one@example\.com|two@site\.example\.net|third@example\.org)$
* ! ? psql --no-align --tuples-only --quiet \
      --dbname=databasename --username=something --no-password \
      --variable=from="$FROM" \
      --command="select email from users where email = :'from'" \
    | grep -q .
/dev/null

Tangentially, perhaps notice how Procmail's syntax exploits the fact that a leading ? or a doubled ?? are not valid in regular expressions. So the parser can unambiguously tell that these conditions are not regular expressions; they compare a variable to the regex after ??, or examine the exit status of an external command, respectively. There are a few other special conditions like this in Procmail; arguably, all of them are rather obscure.

Newcomers to shell scripting should also notice that each shell command pipeline has two distinct results: whatever is being printed on standard output, and, completely separate from that, an exit code which reveals whether or not the command completed successfully. (Conventionally, a zero exit status signals success, and anything else is an error. For example, the exit status from grep is 0 if it finds at least one match, 1 if it doesn't, and usually some other nonzero exit code if you passed in an invalid regular expression, or you don't have permission to read the input file, etc.)

For further details, perhaps see also http://www.iki.fi/era/procmail/ which has an old "mini-FAQ" which covers several of the topics here, and a "quick reference" for looking up details of the syntax.

I'm not familiar with Virtualmin but https://docs.virtualmin.com/Webmin/PostgreSQL_Database_Server shows how to set up Postgres and as per https://docs.virtualmin.com/Webmin/Procmail_Mail_Filter I guess you will want to use the option to put this code in an include file.

I'm clearly not an SQL person; I am generalizing here based on very superficial familiarity with `mysql`, `sqlite`, and `psql` which IIRC all have options something like these. This could probably be improved if the question had more details about those parts. — tripleee, May 01 '20 at 06:23
If you need more help with the server configuration parts, probably ask on one of our sister sites for general computing questions; I'd explore [unix.se] and [sf] but check their respective help sections and look for duplicates before asking. — tripleee, May 03 '20 at 08:28
i believe you have given me the right direction. I am still learning how to configure and use procmail. I plan you upvote your answer once I can confirm that it works so that other members also know that its working and use it. :) Thank you once again. — gb.123, May 03 '20 at 18:02
I have another question in this regard. I want this filter to be applicable to a particular e-mail user only? can that be configured ? — gb.123, May 03 '20 at 18:17
It's easy to add or modify conditions. You could change the first condition line to `FROM ?? ^billg@microsoft\.com$` to only match on that specific address. (Remember this is a regex, not just a string.) Equivalently, just match the incoming message's headers, like `* ^From:(.*\<)?billg@microsoft\.com\>`. Actually, I updated the answer just now to show how to do this in response to your question edit. — tripleee, May 04 '20 at 04:45
Oh. I meant the following: eg. I have one@mail.com, two@mail.com & three@mail.com and I only want to apply filter to one@mail.com, then what do I do? But it seems I'll just use TO instead of From in your answer. Right ? (PS: one@mail.com.... and others are the mailboxes on my server, the ones supposed to be receiving the mail. — gb.123, May 04 '20 at 19:11
You can install this in the `.procmailrc` of each of those users. There is no really good way to examine the intended recipient for a message, and there could be more than one. `^TO_` will fire if any one of the recipients is among the _explicit_ recipients of the message, but that won't be true e.g. for mailing lists or `Bcc:` messages; and then it will obviously fire regardless of whether the current message is actually being delivered to someone else (like if two of your users are subscribers of the same mailing list). — tripleee, May 05 '20 at 04:33
... (cont) ... But if you are currently executing `/home/user3/.procmailrc` then you absolutely know that this is a delivery to `user3`. — tripleee, May 05 '20 at 04:34
I placed the file but the mail is not getting filtered. How do I check if Procmail is actaully processing this file or not ? maybe I placed it wrong or maybe the command is giving an error ? any way to test ? — gb.123, May 05 '20 at 22:05
To give you more details : the home directory is /home/user/ & mail directory is /home/user/Maildir. The literal location of the directory is : /home/domain/homes/user/ (I have placed the file under /home/domain/homes/user/.procmail — gb.123, May 05 '20 at 22:33
The first step would be to enable logging, but I don't know how that appears within Virtualmin. For a start, have a look at http://www.iki.fi/era/mail/procmail-debug.html — tripleee, May 06 '20 at 04:08
Ok. I made Procmail work. The mails get delivered to /dev/null and the same is recorded in the log file..... Now after trying your recipe, its not seeming to work and all mails (even new ones are getting delivered) any suggestions on how to diagnose your recipe ? — gb.123, May 06 '20 at 21:35
problem lies in 'where exists (:from)...' this does not seem to work in Postgresql — gb.123, May 06 '20 at 23:21
Well I just copied your SQL code on the assumption that it was working for you. Let me see if I can figure out a working query. — tripleee, May 07 '20 at 05:41
I updated the query now, but I still don't have an environment where I can test this. If you still can't figure it out, maybe post a separate question with a small demo database to test with. — tripleee, May 11 '20 at 08:45

Any way to filter email dynamically by taking 'from' and matching it with database ? (Using procmail or Virtualmin or Webmin)

1 Answers1