I'm slowly fine-tuning my Sieve filter. I noticed I was getting a lot of spam in Russian, so I decided to filter on the presence of Cyrillic in the subject. Three consecutive Cyrillic characters seemed like a reasonable test, and it works pretty well. Here's the line:
elsif header :regex "Subject" [ "[а-яА-Я]{3,}" ]
It's not ideal, because there are plenty of Cyrillic characters outside the А-Я range (ё and the Ukrainian letters і and ї, for example). Also, I'd like to do the same with CJK characters, and I'm not even sure how to begin with those.
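
To make the gap concrete, here is a minimal sketch in Python (not Sieve) that compares the range from the filter line with wider code point ranges. The wider ranges are assumptions chosen for illustration: the Unicode Cyrillic block (U+0400 to U+04FF), and for CJK the hiragana and katakana blocks, the main CJK Unified Ideographs block, and Hangul syllables. The names NARROW_CYRILLIC, WIDE_CYRILLIC, CJK and has_run are made up for this example, and the sketch says nothing about whether a given Sieve regex implementation accepts such ranges.

```python
import re

# The range from the filter line: А-Я and а-я, i.e. U+0410..U+044F only.
NARROW_CYRILLIC = re.compile(r"[а-яА-Я]{3,}")

# Assumption: the whole Unicode Cyrillic block, U+0400..U+04FF.
WIDE_CYRILLIC = re.compile(r"[\u0400-\u04FF]{3,}")

# Assumption: hiragana + katakana (U+3040..U+30FF), CJK Unified
# Ideographs (U+4E00..U+9FFF) and Hangul syllables (U+AC00..U+D7A3).
# This does not cover every CJK-related block.
CJK = re.compile(r"[\u3040-\u30FF\u4E00-\u9FFF\uAC00-\uD7A3]{3,}")


def has_run(pattern, subject):
    """Return True if subject contains a run of 3+ characters from pattern's class."""
    return pattern.search(subject) is not None


if __name__ == "__main__":
    samples = ["Привет", "її їжа", "ёёё", "特価セール", "Hello"]
    for subject in samples:
        print(
            subject,
            has_run(NARROW_CYRILLIC, subject),
            has_run(WIDE_CYRILLIC, subject),
            has_run(CJK, subject),
        )
```

The Ukrainian sample "її їжа" and the run "ёёё" slip past the А-Я range because ї (U+0457) and ё (U+0451) sit outside U+0410 to U+044F, while the block-wide range catches both.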
Is it possible in Sieve to specify a script as a character class? I've done it before in other regex implementations, but it seems to be handled differently, if at all, by different regex flavours.
Thanks, Ben