Is there a way to detect if unnecessary characters are added to strings to bypass spam detection?

Question

I'm building a simple spam classifier, and from a cursory look at my dataset, most spam messages put spaces inside "spammy" words, which I assume is meant to bypass spam classifiers. Here are some examples:

c redi t card
mort - gage

I would like to be able to take these and encode them in my dataframe as the correct words:

credit card
mortgage

I'm using Python by the way.

score 0 · Accepted Answer · answered Apr 23 '22 at 15:23

This depends a lot on whether you have a list of all spam words or not.

If you do have a list of spam words and you know that there are always only ADDED spaces (e.g. give me your cred it card in formation) but never MISSING spaces (e.g. give me yourcredit cardinformation), then you could use a simple rule-based approach:

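spam_words = {"credit card", "rolex"}
# Strip internal whitespace so "credit card" is compared as "creditcard".
spam_words_no_spaces = {"".join(s.split()) for s in spam_words}

sentence = "give me your credit car d inform ation and a rol ex"

tokens = sentence.split()
# Check every run of consecutive tokens, from single tokens up to the whole sentence.
for length in range(1, len(tokens) + 1):
    for start in range(len(tokens) - length + 1):
        t = tuple(tokens[start:start + length])
        if "".join(t) in spam_words_no_spaces:
            print(t)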

Which prints:

> ('rol', 'ex')
> ('credit', 'car', 'd')

So first create a set of all spam words, then remove all spaces from them to make the comparison easier (although you could adjust the method to match only correctly spaced spam words).

Then split the sentence into tokens and finally get all possible unique consecutive subsequences in the token list (including one-word sequences and the whole sentence without whitespace), then check if they're in the list of spam words.
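Since the question asks for the corrected words rather than just the matches, the same idea can be turned into a rewriting step. The following is only a sketch under the same assumptions (a known spam word list, only added spaces); the function name normalize_spam_words is made up for illustration:

```python
def normalize_spam_words(sentence, spam_words):
    """Replace split-up spam words in `sentence` with their canonical spelling."""
    # Map the space-free form to the canonical form, e.g. "creditcard" -> "credit card".
    canonical = {"".join(w.split()): w for w in spam_words}
    tokens = sentence.split()
    result = []
    i = 0
    while i < len(tokens):
        # Try the longest run of consecutive tokens starting at i first.
        for j in range(len(tokens), i, -1):
            joined = "".join(tokens[i:j])
            if joined in canonical:
                result.append(canonical[joined])
                i = j
                break
        else:
            # No run starting at i forms a spam word; keep the token unchanged.
            result.append(tokens[i])
            i += 1
    return " ".join(result)


print(normalize_spam_words(
    "give me your credit car d inform ation and a rol ex",
    {"credit card", "rolex"},
))
# prints: give me your credit card inform ation and a rolex
```

With pandas this could be applied to a text column via something like df["text"].apply(lambda s: normalize_spam_words(s, spam_words)), where the column name "text" is an assumption. Other separators such as the hyphen in "mort - gage" are not handled here and would have to be stripped before joining.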

If you don't have a list of spam words, your best chance would probably be to do general whitespace correction on the data. Check out Optical Character Recognition (OCR) post-correction, for which you can find some pretrained models. Also check out this thread, which talks about how to add spaces to spaceless text and even mentions a Python package for that. So in theory you could remove all spaces and then try to split the text again into meaningful words to increase the chance that the spam words are found. Generally your problem (and the opposite, missing whitespace) is called word boundary detection, so you might want to check some resources on that.
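To illustrate the "remove all spaces, then split again" idea, here is a minimal sketch using dynamic programming over a word list. It is not the package mentioned in that thread; the function name segment and the toy vocabulary are assumptions, and preferring the fewest words is just a simple heuristic (real tools typically score candidates by word frequency):

```python
def segment(text, vocabulary):
    """Split space-free `text` into vocabulary words, or return None if impossible."""
    # best[i] holds a segmentation of text[:i] with the fewest words, or None.
    best = [None] * (len(text) + 1)
    best[0] = []
    max_len = max(map(len, vocabulary))
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if best[start] is not None and word in vocabulary:
                candidate = best[start] + [word]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[len(text)]


# Hypothetical toy vocabulary; a real one would come from a word-frequency list.
vocabulary = {"give", "me", "your", "credit", "card", "information", "in", "for", "a"}
collapsed = "".join("give me your cred it card in formation".split())
print(segment(collapsed, vocabulary))
# prints: ['give', 'me', 'your', 'credit', 'card', 'information']
```

The re-split text can then be fed to the classifier as usual. Any word missing from the vocabulary makes the whole segmentation fail in this sketch, so a practical version would need a fallback for unknown words.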

Also, you should be aware that modern pretrained models, such as common transformer models, often use sub-token-level embeddings for unknown words, so they can relatively easily still combine what they learned for a split and a non-split version of a common word.

The thread you provided was very helpful, thank you! It worked flawlessly on my model. — kuntiechan, Apr 26 '22 at 06:05
