What is the most efficient way to compare two very large lists and keep only the elements of list A that do not contain any element of list B as a substring?
Example:
words = ['shoe brand', 'car brand', 'smoothies for everyone', ...]
filters = ['brand', ...]
# Desired result after applying the matching function
results = ['smoothies for everyone']
Somewhat similar questions have been asked before, but I'm currently dealing with 1M+ words and filters, which overloads a regular-expression approach. I used to run a simple 'filters[i] in words[j]' test inside nested while-loops, but that performs one substring check per (word, filter) pair, which seems awfully inefficient.
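For reference, the brute-force test I was using is roughly equivalent to the sketch below. The function name filter_words is only illustrative, and the lists are cut down to the items shown in the example above.

```python
def filter_words(words, filters):
    """Keep the words that contain none of the filters as a substring."""
    # Worst case: one substring test for every (word, filter) pair.
    return [w for w in words if not any(f in w for f in filters)]


words = ['shoe brand', 'car brand', 'smoothies for everyone']
filters = ['brand']

results = filter_words(words, filters)
print(results)  # ['smoothies for everyone']
```

This gives the right answer on small inputs, but the number of substring checks grows with len(words) * len(filters), which is what becomes impractical at 1M+ entries on each side.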