Let's say I have a list of keywords:
`keywords = ["history terms","history words","history vocab","history words terms","history vocab words","science list","science terms vocab","math terms words vocab"]`
And a list of main terms:
`main_terms = ["terms","words","vocab","list"]`
UPDATED to more clearly state the problem:
The script I'm writing removes near-duplicates from a long list of keywords. I've already managed to remove misspellings and slight variants (e.g. "hitsory terms", "history term").
My problem is that I'm looking for several terms in this list of keywords. Once I've found one of these terms in a keyword (e.g. "history terms"), every keyword that is identical except for a different term or combination of terms (e.g. "history vocab", "history words", "history words terms") should be considered a duplicate.
- It is OK to have multiple terms in a keyword (e.g. "math terms words vocab") as long as no other keyword is identical except for having fewer of the terms (e.g. "math terms words" or, ideally, a single term like "math vocab").
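To make the rule concrete, here is a rough sketch of the behaviour I'm after, not a finished solution. It assumes that keywords are whitespace-separated, that two keywords count as "identical" when the words left over after removing the main terms match in the same order, and that ties are broken by keeping whichever keyword appears first. The function name `dedupe_keywords` is just a placeholder of mine.

```python
def dedupe_keywords(keywords, main_terms):
    """Keep one keyword per 'base' (the words that are not main terms),
    preferring the keyword that contains the fewest main terms."""
    terms = set(main_terms)
    best = {}  # base tuple -> (number of main terms, keyword)
    for kw in keywords:
        words = kw.split()
        # Assumption: the base is every word that is not a main term, in order.
        base = tuple(w for w in words if w not in terms)
        n_terms = len(words) - len(base)
        # Strictly fewer terms wins; on a tie the earlier keyword is kept.
        if base not in best or n_terms < best[base][0]:
            best[base] = (n_terms, kw)
    # Dicts preserve insertion order, so results follow first appearance of each base.
    return [kw for _, kw in best.values()]


keywords = ["history terms", "history words", "history vocab",
            "history words terms", "history vocab words", "science list",
            "science terms vocab", "math terms words vocab"]
main_terms = ["terms", "words", "vocab", "list"]

result = dedupe_keywords(keywords, main_terms)
print(result)
# ['history terms', 'science list', 'math terms words vocab']
```

Note that under these assumptions a keyword with no main term at all (e.g. plain "history") would beat every variant that has one, which may or may not be what is wanted.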