As suggested in the comments and other answers, the in operator may be used to check whether a string is a substring of another string. For the example data in the question, using in is the simplest and fastest way to get the desired result.
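A minimal sketch of the substring approach follows. The question's lists are not reproduced here, so the contents of keywords and valid below are assumed sample data chosen to be consistent with the results shown later in this answer.

```python
# Assumed sample data; the question's actual lists may differ.
keywords = ['cat', 'planet']
valid = [
    '/item/page/cat-dog',
    '/item/page/animal-planet',
    '/item/page/variable',
]

# Keep each string that contains at least one keyword as a substring.
matches = [word for word in valid if any(keyword in word for keyword in keywords)]
print(matches)  # ['/item/page/cat-dog', '/item/page/animal-planet']
```

Note that a plain substring test would also accept a string such as '/item/page/catapult', because 'cat' occurs inside 'catapult'.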
If the requirement is to match '/item/page/cat-dog' but not '/item/page/catapult', that is, to match only the word 'cat' rather than any occurrence of the sequence c-a-t, then a regular expression may be used to do the matching.
The pattern to match a single word is '\bfoo\b', where '\b' marks a word boundary.
The alternation operator '|' is used to match one pattern or another; for example, 'foo|bar' matches 'foo' or 'bar'.
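To illustrate how the word boundary and alternation behave, here is a small sketch. The names word_rx and alt_rx and the sample strings are illustrative assumptions, not taken from the question.

```python
import re

# Word-boundary pattern for the single word 'cat'.
word_rx = re.compile(r'\bcat\b')

# Matches: '/' and '-' are non-word characters, so 'cat' stands alone.
hyphenated = word_rx.search('/item/page/cat-dog')
# No match: 'cat' is followed by the word character 'a'.
embedded = word_rx.search('/item/page/catapult')

# Alternation: match either whole word.
alt_rx = re.compile(r'\bcat\b|\bplanet\b')
found = alt_rx.findall('/item/page/cat-dog /item/page/animal-planet')

print(hyphenated is not None, embedded is None, found)
# True True ['cat', 'planet']
```

Bear in mind that the underscore counts as a word character, so '\bcat\b' does not match inside 'cat_dog'.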
Construct a pattern that matches the words in keywords; call re.escape on each keyword in case it contains characters that the regex engine might interpret as metacharacters.
>>> import re
>>> pattern = r'|'.join(r'\b{}\b'.format(re.escape(keyword)) for keyword in keywords)
>>> pattern
'\\bcat\\b|\\bplanet\\b'
Compile the pattern into a regular expression object.
>>> rx = re.compile(pattern)
Find the matches. Using filter is elegant:
>>> matches = list(filter(rx.search, valid))
>>> matches
['/item/page/cat-dog', '/item/page/animal-planet']
But it's common to use a list comprehension:
>>> matches = [word for word in valid if rx.search(word)]
>>> matches
['/item/page/cat-dog', '/item/page/animal-planet']
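The steps above could be collected into one helper, sketched below. The function name filter_by_keywords and the sample data are hypothetical, and the guard for an empty keyword list is an added assumption: an empty pattern matches every string, which is probably not what is wanted.

```python
import re


def filter_by_keywords(strings, keywords):
    """Return the strings that contain any keyword as a whole word."""
    if not keywords:
        # Joining nothing yields '', and an empty pattern matches everything.
        return []
    # Escape each keyword, wrap it in word boundaries, and join with '|'.
    pattern = '|'.join(r'\b{}\b'.format(re.escape(keyword)) for keyword in keywords)
    rx = re.compile(pattern)
    return [s for s in strings if rx.search(s)]


# Assumed sample data, including one string that should be rejected.
valid = [
    '/item/page/cat-dog',
    '/item/page/catapult',
    '/item/page/animal-planet',
]
print(filter_by_keywords(valid, ['cat', 'planet']))
# ['/item/page/cat-dog', '/item/page/animal-planet']
```

Compiling the pattern once inside the helper avoids rebuilding it for every string being tested.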