How can I do this:
    if 'class="*word*"' in html:
        print "True."
    else:
        print "False."
To use * as a wildcard char like in Linux?
If you just want Unix-style filename pattern matching, you can use the dedicated fnmatch module:
import fnmatch
words = ["testing", "wildcard", "in", "python"]
filtered = fnmatch.filter(words, 'p?thon')
# filtered = ['python']
filtered = fnmatch.filter(words, 'w*')
# filtered = ['wildcard']
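Applied to the check in the question, a minimal sketch might look like the following. The helper name contains_wildcard and the sample html string are made up for illustration. Note that fnmatch patterns must match the whole string, so the pattern is wrapped in an extra pair of * to get "in"-style substring behaviour, and fnmatchcase is used so the comparison stays case-sensitive on every platform.

```python
import fnmatch

def contains_wildcard(text, pattern):
    # Hypothetical helper: fnmatch matches the entire string, so wrap the
    # pattern in * on both sides to emulate a substring ("in") test.
    return fnmatch.fnmatchcase(text, '*' + pattern + '*')

# Sample input, invented for this example
html = '<div class="my-word-box">hello</div>'

print(contains_wildcard(html, 'class="*word*"'))  # True
print(contains_wildcard(html, 'class="*nope*"'))  # False
```

Be aware that * here also matches quote characters, so the pattern can match across attribute boundaries, and that ? and [ ] are special characters in fnmatch patterns too.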
If you want to do advanced pattern matching, use regular expressions.
You are going to want to look at the re module. It lets you match with a regular expression and accomplish the same thing as * does on the Linux command line; the regex equivalent of * is .* (any character, repeated any number of times).
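As a rough sketch of how that could look for the question's case (the function name has_class_containing and the sample html string are invented for this example), the wildcard is written as [^"]* rather than .* so that it cannot run past the closing quote of the class attribute:

```python
import re

def has_class_containing(html, word):
    # Hypothetical helper: [^"]* plays the role of the shell's *,
    # but stops at the attribute's closing quote.
    pattern = r'class="[^"]*' + re.escape(word) + r'[^"]*"'
    return re.search(pattern, html) is not None

# Sample input, invented for this example
html = '<div class="some-word-here">text</div>'

print(has_class_containing(html, "word"))     # True
print(has_class_containing(html, "missing"))  # False
```

re.escape is used so that any special regex characters in the word are matched literally.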
You could use a regular expression from the re module for general-purpose pattern matching.
However, if you are working with HTML and trying to match tags and such, I would recommend looking into lxml and using its parse function and cssselect method to get what you want.
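from lxml import html
# read in and parse the HTML; filename is the path to your HTML file
html_doc = html.parse(filename).getroot()
# get elements that match class "classname" (the selector must be a string);
# cssselect() needs the separate cssselect package to be installed
elements = html_doc.cssselect('.classname')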
This doc describes the different CSS Selectors