2

How can I do this:

if 'class="*word*"' in html:
    print("True.")
else:
    print("False.")

That is, how can I use * as a wildcard character, like in a Linux shell?

SiHa
Bytez
    And if you follow that trail, you'll realize that [you shouldn't be parsing HTML that way anyway](https://stackoverflow.com/q/1732348/102441) – Eric Aug 24 '17 at 15:09

3 Answers

4

If you just want Unix-style filename pattern matching, you can use the dedicated fnmatch module:

import fnmatch
words = ["testing", "wildcard", "in", "python"]
filtered = fnmatch.filter(words, 'p?thon')
# filtered = ['python']
filtered = fnmatch.filter(words, 'w*')
# filtered = ['wildcard']

If you want to do advanced pattern matching, use regular expressions.
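Applied to the substring test from the question, the fnmatch approach might look like the sketch below. The sample `html` string is made up for illustration, since the question does not show the real input. Note that `*` here can span any characters, including quotes, so the pattern may match across attribute boundaries.

```python
import fnmatch

# Hypothetical sample input; the question's real `html` string is not shown.
html = '<div class="my-keyword-box">hello</div>'

# Leading and trailing * let the pattern match anywhere inside the string.
pattern = '*class="*word*"*'

# fnmatchcase() always compares case-sensitively; fnmatch.fnmatch() may
# normalize case depending on the operating system.
found = fnmatch.fnmatchcase(html, pattern)
print(found)

# translate() shows the regular expression that fnmatch builds from the pattern.
print(fnmatch.translate(pattern))
```

The exact output of `translate()` differs between Python versions, so it is printed here only for inspection.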

filaton
1

Take a look at the re module. Regular expressions let you accomplish the same thing that * does on the Linux command line; the regex equivalent of * is .* (any character, repeated zero or more times).
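As a minimal sketch of that idea for the check in the question, assuming a made-up sample `html` string: instead of `.*`, the pattern uses `[^"]*`, which acts like the shell's `*` but cannot cross the closing quote of the attribute.

```python
import re

# Hypothetical sample input standing in for the question's `html` string.
html = '<div class="my-keyword-box">hello</div>'

# [^"]* plays the role of the shell's *, but stops at the closing quote,
# so the match stays inside a single attribute value.
pattern = re.compile(r'class="[^"]*word[^"]*"')

if pattern.search(html):
    print("True.")
else:
    print("False.")
```

With `.*` in place of `[^"]*`, a string such as `class="other" title="word"` would also match, which is probably not what is wanted.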

galamdring
0

You could use a regular expression from the re module for general-purpose pattern matching.

However, if you are working with HTML and trying to match tags, classes, and the like, I would recommend looking into lxml and using its parse function together with cssselect to get what you want.

from lxml import html

# read in and parse the HTML file
html_doc = html.parse(filename).getroot()

# get elements that match class "classname" (needs the cssselect package installed)
elements = html_doc.cssselect(".classname")

This doc describes the different CSS Selectors
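If installing lxml is not an option, a similar wildcard check on class names can be sketched with the standard library's `html.parser` module instead. This is a substitute technique, not the lxml approach above; the `ClassMatcher` name and the sample markup are hypothetical.

```python
import fnmatch
from html.parser import HTMLParser


class ClassMatcher(HTMLParser):
    """Collects class names that match a shell-style wildcard pattern."""

    def __init__(self, pattern):
        super().__init__()
        self.pattern = pattern
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        for name, value in attrs:
            if name == "class" and value:
                # A class attribute may hold several space-separated names.
                for class_name in value.split():
                    if fnmatch.fnmatchcase(class_name, self.pattern):
                        self.matches.append(class_name)


# Hypothetical sample markup for illustration.
sample = '<div class="my-keyword-box note"><p class="plain">text</p></div>'

matcher = ClassMatcher("*word*")
matcher.feed(sample)
print(bool(matcher.matches))
```

Because the parser only looks at real class attributes, text that merely resembles `class="..."` inside the page content is not matched.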

Austin Yates