I am trying to remove HTML tags from a string. I realize that regular expressions are a better solution, but I'd like to figure out what is going wrong here.
The idea is to track whether we are inside a tag using a flag, 'tag', whose value is updated as each character is examined.
The problem is that the value of tag never changes:
def remove_tag(s):
    tag = True
    for c in s:
        print "c = %s" % c
        if (c == '<'):
            print 'start_tag'
            tag == True
            print tag
        elif (c == '>'):
            print 'end tag'
            tag == False
            print tag
Running:
print remove_tag("<h1>Title</h1>")
Produces:
c = <
start_tag
True
c = h
c = 1
c = >
end tag
True
c = T
c = i
c = t
c = l
c = e
c = <
start_tag
True
c = /
c = h
c = 1
c = >
end tag
True
None
I am baffled as to why 'end tag' is printed but the value 'False' does not get assigned to tag.
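For comparison, here is a minimal sketch of what the loop might look like with assignment (`=`) in place of comparison (`==`). In Python, `tag == False` is an expression that evaluates to a boolean and is then discarded, so it never changes `tag`. The `out` list, the starting value of `False`, and the returned string are assumptions added for illustration and are not part of the code above:

```python
def remove_tag(s):
    """Naive sketch: drop everything between '<' and '>'."""
    tag = False  # assumption: we start outside of any tag
    out = []     # hypothetical accumulator for characters outside tags
    for c in s:
        if c == '<':
            tag = True   # assignment (=), not comparison (==)
        elif c == '>':
            tag = False
        elif not tag:
            out.append(c)  # keep only characters seen outside a tag
    return ''.join(out)


print(remove_tag("<h1>Title</h1>"))  # Title
```

This only handles simple, well-formed input; a '>' or '<' inside an attribute value or a comment would confuse it.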