
I have the following Python code (run in PyCharm Community Edition):

def defer_tags(sentence):
    for letter in sentence:
        print("current letter = ", letter)
        if letter == '<':
            # Drop everything up to and including the first '>'
            end_tag = sentence.find('>')
            sentence = sentence[end_tag+1:]
            print("new_sentence = ", sentence)

defer_tags("<h1>Hello")

It produced the following output:

current letter =  <
new_sentence =  Hello
current letter =  h
current letter =  1
current letter =  >
current letter =  H
current letter =  e
current letter =  l
current letter =  l
current letter =  o

Why does the loop variable (letter) still walk through the entire original string even though sentence is reassigned inside the loop?

I print the value of sentence after the change, but the new value is not reflected in the loop iterations.
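
For reference, the behaviour can be reproduced in isolation. A for loop evaluates its iterable once, when the loop starts, and keeps iterating over that original string object. Rebinding the name sentence inside the body creates a new string but does not touch the iterator the loop is already using. A minimal sketch (the function name letters_seen and the list seen are illustrative, not part of the code above):

```python
def letters_seen(sentence):
    """Return the letters the loop visits and the final value of `sentence`."""
    seen = []
    for letter in sentence:      # the iterator over the string is created once, here
        seen.append(letter)
        if letter == '<':
            # Slicing builds a NEW string and rebinds the local name;
            # the loop's iterator still refers to the original string.
            sentence = sentence[sentence.find('>') + 1:]
    return seen, sentence


seen, remaining = letters_seen("<h1>Hello")
print(''.join(seen))   # every character of the original string was visited
print(remaining)       # the rebound name holds only the text after the tag
```

So the reassignment does take effect on the name, but the loop never looks the name up again after its first line.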

martineau

2 Answers


Try using Beautiful Soup, as follows:

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('<h1>Hello</h1>', 'html.parser')
>>> soup.text
'Hello'
andromeda

A simpler way to extract the text between tags is to use the re module.

import re

def defer_tags(sentence):
    # [^<>]+ stops at the next angle bracket, so a match never spans several tags
    return re.findall(r'>([^<>]+)<', sentence)

defer_tags('<h1>Hello</h1>')
> ['Hello']
defer_tags('<h1>Hello</h1><h2>Ahoy</h2>')
> ['Hello', 'Ahoy']

This works only when each piece of text sits between a complete opening and closing tag, e.g. <h2>Hello</h2> or <h1>Ahoy</h1><h2>XX</h2>.
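
If the goal is to keep the hand-written approach from the question, one option is a while loop, which re-reads sentence on every pass and therefore does see each reassignment. This is only a sketch under simple assumptions (no '>' inside attribute values, no comments or scripts), and the name strip_tags is made up for illustration:

```python
def strip_tags(sentence):
    """Return `sentence` with every <...> span removed (naive, not a real HTML parser)."""
    parts = []
    while sentence:                      # condition is re-evaluated each pass
        start = sentence.find('<')
        if start == -1:                  # no more tags: keep the rest as text
            parts.append(sentence)
            break
        parts.append(sentence[:start])   # text before the tag
        end = sentence.find('>', start)
        if end == -1:                    # unterminated tag: drop the remainder
            break
        sentence = sentence[end + 1:]    # continue after the tag
    return ''.join(parts)


print(strip_tags("<h1>Hello"))
print(strip_tags("<h1>Hello</h1><h2>Ahoy</h2>"))
```

For real-world HTML a proper parser is the safer choice, since a scan like this cannot handle the many edge cases of the format.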

Nf4r