I have a string in which a word looks like this: "#<i><b>when</b></i>". I want only the word, without any tags. When I stripped "#", the word became "<i><b>when</b></i>". But when I stripped "<i>", the word became something like "b>when". How can I get "<b>when</b>"?
0

vpit3833

Lalit Chattar
- See: http://stackoverflow.com/questions/37486/filter-out-html-tags-and-resolve-entities-in-python – GreenMatt Nov 20 '10 at 05:52
- And http://stackoverflow.com/questions/2295942/pythons-equivalent-to-phps-strip-tags – mpen Nov 20 '10 at 06:02
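In the spirit of the linked questions, here is a minimal sketch that uses the standard library's `html.parser` to drop the tags while remembering which ones were dropped. The names `TagStripper` and `strip_tags` are made up for this illustration, and the recorded tags do not preserve attributes, so treat it as a starting point rather than a complete solution.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collects text content and remembers every tag it skipped."""

    def __init__(self):
        super().__init__()
        self.text_parts = []    # text found between tags
        self.removed_tags = []  # tags in the order they were seen

    def handle_starttag(self, tag, attrs):
        # Attributes are ignored here; only the tag name is recorded.
        self.removed_tags.append("<%s>" % tag)

    def handle_endtag(self, tag):
        self.removed_tags.append("</%s>" % tag)

    def handle_data(self, data):
        self.text_parts.append(data)


def strip_tags(markup):
    """Return (plain_text, removed_tags) for a small HTML fragment."""
    parser = TagStripper()
    parser.feed(markup)
    parser.close()
    return "".join(parser.text_parts), parser.removed_tags


text, tags = strip_tags("#<i><b>when</b></i>")
print(text)              # #when
print(text.lstrip("#"))  # when
print(tags)              # ['<i>', '<b>', '</b>', '</i>']
```

The leading "#" is plain text, not a tag, so it is removed separately with `lstrip`.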
2 Answers
1
Slice it.
>>> '#<i><b>when</b></i>'[4:-4]
'<b>when</b>'

Ignacio Vazquez-Abrams
- Thanks, but I want to keep a record of the tags that I removed. Is there any method or function that removes only the tags? – Lalit Chattar Nov 20 '10 at 05:37
- @Gautam: Slicing a string doesn't modify the string. If you assign the result of the slice to a variable, you'll keep the original tag and you'll have the word too. – Zeke Nov 20 '10 at 05:47
- if s = "#when", then s2 = s[:5] + s[8:19] + s[23:] should give what (I think) you want for this example. – GreenMatt Nov 20 '10 at 05:50
- I see 2 problems with this solution... (1) he said *without any tag*... I assume that includes `<b>`. And (2) it only works on this particular example... although, of course, who knows what he's trying to do. – mpen Nov 20 '10 at 06:04
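For anyone puzzled by the result in the question: `str.strip` treats its argument as a set of characters, not as a substring, which is why stripping "<i>" also eats the "<" of the following tag. The sketch below shows that behaviour, plus a small helper (`remove_wrapper`, a hypothetical name used only here) that removes one exact tag pair and leaves the original string untouched, as the slicing comments above suggest.

```python
s = "#<i><b>when</b></i>"

# strip("<i>") removes any leading/trailing '<', 'i' or '>' characters,
# one at a time, until it meets a character outside that set.
no_hash = s.strip("#")
print(no_hash)               # <i><b>when</b></i>
print(no_hash.strip("<i>"))  # b>when</b></


def remove_wrapper(text, opening, closing):
    """Drop one exact opening/closing pair from the ends, if both are present."""
    if text.startswith(opening) and text.endswith(closing):
        return text[len(opening):len(text) - len(closing)]
    return text


inner = remove_wrapper(no_hash, "<i>", "</i>")
print(inner)                                 # <b>when</b>
print(remove_wrapper(inner, "<b>", "</b>"))  # when
print(s)                                     # #<i><b>when</b></i>
```

Because strings are immutable, `s` still holds the original markup, so the removed tags are never lost.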
0
Use regular expressions.
>>> import re
>>> s = '#<i><b>when</b></i>'
>>> wordPattern = re.compile(r'>(\w+)<')
>>> wordPattern.search(s).groups()
('when',)

Zeke
- @Alex: Yeah, I read http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 and I only suggest it for this specific instance, not for general HTML parsing. – Zeke Nov 20 '10 at 06:03
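If a record of the removed tags is wanted as well, the same regex idea can be stretched a little with `re.findall` and `re.sub`. This is only a sketch for simple fragments like the one in the question; `TAG_PATTERN` is an illustrative name and the pattern is deliberately simplistic, so the caveat about not using regexes for general HTML parsing still applies.

```python
import re

# Simplistic pattern: '<', an optional '/', a tag name, then anything up to '>'.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^>]*>")

s = "#<i><b>when</b></i>"

removed = TAG_PATTERN.findall(s)            # the tags, in order of appearance
word = TAG_PATTERN.sub("", s).lstrip("#")   # text with tags and '#' removed

print(removed)  # ['<i>', '<b>', '</b>', '</i>']
print(word)     # when
```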