I have strings that are of the form below:
<p>The is a string.</p>
<em>This is another string.</em>
They are read in from a text file one line at a time. I want to separate these into words. For that I am just splitting the string using `split()`.
Now I have a set of words but the first word will be `<p>The` rather than `The`. Same for the other words that have `<>` next to them. I want to remove the `<..>` from the words.
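To make the problem concrete, here is a minimal sketch, assuming Python and a hypothetical variable `line` that holds one line from the file:

```python
# Hypothetical input: one line as read from the text file.
line = "<p>The is a string.</p>"

# split() with no arguments splits on whitespace only,
# so the tags stay attached to the neighbouring words.
words = line.split()
print(words)  # ['<p>The', 'is', 'a', 'string.</p>']
```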
I'd like to do this in one line. What I mean is that I want to pass, as a parameter, something of the form `<*>`, like I would on the command line. I was thinking of using the `replace()` function for this, but I am not sure what the `replace()` parameter would look like.
For example, how could I change `<..>` below so that it matches anything between `<` and `>`:
x = x.replace("<..>", "")
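For what it's worth, here is a minimal sketch of one way this could be done, assuming the language is Python. `str.replace()` only matches literal text, so it has no wildcard form; a regular expression with `re.sub()` can express "anything between `<` and `>`" instead. The names `TAG_PATTERN` and `strip_tags` are hypothetical, and the pattern assumes simple tags with no `>` inside attribute values, so it is not a general HTML parser.

```python
import re

# Assumption: tags look like <p> or </em>, with no ">" inside them.
# "<" followed by any run of non-">" characters, then ">".
TAG_PATTERN = re.compile(r"<[^>]*>")  # hypothetical name


def strip_tags(line):
    """Remove every <...> span from a line, keeping the text between tags."""
    return TAG_PATTERN.sub("", line)


x = "<p>The is a string.</p>"

# The one-line equivalent would be: x = re.sub(r"<[^>]*>", "", x)
x = strip_tags(x)

words = x.split()
print(words)  # ['The', 'is', 'a', 'string.']
```

Because the tags are removed before splitting, words that contain an apostrophe, such as `isn't`, are left in one piece.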
`This isn't a string.` Wouldn't your example split `isn't` into `isn'` and `t`? And maybe you'll come up with a correction. Then what? I give you another counterexample and you'll correct that too? I wanted a solution to something specific. Your answer assumes more than what was asked. – Mars Jul 19 '14 at 22:16