
I have a bunch of HTML I am trying to clean up. I want to delete the incomplete tag left dangling at the end of the string. Basically I am starting with:

</div></div><div class="_3o-d" id="education

and want to end with:

</div></div>

I tried:

workSection = re.split('<.*?$',workSection)[0]

but this matches the first '<' and leaves me with an empty string. Is there a way to just match the last instance? Or to somehow start from the end?

I am also aware that splitting and then taking the first option may not be the best way of doing this, and am prepared to take a beating for it now.

Chase Roberts

1 Answer


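Just use [^<] instead of the . so the match cannot run past another '<'. The regex engine always takes the leftmost match, so '<.*?$' starts at the first '<' and the lazy .*? simply expands until it reaches the end of the string. With '<[^<]*$' the only position that can reach the end is the last '<'.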

>>> re.split('<[^<]*$', '</div></div><div class="_3o-d" id="education')
['</div></div>', '']
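
If the goal is only to drop the dangling piece, a substitution may read more directly than split-and-index. The sketch below is one possible variant, not the only way to do it: it also excludes '>' from the character class so that a string whose last tag is already complete is left untouched (with '<[^<]*$' a trailing '</div>' would be removed as well). The helper name strip_partial_tag is made up for illustration, and both versions assume no literal '<' or '>' appears inside attribute values.

```python
import re

html = '</div></div><div class="_3o-d" id="education'

# Option 1: delete a trailing unclosed tag with re.sub.
# '<[^<>]*$' needs a '<' followed only by non-bracket characters up to the
# end of the string, so a tag that is already closed with '>' cannot match.
trimmed = re.sub(r'<[^<>]*$', '', html)
print(trimmed)


# Option 2: no regex at all (hypothetical helper, for illustration only).
def strip_partial_tag(text):
    """Cut the text at the last '<' if no '>' comes after it."""
    last_open = text.rfind('<')
    last_close = text.rfind('>')
    if last_open > last_close:
        return text[:last_open]
    return text


print(strip_partial_tag(html))
```

Both calls print </div></div> for the input from the question, and both return a string that ends in a complete tag unchanged.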
John La Rooy