How do I extract everything that is not an HTML tag from a partial HTML text?
That is, if I have something of the type:
<div>Hello</div><h3><div>world</div></h3>
I want to extract ['Hello','world']
I thought about this regex:
>[a-zA-Z0-9]+<
but it will not match special characters or Chinese or Hebrew characters, which I need.
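
For illustration, here is a rough sketch in Python (the question does not name a language, so that choice is an assumption) of two ways this might be approached. The first keeps the regex idea but swaps the `[a-zA-Z0-9]` class for "anything that is not `<` or `>`", so any script or punctuation is accepted. The second uses the standard library's `html.parser.HTMLParser`, which tolerates incomplete fragments. The names `extract_with_regex`, `extract_with_parser` and `TextCollector` are made up for this example.

```python
import re
from html.parser import HTMLParser


def extract_with_regex(fragment):
    """Return the text runs found between a '>' and the next '<'."""
    # [^<>]+ matches any characters except angle brackets, so Hebrew,
    # Chinese and punctuation all pass. Text that is not enclosed by
    # tags on both sides (e.g. at the very start) is NOT captured.
    runs = re.findall(r">([^<>]+)<", fragment)
    # Drop runs that are only whitespace (such as newlines between tags).
    return [run.strip() for run in runs if run.strip()]


class TextCollector(HTMLParser):
    """Hypothetical helper: collects every non-empty text node it sees."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_with_parser(fragment):
    """Return the text nodes of a (possibly partial) HTML fragment."""
    collector = TextCollector()
    collector.feed(fragment)
    collector.close()  # flush any text still buffered at the end
    return collector.chunks


sample = "<div>Hello</div><h3><div>world</div></h3>"
print(extract_with_regex(sample))
print(extract_with_parser(sample))
```

The regex version is closest to the original attempt, but it only sees text that sits between two tags. The parser version also picks up text at the edges of a fragment, so it may be the safer choice when the input can be cut off anywhere.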