Let's say I have a string containing XML with many occurrences of <tagA>:
String example = " (...) some xml here (...)
<tagA>283940</tagA>
(...) some xml here (...)
<tagA>& 9940</tagA>
<tagA>- 99440</tagA>
<tagA>< 99440</tagA>
<tagA>99440</tagA>
(...) more xml here (...) "
The content of each <tagA> should contain only digits, but sometimes it has a random character followed by a whitespace and then the digits. I want to remove the unwanted character and the whitespace. How can I do that?
So far I know I should be looking for a regex like "<tagA>. [0-9]*<\/tagA>", but I am stuck here.
I want to replace these characters because they include "&", ">", and "<" signs, which make the XML invalid (and that prevents me from parsing the string as XML).
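To make the goal concrete, here is a minimal sketch of the kind of replacement described above. It assumes the unwanted prefix is always exactly one non-digit character followed by whitespace, and that <tagA> never carries attributes. The class name `TagACleaner` and the method `clean` are made up for illustration; only `java.util.regex.Pattern` comes from the JDK.

```java
import java.util.regex.Pattern;

final class TagACleaner {
    // Assumption: the bad prefix is one non-digit character (\D) followed by
    // one or more whitespace characters (\s+). The digits are captured in group 1.
    private static final Pattern BAD_PREFIX =
            Pattern.compile("<tagA>\\D\\s+(\\d*)</tagA>");

    // Rewrites every matching <tagA> element so that only the digits remain.
    // Elements that already contain only digits do not match and stay untouched.
    static String clean(String xml) {
        return BAD_PREFIX.matcher(xml).replaceAll("<tagA>$1</tagA>");
    }

    public static void main(String[] args) {
        String example = "<tagA>283940</tagA>\n"
                + "<tagA>& 9940</tagA>\n"
                + "<tagA>- 99440</tagA>\n"
                + "<tagA>< 99440</tagA>\n"
                + "<tagA>99440</tagA>";
        System.out.println(clean(example));
    }
}
```

The capture group is what allows the digits to be kept while the stray character and the whitespace are dropped. If the prefix can be longer than one character, the `\\D` part would need to be widened (for example to `\\D+`), which is outside what this sketch assumes.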