I want to extract all text between the HTML body tags with the following Java code:
Pattern.compile(".*<\\s*body\\s*>(.*?)<\\s*/\\s*body\\s*>.*", Pattern.DOTALL);
..
matcher.find() ? matcher.group(1) : originalText
That works fine for HTML, but for larger texts that don't contain any HTML (and therefore no body element), e.g. large stack traces, the call to matcher.find() takes a very long time.
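For reference, here is a minimal, self-contained sketch of how the expression is used. The class name BodyExtractor and the method extractBody are made up for this example, and the setup code elided above may differ in the real application; only the pattern and the find()/group(1) fallback are taken from the snippet.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class, introduced only for this example.
final class BodyExtractor {

    // Same expression as above, compiled once and reused.
    private static final Pattern BODY = Pattern.compile(
            ".*<\\s*body\\s*>(.*?)<\\s*/\\s*body\\s*>.*", Pattern.DOTALL);

    // Returns the text between the body tags, or the input unchanged
    // when no body element is found.
    static String extractBody(String originalText) {
        Matcher matcher = BODY.matcher(originalText);
        return matcher.find() ? matcher.group(1) : originalText;
    }

    public static void main(String[] args) {
        // HTML input: the content of the body element is returned.
        System.out.println(extractBody("<html><body>Hello</body></html>"));

        // Non-HTML input (a short stand-in for a stack trace): returned as-is.
        System.out.println(extractBody(
                "java.lang.IllegalStateException: boom\n\tat Foo.bar(Foo.java:1)"));
    }
}
```

The inputs here are deliberately tiny; the slowdown only shows up once the non-HTML text gets large.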
Does anyone know what causes this, and how to make this regular expression more performant?
Thanks in advance!