I've been trying to get this working with regular expressions, but I keep failing, so maybe someone more experienced with this can help.
How can I render a string close to the way a web browser renders an HTML string? Example HTML:
<html>
Hel
lo
how
are you
</html>
A browser renders this as:
Hel lo how are you
I want it to be
Hello how are you
So the difference from HTML is that a newline without an explicit space next to it is simply removed instead of being turned into a space. In Java, the example string would look like this:
\tHel\nlo \n how\n are you
My current solution:
// remove linebreaks and tabs and any leading or trailing whitespace
// this is necessary to avoid converting \t or \n to a space
script = script.replaceAll("\\s+\n\\s+", "");
script = script.replaceAll("\\s+\t\\s+", "");
// replace any remaining run of whitespace with a single space
script = script.replaceAll("\\s+", " ");
// remove leading and trailing whitespace
script = script.trim();
This has only one problem: if a line has a trailing space followed by a newline and some more text, the trailing space is removed:
Hello \nhow are you?
will be reduced to
Hellohow are you?
So, using underscore (_) as space marker the following should be true:
_ = _
__ = _
\t\n_ = _
_\t\n = _
\t_\n = _
_\t_\n_ = _
\n = // nothing
\t = // nothing
\t\n = // nothing
What combination of replaceAll(regex, string) would I need to use?
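For reference, here is a minimal sketch of one combination that seems to satisfy the table above. It assumes that "space" means only the literal space character and that tabs, line feeds and carriage returns are the only other whitespace that needs handling. The class name WhitespaceCollapser and the method name collapse are made up for illustration. The first replaceAll turns every whitespace run that contains at least one space into a single space; the second deletes whatever tabs and line breaks are left, which by then have no space next to them.

```java
class WhitespaceCollapser {

    // Hypothetical helper; the name and signature are illustrative only.
    static String collapse(String input) {
        // Step 1: a whitespace run containing at least one literal space
        // (optionally surrounded by tabs/line breaks) becomes one space.
        String result = input.replaceAll("[\\t\\n\\r]* [ \\t\\n\\r]*", " ");
        // Step 2: any tabs/line breaks still present were not part of a run
        // with a space, so they are removed without a replacement.
        return result.replaceAll("[\\t\\n\\r]+", "");
    }

    public static void main(String[] args) {
        System.out.println(collapse("\tHel\nlo \n how\n are you")); // prints "Hello how are you"
        System.out.println(collapse("Hello \nhow are you?"));       // prints "Hello how are you?"
    }
}
```

Calling trim() on the result, as in the snippet above, would additionally drop a single leading or trailing space if that is wanted.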