I'm doing some text editing in Java 8, and the files I want to edit automatically often also contain leftover formatting information, which always looks like this:
- \ (a backslash)
- A set text - I know which texts are used and I'll provide that info to regex
- A number (2 to 4 digits)
- Maybe a single blank (which should be replaced too) or nothing
I want to replace all of them with nothing (so: "") and even though I could probably read the text char by char to look for the text, I want to try it with the much "cleaner" looking regex first. But I've never really worked with regex, apart from copying the occasional code from Stack Exchange.
Examples:
- \fs14 (font size 14)
- \ri240 (right indent)
- \lang1033 (applies a language to a character)
There are also e.g. \par (new paragraph) or \i (italic start) and \i0 (italic end), but I can easily replace these with e.g. originalString.replace("\\par",""). This obviously won't work if I don't know how many and which digits are used, like in the above examples.
I know that the Java code for replacing text using a pattern is:
String newString = originalString.replaceAll(pattern,"");
The needed pattern to address the backslash and the text for the examples above probably looks like this:
(\\\\fs|\\\\ri|\\\\lang)
... but how do I incorporate the number and the blank (if there's one)?
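To make the goal concrete, here is a minimal sketch of what the complete pattern might look like, assuming the number can be written as \d{2,4} (two to four digits) and the optional blank as a space followed by ? (zero or one occurrence). The class name RtfControlWordStripper and the method strip are hypothetical names used only for illustration, and the three control words are the ones from the examples above.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, named for illustration only.
public class RtfControlWordStripper {

    // In a Java string literal, "\\\\" is the regex \\ , which matches one literal backslash.
    // (?:fs|ri|lang) is a non-capturing group listing the known control words.
    // \\d{2,4} matches two to four digits; " ?" matches an optional single blank.
    static final Pattern CONTROL_WORD =
            Pattern.compile("\\\\(?:fs|ri|lang)\\d{2,4} ?");

    // Removes every match of the pattern from the input.
    static String strip(String input) {
        return CONTROL_WORD.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // The actual text is: \fs14 Hello \ri240\lang1033 world
        String original = "\\fs14 Hello \\ri240\\lang1033 world";
        System.out.println(strip(original)); // prints "Hello world"
    }
}
```

The same pattern string could also be passed directly to originalString.replaceAll(pattern, ""); precompiling it with Pattern.compile only avoids recompiling the regex on every call. Note that the backslash is moved outside the group here, so it is written once instead of once per control word.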