
I am trying to split a text into an array (or a List). For example, the String

String str = "so this will be _ITALIC_ and this will be *BOLD* and so on";

should be split into this:

String[] arr = new String[] {"so this will be ", "_ITALIC_", " and this will be ", "*BOLD*", " and so on"};

I worked out a few things with regex.

The patterns for finding my matches:

public static final Pattern Italic = Pattern.compile("_(.*?)_");
public static final Pattern Bold = Pattern.compile("\\*(.*?)\\*");
public static final Pattern Strike = Pattern.compile("~(.*?)~");

I also found out how to split the text with my patterns:

// more to be added
Pattern ptn = Pattern.compile(Italic + "|" + Bold + "|" + Strike);
String[] parts = ptn.split(str);

which results in:

"so this will be "
" and this will be "
" and so on"

But I cannot find a way to do this without losing the information about which pattern matched.
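
One possible direction, sketched here under assumptions, is to stop using split and walk through the matches with Matcher.find() instead, copying both the text between matches and the matches themselves. The class name MarkupSplitter and the method tokenize are hypothetical names chosen for illustration; the pattern is the same alternation as above, written out as a single string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original question.
class MarkupSplitter {

    // Same alternation as in the question: _italic_, *bold*, ~strike~
    private static final Pattern MARKUP =
            Pattern.compile("_(.*?)_|\\*(.*?)\\*|~(.*?)~");

    // Returns plain and marked-up chunks in their original order.
    static List<String> tokenize(String input) {
        List<String> chunks = new ArrayList<>();
        Matcher m = MARKUP.matcher(input);
        int last = 0; // end index of the previous match
        while (m.find()) {
            if (m.start() > last) {
                // plain text between the previous match and this one
                chunks.add(input.substring(last, m.start()));
            }
            // the full match, delimiters included
            chunks.add(m.group());
            last = m.end();
        }
        if (last < input.length()) {
            // trailing plain text after the last match
            chunks.add(input.substring(last));
        }
        return chunks;
    }

    public static void main(String[] args) {
        String str = "so this will be _ITALIC_ and this will be *BOLD* and so on";
        for (String chunk : tokenize(str)) {
            System.out.println("[" + chunk + "]");
        }
    }
}
```

Because the matched text is added to the list rather than discarded, the delimiters survive, which split alone does not offer here.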

Why am I doing this? I need to transform plain text into formatted text using javafx.scene.text.TextFlow, so I need to find the chunks, create a javafx.scene.text.Text for each one, and apply the formatting.
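
For the formatting step, each chunk also needs to carry which style it belongs to. A minimal sketch, assuming named capture groups are acceptable: the types ChunkClassifier, Chunk and Style are hypothetical and introduced only for illustration, and the JavaFX part is deliberately left out so the example runs without JavaFX on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper types, not part of the original question.
class ChunkClassifier {

    enum Style { PLAIN, ITALIC, BOLD, STRIKE }

    static final class Chunk {
        final Style style;
        final String text; // content without the delimiters

        Chunk(Style style, String text) {
            this.style = style;
            this.text = text;
        }

        @Override
        public String toString() {
            return style + ":" + text;
        }
    }

    // Named groups record which alternative matched.
    private static final Pattern MARKUP = Pattern.compile(
            "_(?<italic>.*?)_|\\*(?<bold>.*?)\\*|~(?<strike>.*?)~");

    static List<Chunk> parse(String input) {
        List<Chunk> chunks = new ArrayList<>();
        Matcher m = MARKUP.matcher(input);
        int last = 0; // end index of the previous match
        while (m.find()) {
            if (m.start() > last) {
                chunks.add(new Chunk(Style.PLAIN, input.substring(last, m.start())));
            }
            // A group that did not take part in the match is null.
            if (m.group("italic") != null) {
                chunks.add(new Chunk(Style.ITALIC, m.group("italic")));
            } else if (m.group("bold") != null) {
                chunks.add(new Chunk(Style.BOLD, m.group("bold")));
            } else {
                chunks.add(new Chunk(Style.STRIKE, m.group("strike")));
            }
            last = m.end();
        }
        if (last < input.length()) {
            chunks.add(new Chunk(Style.PLAIN, input.substring(last)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Each Chunk could then be turned into a javafx.scene.text.Text
        // and styled according to chunk.style (left out here).
        System.out.println(parse("so this will be _ITALIC_ and this will be *BOLD* and so on"));
    }
}
```

Stripping the delimiters at this stage is a design choice: the style is kept in the enum, so the Text nodes would presumably not need the `_`, `*` or `~` characters any more.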

lumo
  • It feels like you should really be using a *Markdown Tokenizer*. Or you know, a *Markdown Renderer* directly, if that's where you're going. – ccjmne Jun 06 '18 at 12:53
  • You cannot do this with `split`. Java only supports fixed length lookahead/lookbehind and in your case you need to check the whole string to determine if `*` is an "opening" `*` and you need to split before it, or a "closing" `*` and you need to split after it... – fabian Jun 06 '18 at 13:01
  • If you really want to do it yourself, and your rules are as simple as you described, then `(?=\b_|\*\b)|(?<=_\b|\b\*)` would work for you. I can't explain why, though, 'cause your question just got closed. – ccjmne Jun 06 '18 at 13:01
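
The lookaround expression from the last comment can be tried out as follows. This is only a sketch on the example string from the question: the class name LookaroundSplitDemo is hypothetical, and the expression relies on `_` counting as a word character for `\b` and has no alternative for `~`, so it would need extending for the Strike pattern.

```java
import java.util.Arrays;

// Hypothetical demo class for the regex quoted in the comment above.
class LookaroundSplitDemo {

    // Splits before an opening _ or * and after a closing _ or *,
    // using word boundaries to tell opening from closing delimiters.
    static String[] split(String input) {
        return input.split("(?=\\b_|\\*\\b)|(?<=_\\b|\\b\\*)");
    }

    public static void main(String[] args) {
        String str = "so this will be _ITALIC_ and this will be *BOLD* and so on";
        System.out.println(Arrays.toString(split(str)));
    }
}
```

The split points are zero-width, so nothing is consumed and the delimiters stay attached to their chunks. It depends on every delimiter touching a word character on the inside and a non-word character on the outside, which holds for the example but not for arbitrary input.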
