2

I have a long text in Java that contains at least one Markdown image. If there are N Markdown images, I need to split the string into N+1 substrings and store them in a String array called texts. For example, given the following text:

Hello world!
![Alt text](/1/2/3.jpg)
Hello Stack Overflow!

Then Hello world!\n will be stored at position 0 and \nHello Stack Overflow! at position 1. For my question, we can assume that:

  • The alt text part contains only the characters A-Z, a-z and the space character.
  • The URL part contains only the digits 0-9 and the slash /. Its extension is always .jpg; no other extension occurs.

My question is: how do I split the text? Do we need a Java regular expression, such as *![*](*.jpg)?

Mincong Huang
  • A regex, sure - why not. Is your regex notation different from the standard one? – Jongware Apr 03 '16 at 22:27
  • No, my regex notation is supposed to be the same as the standard one. If there's an error, it's my fault. (I don't know much about regular expressions.) – Mincong Huang Apr 03 '16 at 22:29

3 Answers

11

Try this (ready to copy-paste):

"!\\[[^\\]]+\\]\\([^)]+\\)"

See here for info about how to get the matches.

Unescaped version (without the extra backslashes a Java string literal needs): !\[[^\]]+\]\([^)]+\)

Explanation

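  • ! matches a literal !
  • \[ matches a literal [ (escaped)
  • [^\]]+ matches one or more characters that are not ], as many as possible
  • \]\( matches a literal ]( (both escaped)
  • [^)]+ matches one or more characters that are not ), as many as possible
  • \) matches a literal ) (escaped)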
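
As a minimal sketch of how this pattern could produce the N+1 substrings the question asks for, the following uses Pattern.split with a limit of -1 so that trailing empty strings are kept. The class name SplitMarkdownImages and the method splitOnImages are hypothetical names chosen for illustration; they are not part of the original answer.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper class, named for illustration only.
class SplitMarkdownImages {

    // The pattern from this answer, written as a Java string literal.
    private static final Pattern IMAGE =
            Pattern.compile("!\\[[^\\]]+\\]\\([^)]+\\)");

    // A negative limit keeps trailing empty strings, so a text with
    // N images is always split into N+1 pieces.
    static String[] splitOnImages(String text) {
        return IMAGE.split(text, -1);
    }

    public static void main(String[] args) {
        String text = "Hello world!\n![Alt text](/1/2/3.jpg)\nHello Stack Overflow!";
        String[] texts = splitOnImages(text);
        // texts[0] is "Hello world!\n", texts[1] is "\nHello Stack Overflow!"
        System.out.println(Arrays.toString(texts));
    }
}
```

Without the -1 limit, split would drop trailing empty strings, so a text ending in an image would yield fewer than N+1 pieces.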
Community
Laurel
0

This is my approach, which extracts the image URL from each match:

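import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {

    public static void main(String[] args) {
        List<String> allMatches = new ArrayList<>();
        String str = "}```![imageName](/sword?SwordControllerName=KMFileDownloadController&id=c60b6c5a8d9b46baa1dc266910db462d \"imageName\")#### JSON data";
        // Group 1 captures everything between the parentheses (URL plus optional title).
        // The reluctant quantifiers .*? stop at the first ] and ), so several images
        // in one text are matched separately instead of as one long match.
        Matcher m = Pattern.compile("\\[.*?\\]\\((.*?)\\)").matcher(str);
        while (m.find()) {
            // Keep only the URL: drop the optional title after the first space.
            allMatches.add(m.group(1).split(" ")[0]);
        }
        // Prints "/sword?SwordControllerName=KMFileDownloadController&id=c60b6c5a8d9b46baa1dc266910db462d"
        for (String s : allMatches) {
            System.out.println(s);
        }
    }
}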

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions

Ying Yi
0
!\[[^\]]*?\]\([^)]+\)

This way the alt text can be empty, although an empty alt text makes little sense.
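
To illustrate the difference, here is a small sketch comparing the + and *? variants on an image with an empty alt text. The class name EmptyAltText, its constants, and the method containsImage are hypothetical names used only for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, named for illustration only.
class EmptyAltText {

    // Variant with +: the alt text must contain at least one character.
    static final Pattern REQUIRED_ALT =
            Pattern.compile("!\\[[^\\]]+\\]\\([^)]+\\)");

    // Variant with *?: the alt text may be empty.
    static final Pattern OPTIONAL_ALT =
            Pattern.compile("!\\[[^\\]]*?\\]\\([^)]+\\)");

    // Returns true if the pattern matches anywhere in the text.
    static boolean containsImage(Pattern pattern, String text) {
        return pattern.matcher(text).find();
    }

    public static void main(String[] args) {
        String emptyAlt = "![](/1/2/3.jpg)";
        System.out.println(containsImage(REQUIRED_ALT, emptyAlt)); // false
        System.out.println(containsImage(OPTIONAL_ALT, emptyAlt)); // true
    }
}
```

Because the character class already excludes ], the reluctant *? and a plain * find the same matches here.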

Juergen Schulze