I need a regex to split a string on commas, but the split must only match commas that are not inside quotes.
Here is what my two attempts give, compared with what I expect:
should be 3: 3 (right)
should be 3: 14 (wrong, it counts the commas inside quotes)
should be 24: 12 (wrong)
should be 24: 24 (right)
This is the test case:
String line ="com.day.image;uses:=\"javax.imageio.stream,javax.imageio.spi,javax.imageio.plugins.jpeg,org.slf4j,javax.imageio.metadata,javax.imageio,com.day.imageio.plugins,com.day.image.font\",com.day.imageio.plugins;uses:=\"javax.imageio,javax.imageio.metadata,javax.imageio.stream,javax.imageio.spi,org.w3c.dom\",com.day.image.font;uses:=\"com.day.image\"";
String[] results1 = line.split("\",");
String[] results2 = line.split(",");
System.out.println("should be 3: "+ results1.length);
System.out.println("should be 3: "+ results2.length);
line = "com.day.cq.commons,com.day.cq.commons.inherit,com.day.cq.wcm.api,com.day.cq.wcm.api.components,com.day.cq.wcm.api.designer,com.day.cq.wcm.commons,com.day.cq.wcm.tags,com.day.cq.widget,javax.servlet,javax.servlet.http,javax.servlet.jsp;version=\"2.1\",javax.servlet.jsp.el;version=\"2.1\",javax.servlet.jsp.jstl.core,javax.servlet.jsp.jstl.fmt,javax.servlet.jsp.tagext;version=\"2.1\",org.apache.commons.lang;version=\"2.4\",org.apache.sling.api;version=\"2.1\",org.apache.sling.api.request;version=\"2.1\",org.apache.sling.api.resource;version=\"2.1\",org.apache.sling.api.scripting;version=\"2.1\",org.apache.sling.api.servlets;version=\"2.1\",org.apache.sling.scripting.jsp.taglib;version=\"2.0\",org.apache.sling.scripting.jsp.util;version=\"2.0\",org.slf4j;version=\"1.5\"";
results1 = line.split("\",");
results2 = line.split(",");
System.out.println("should be 24: "+ results1.length);
System.out.println("should be 24: "+ results2.length);
The output is:
should be 3: 3
should be 3: 14
should be 24: 12
should be 24: 24
UPDATED
I understood very well what I needed, but I didn't know how to do it, and my explanation of what I was trying to accomplish wasn't the best. A badly defined problem rarely leads to a solution. One of my strengths is simplifying complex scenarios; obviously tonight it wasn't working for me.
After more searching I refined my question again. Google search term: "How do I match a character outside of quotes?"
It is well known that Google's first results are usually what you are looking for, if you ASK Google the RIGHT question too ;).
First result: Regex to pick commas outside of quotes
The regular expression would be this: (,)(?=(?:[^"']|["|'][^"']*")*$)
Tested, and it worked.
Finally, I gather there is a difference between programming skills and understanding skills, and many programmers out there don't have both. I asked in several places, and most people said it was not possible. Apparently it is.
Thanks for your time, and sorry if I rushed to get help.
This site is GREAT! :)
UPDATE2
This regex, (,)(?=(?:[^"']|["|'][^"']*")*$), is giving me a StackOverflowError:
at java.util.regex.Pattern$GroupHead.match(Unknown Source)
at java.util.regex.Pattern$Loop.match(Unknown Source)
at java.util.regex.Pattern$GroupTail.match(Unknown Source)
at java.util.regex.Pattern$BranchConn.match(Unknown Source)
at java.util.regex.Pattern$CharProperty.match(Unknown Source)
at java.util.regex.Pattern$Branch.match(Unknown Source)
at java.util.regex.Pattern$GroupHead.match(Unknown Source)
at java.util.regex.Pattern$Loop.match(Unknown Source)
at java.util.regex.Pattern$GroupTail.match(Unknown Source)
at java.util.regex.Pattern$BranchConn.match(Unknown Source)
at java.util.regex.Pattern$CharProperty.match(Unknown Source)
at java.util.regex.Pattern$Branch.match(Unknown Source)
at java.util.regex.Pattern$GroupHead.match(Unknown Source)
at java.util.regex.Pattern$Loop.match(Unknown Source)
at java.util.regex.Pattern$GroupTail.match(Unknown Source)
at java.util.regex.Pattern$BranchConn.match(Unknown Source)
at java.util.regex.Pattern$CharProperty.match(Unknown Source)
at java.util.regex.Pattern$Branch.match(Unknown Source)
at java.util.regex.Pattern$GroupHead.match(Unknown Source)
at java.util.regex.Pattern$Loop.match(Unknown Source)
at java.util.regex.Pattern$GroupTail.match(Unknown Source)
at java.util.regex.Pattern$BranchConn.match(Unknown Source)
at java.util.regex.Pattern$CharProperty.match(Unknown Source)
at java.util.regex.Pattern$Branch.match(Unknown Source)
at java.util.regex.Pattern$GroupHead.match(Unknown Source)
at java.util.regex.Pattern$Loop.match(Unknown Source)
Apparently it works fine for some inputs but not others! Or is the Java Regex engine buggy?
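The trace above is consistent with how java.util.regex appears to evaluate a repeated group: each repetition of the (?:...)* group is matched by a nested call, so the stack depth grows with the amount of text the lookahead has to scan. A long enough line could then exhaust the stack even though the pattern itself is not wrong. One way to sidestep this is to skip regex for the split and scan the characters directly. The following is only a sketch under assumptions: the names QuoteAwareSplitter and splitOutsideQuotes are made up for illustration, only double quotes count as quotes, and escaped quotes are not handled.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteAwareSplitter {

    // Splits on commas that are outside double quotes (hypothetical helper).
    // Assumes quotes are balanced and never escaped.
    static List<String> splitOutsideQuotes(String line) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                // Toggle the quoted state and keep the quote in the field.
                inQuotes = !inQuotes;
                current.append(c);
            } else if (c == ',' && !inQuotes) {
                // A comma outside quotes ends the current field.
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // The last field has no trailing comma, so add it here.
        parts.add(current.toString());
        return parts;
    }

    public static void main(String[] args) {
        String sample = "a;uses:=\"x,y,z\",b;version=\"2.1\",c";
        List<String> parts = splitOutsideQuotes(sample);
        System.out.println("should be 3: " + parts.size());
    }
}
```

This makes a single pass over the line and uses no recursion, so the input length should not matter for stack depth. Unlike String.split, it keeps trailing empty fields.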
UPDATE3
This regex does not overflow and works (Java-escaped): "(,)(?=(?:[^\"]|\"[^\"]*\")*$)"
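For reference, here is a minimal sketch of how that pattern can be passed to a split, assuming it is meant to carry the `*` quantifiers (a quoted run `\"[^\"]*\"`, and the whole group repeated up to the end of the line). The class name RegexCommaSplit and the short sample string are made up for illustration; the sample only mimics the shape of the real lines.

```java
import java.util.regex.Pattern;

class RegexCommaSplit {

    // A comma matches only if the rest of the line is made of non-quote
    // characters and complete "..." pairs, i.e. the comma is outside quotes.
    static final Pattern COMMA_OUTSIDE_QUOTES =
            Pattern.compile("(,)(?=(?:[^\"]|\"[^\"]*\")*$)");

    static String[] split(String line) {
        return COMMA_OUTSIDE_QUOTES.split(line);
    }

    public static void main(String[] args) {
        // Shortened stand-in for the real test lines (an assumption).
        String sample = "a;uses:=\"x,y,z\",b;version=\"2.1\",c";
        String[] parts = split(sample);
        System.out.println("should be 3: " + parts.length);
        for (String part : parts) {
            System.out.println(part);
        }
    }
}
```

Compiling the pattern once and reusing it avoids recompiling on every call, which is what line.split(String) would do. The lookahead still rescans the rest of the line for every comma, so very long lines may remain slow.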