I have a String like this:

value 1, value 2, " value 3," value 4, value 5 " ", value 6

I want to split it on commas while ignoring any commas found inside an expression enclosed in multiple double quotes.

My desired output is:

value 1

value 2

" value 3," value 4, value 5 " "

value 6

I tried the approach from "Splitting on comma outside quotes", but it doesn't work.

Thanks in advance.

Elsayed
  • Try a CSV parser library such as one of the ones listed [here](https://stackoverflow.com/questions/22137343/java-csv-parser-comparisons). – cxw Dec 07 '16 at 17:39
  • As cxw says: do not reinvent the wheel. Believe us, building your **own** CSV parser and getting it **correct** is a **major** piece of work. – GhostCat Dec 07 '16 at 17:45
  • Just looking at that input, it is not clear what your desired output is. Please specify. – Patrick Parker Dec 07 '16 at 17:57
  • The comma after 4 isn't between double quotes. – Eric Duminil Dec 07 '16 at 20:13
  • Is `value 1, value 2, " " value 3,value 4 ", value 5 ", value 6` also a possible input string? (If not, the solution would be simple.) – Armali May 29 '18 at 06:28

2 Answers

I don't know how to solve this with a regex, so here is a character-by-character approach instead.

Are the double quotes included now? Note that I haven't tested this code yet.

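import java.util.ArrayList;
import java.util.List;

public static List<String> splitByComma(String text) {
    List<String> ret = new ArrayList<>();
    boolean inQuote = false;                 // true while inside a quoted section
    StringBuilder tmp = new StringBuilder(); // element currently being built
    for (char ch : text.toCharArray()) {
        if (ch == ',') {
            if (inQuote) {
                tmp.append(ch);              // comma inside quotes: keep it
            } else {
                ret.add(tmp.toString());     // comma outside quotes: element ends
                tmp.setLength(0);
            }
        } else if (ch == '"') {
            tmp.append(ch);                  // keep the quote character in the output
            inQuote = !inQuote;              // every quote toggles the state
        } else {
            tmp.append(ch);
        }
    }
    ret.add(tmp.toString());                 // add the last element
    return ret;
}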

Please let me know if you find any problems with my code.
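To see how a quote-toggling splitter behaves on the exact input from the question, here is a small self-contained sketch. The class name `QuoteToggleDemo` is made up for illustration, and the `trim()` calls are an added assumption (the desired output in the question shows no leading blanks); otherwise the logic is the same toggle idea.

```java
import java.util.ArrayList;
import java.util.List;

final class QuoteToggleDemo {

    // Every '"' flips the inQuote flag; a comma only splits while the flag is off.
    static List<String> splitByComma(String text) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuote = false;
        for (char ch : text.toCharArray()) {
            if (ch == ',' && !inQuote) {
                parts.add(current.toString().trim()); // assumption: strip surrounding blanks
                current.setLength(0);
            } else {
                if (ch == '"') {
                    inQuote = !inQuote;
                }
                current.append(ch);
            }
        }
        parts.add(current.toString().trim());
        return parts;
    }

    public static void main(String[] args) {
        // The input string from the question, with quotes escaped for Java.
        String input = "value 1, value 2, \" value 3,\" value 4, value 5 \" \", value 6";
        for (String part : splitByComma(input)) {
            System.out.println("[" + part + "]");
        }
    }
}
```

On that input the sketch yields five elements rather than the four the question asks for: the second `"` closes the quoted section, so the comma after `value 4` counts as being outside quotes and splits the element. Plain toggling therefore cannot tell nested quotes apart from closing ones; the input needs escaping or some other unambiguous convention.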

EmOwen

First, I would recommend escaping the inner double quotes, e.g. `value 1, value 2, " value 3,\" value 4, value 5 \" ", value 6`. With this syntax, I use the method below. It is a little more complex than the first answer because it skips blanks and line breaks between a delimiter and the next element in the list.

// Requires: import java.util.Vector;
public static String[] splitSet(String inStr, char delimiter) {
    if (inStr == null)
        return null;
    if (inStr.isEmpty())
        return new String[]{};
    /*
     * add an empty element here and remove it at the end to simplify
     * algorithm
     */
    String delimiterStr = String.valueOf(delimiter);
    String parseStr = inStr + delimiterStr + " ";
    /*
     * prepare parsing.
     */
    Vector<String> list = new Vector<>();
    String element = "";
    int lc = 0;
    char b = ' ';
    char c;
    boolean inBetweenQuotes = false;
    /*
     * parsing loop.
     */
    while (lc < parseStr.length()) {
        c = parseStr.charAt(lc);
        /*
         * add current entry and all following empty entries to list vector.
         * Ignore space and new line characters following the delimiter.
         */
        if ((c == delimiter) && !inBetweenQuotes) {
            // flag to avoid adding empty elements for delimiter being blank
            // or new line
            boolean added = false;
            while ((lc < parseStr.length())
                    && ((c == delimiter) || (c == ' ') || (c == '\n'))) {
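                if ((c == delimiter)
                        && !(added && ((c == ' ') || (c == '\n')))) {
                    // UFormatter and DataType are not JDK classes and are not
                    // shown in this answer; presumably the answerer's own helpers.
                    list.add((String) UFormatter.parseElement(element,
                            DataType.STRING, delimiterStr));
                    element = "";
                    added = true;
                }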
                lc++;
                if (lc < parseStr.length())
                    c = parseStr.charAt(lc);
                if (lc > 0)
                    b = parseStr.charAt(lc - 1);
            }
        }
        /*
         * Otherwise append the character to the current element and toggle
         * the quote state on an unescaped double quote.
         */
        else {
            element = element + c;
            // toggle inBetweenQuotes at not escaped '"'
            if ((c == '"') && (b != '\\'))
                inBetweenQuotes = !inBetweenQuotes;
            lc++;
            b = c;
        }
    }
    if (!element.isEmpty() && inBetweenQuotes)
        list.add(element.substring(0, element.length() - 1) + "\"");
    else if (!element.isEmpty())
        list.add(element.substring(0, element.length() - 1));
    // put Vector to array.
    String[] ret = new String[list.size()];
    for (int i = 0; i < list.size(); i++)
        ret[i] = list.elementAt(i);
    return ret;
}
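For comparison, the escaping idea can be sketched much more compactly. This is not the method above: the class name `EscapedQuoteSplitter` is hypothetical, elements are trimmed as a simplifying assumption instead of skipping blanks after the delimiter, and the backslashes are left in the output rather than unescaped.

```java
import java.util.ArrayList;
import java.util.List;

final class EscapedQuoteSplitter {

    // Splits on the delimiter outside quotes. A '"' directly preceded by a
    // backslash is treated as escaped and does not toggle the quote state.
    static List<String> split(String text, char delimiter) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        char previous = ' ';
        for (char ch : text.toCharArray()) {
            if (ch == delimiter && !inQuotes) {
                parts.add(current.toString().trim()); // assumption: strip surrounding blanks
                current.setLength(0);
            } else {
                if (ch == '"' && previous != '\\') {
                    inQuotes = !inQuotes;
                }
                current.append(ch);
            }
            previous = ch;
        }
        parts.add(current.toString().trim());
        return parts;
    }

    public static void main(String[] args) {
        // The escaped form of the input: inner quotes are written as \"
        String input = "value 1, value 2, \" value 3,\\\" value 4, value 5 \\\" \", value 6";
        for (String part : split(input, ',')) {
            System.out.println("[" + part + "]");
        }
    }
}
```

With the inner quotes escaped, this yields four elements, and the third one keeps both of its commas. The sketch does not handle an escaped backslash directly before a quote (`\\"`); for real CSV data, a parser library, as suggested in the comments on the question, is the safer choice.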
Martin G