1

Hi, I have a string like the following:

name,number,address(line1,city),status,contact(id,phone(number,type),email(id),type),closedate

I need to produce the following output:

name,number,address.line1,address.city,status,contact.id,contact.phone.number,contact.phone.type,contact.email.id,contact.type,closedate

Is it possible to do this with a regex in Java? The approach I have in mind uses string manipulation (substring, recursion, etc.). Is there a simpler way to achieve this? I would prefer a regular expression that works in Java, but other suggestions are also welcome. For context: the string above arrives as a query parameter, and I have to work out which columns to select based on it, so each item in the output will have a corresponding column name in a properties file.

Thanks, Pal

Avinash Raj

2 Answers

1
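import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class Main {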


    public static void main(String[] args) {
        String input ="name,number,address(line1,test(city)),status,contact(id,phone(number,type),email(id),type),closedate";
        List<String> list = new ArrayList<String>(Arrays.asList(input.split(","))); // We need a list for the iterator (or ArrayIterator)
        List<String> result = new Main().parse(list);
        System.out.println(String.join(",", result));
    }

    private List<String> parse(List<String> inputString){
        Iterator<String> it = inputString.iterator();
        ArrayList<String> result = new ArrayList<>();
        while(it.hasNext()){
            String word = it.next();
            if(! word.contains("(")){
                result.add(word);
            } else { // if we come across a "(", start the recursion and parse it till we find the matching ")"
                result.addAll(buildDistributedString(it, word,""));
            }
        }

        return result;
    }

    /**
     * Recursively parses one parenthesized group.
     * @param it iterator over the remaining comma-separated tokens
     * @param startword the first token of the group: the new prefix, the "(" and the first word under that prefix
     * @param prefix concatenation of the prefixes from the enclosing recursion levels
     */
    private List<String> buildDistributedString(Iterator<String> it, String startword,String prefix){

        ArrayList<String> result = new ArrayList<>();
        String[] splitted = startword.split("\\(");
        prefix += splitted[0]+".";

        if(splitted[1].contains(")")){ // if the "(" is closed within the same token, return only this one item
            result.add(prefix+splitted[1].substring(0,splitted[1].length()-1));
            return result;
        } else {
            result.add(prefix+splitted[1]);
        }

        while(it.hasNext()){
            String word = it.next();
            if( word.contains("(")){ // go deeper in the recursion
                List<String> stringList = buildDistributedString(it, word, prefix);
                if(stringList.get(stringList.size()-1).contains(")")){
                    // if multiple ")"'s were found in the same word, go up multiple recursion levels
                    String lastString = stringList.remove(stringList.size()-1);
                    stringList.add(lastString.substring(0,lastString.length() -1));
                    result.addAll(stringList);
                    break;
                }
                result.addAll(stringList);
            } else if(word.contains(")")) { // end this recursion level
                result.add(prefix + word.substring(0,word.length()-1)); // ")" is always the last char
                break;
            } else {
                result.add(prefix+word);
            }
        }
        return result;
    }
}

I wrote a quick parser for this. There probably are some improvements possible, but this should give you an idea. It was just meant to get a working version asap.

Soronbe
  • You should mention that this only works with Java 8 – Baby Mar 24 '15 at 03:13
  • Not sure, but I think it would also work with Java 7. Even if not, I don't think it would be hard to rewrite the parts that don't. – Soronbe Mar 24 '15 at 03:18
  • `String.join()` was only introduced in Java 8, and the OP (or anybody else) might not know about it and assume your code doesn't work – Baby Mar 24 '15 at 03:25
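For anyone stuck on Java 7, the `String.join()` call discussed in the comments can be replaced by a small hand-written helper. The following is only a sketch; the class name `Java7Join` and its `join` method are hypothetical names chosen for illustration and are not part of the answer above.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class Java7Join {

    // Concatenates the parts, putting the separator between neighbours only.
    static String join(String separator, List<String> parts) {
        StringBuilder sb = new StringBuilder();
        Iterator<String> it = parts.iterator();
        while (it.hasNext()) {
            sb.append(it.next());
            if (it.hasNext()) {
                sb.append(separator);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("name", "address.line1", "closedate");
        System.out.println(join(",", columns)); // prints name,address.line1,closedate
    }
}
```

With this helper, the line `System.out.println(String.join(",", result));` in the answer could become `System.out.println(Java7Join.join(",", result));`.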
0

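Because your string contains nested parentheses, a regular expression alone can't do the job: java.util.regex has no recursion or balancing construct, so a single pattern cannot match parentheses nested to arbitrary depth. The full explanation is involved and requires some knowledge of context-free grammars. See Can regular expressions be used to match nested patterns?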
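What a pattern lacks is memory of which groups are currently open, and an explicit stack supplies exactly that. Below is a minimal single-pass sketch, assuming the input uses only `,`, `(` and `)` as delimiters and contains no whitespace or escaping. The class name `StackFlattener` is made up for illustration, and `String.join` requires Java 8.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class StackFlattener {

    // Flattens "a(b,c(d))" into "a.b,a.c.d" by keeping the open group names on a stack.
    static String flatten(String input) {
        Deque<String> prefixes = new ArrayDeque<>(); // outermost group first, innermost last
        List<String> out = new ArrayList<>();
        StringBuilder token = new StringBuilder();

        for (char c : input.toCharArray()) {
            switch (c) {
                case '(':
                    // The token before '(' names a group, not a column.
                    prefixes.addLast(token.toString());
                    token.setLength(0);
                    break;
                case ',':
                case ')':
                    emit(prefixes, token, out);
                    if (c == ')') {
                        if (prefixes.isEmpty()) {
                            throw new IllegalArgumentException("unbalanced ')'");
                        }
                        prefixes.removeLast(); // leave the innermost group
                    }
                    break;
                default:
                    token.append(c);
            }
        }
        if (!prefixes.isEmpty()) {
            throw new IllegalArgumentException("missing ')'");
        }
        emit(prefixes, token, out); // last token has no trailing delimiter
        return String.join(",", out);
    }

    // Adds the pending token, qualified by all open group names, to the output.
    private static void emit(Deque<String> prefixes, StringBuilder token, List<String> out) {
        if (token.length() == 0) {
            return; // nothing pending, e.g. a ',' directly after ')'
        }
        StringBuilder path = new StringBuilder();
        for (String p : prefixes) {
            path.append(p).append('.');
        }
        out.add(path.append(token).toString());
        token.setLength(0);
    }

    public static void main(String[] args) {
        System.out.println(flatten("name,address(line1,city),contact(phone(number,type)),closedate"));
        // prints name,address.line1,address.city,contact.phone.number,contact.phone.type,closedate
    }
}
```

The stack depth equals the current nesting level, which is the piece of state a plain regular expression has no way to track.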

I've heard that some regex engines can handle this kind of parsing through callbacks, but I believe Java's regex engine doesn't offer that.

Parser generators like JavaCC would do the job, but that's a huge overkill for the task you are describing.

I recommend looking into java.util.Scanner and recursively calling your parse method whenever you see a left parenthesis.
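The recursive idea could look roughly like the sketch below. Note that it reads the string through a plain character index rather than java.util.Scanner, which is a substitution on my part to keep the example short. The class name `RecursiveFlattener` and its method names are hypothetical, and the input is assumed to contain no whitespace or escaping.

```java
import java.util.ArrayList;
import java.util.List;

final class RecursiveFlattener {
    private final String src;
    private int pos = 0; // index of the next unread character

    private RecursiveFlattener(String src) {
        this.src = src;
    }

    static List<String> flatten(String input) {
        RecursiveFlattener parser = new RecursiveFlattener(input);
        List<String> out = new ArrayList<>();
        parser.parseList("", out);
        if (parser.pos != input.length()) {
            throw new IllegalArgumentException("unexpected ')' at index " + parser.pos);
        }
        return out;
    }

    // list := item (',' item)*
    private void parseList(String prefix, List<String> out) {
        parseItem(prefix, out);
        while (pos < src.length() && src.charAt(pos) == ',') {
            pos++; // consume ','
            parseItem(prefix, out);
        }
    }

    // item := name | name '(' list ')'
    private void parseItem(String prefix, List<String> out) {
        int start = pos;
        while (pos < src.length() && "(),".indexOf(src.charAt(pos)) < 0) {
            pos++;
        }
        String name = src.substring(start, pos);
        if (pos < src.length() && src.charAt(pos) == '(') {
            pos++; // consume '(' and recurse with the longer prefix
            parseList(prefix + name + ".", out);
            if (pos >= src.length() || src.charAt(pos) != ')') {
                throw new IllegalArgumentException("missing ')'");
            }
            pos++; // consume ')'
        } else {
            out.add(prefix + name); // a leaf: emit the fully qualified column
        }
    }

    public static void main(String[] args) {
        System.out.println(flatten("name,address(line1,city),contact(phone(number,type)),closedate"));
        // prints [name, address.line1, address.city, contact.phone.number, contact.phone.type, closedate]
    }
}
```

Each call to `parseItem` owns exactly one matching pair of parentheses, so the call stack does the bookkeeping and a group whose first child is itself a group, such as `a(b(c))`, needs no special case.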

Ming-Tang