I have this String:

 String string="NNP,PERSON,true,?,IN,O,false,pobj,NNP,ORGANIZATION,true,?,p";

How can I split it into an array at every fourth comma? I would like something like this:

     String[] a=string.split("d{4}");
     a[0]="NNP,PERSON,true,?";
     a[1]="IN,O,false,pobj";
     a[2]="NNP,ORGANIZATION,true,?";
     a[3]="p";
Ishtar
Enzo
  • You could either use a regex, or split on "," and then join the pieces back together –  Apr 13 '14 at 15:32
  • Regexes are fancy, but they can be cryptic, so document what yours does; otherwise it is a pain to figure out, especially when you or someone else looks at the code after some time. Also, processing a (complex) regex will probably take more time than splitting and grouping the pieces back together, as @BrendanRius recommended. – Marko Gresak Apr 13 '14 at 15:53

4 Answers

Keep it simple. There is no need for a regex. Simply count the commas; when four commas have been found, use String.substring() to extract the value.

Finally, store the printed values in an ArrayList<String>.

    String string = "NNP,PERSON,true,?,IN,O,false,pobj,NNP,ORGANIZATION,true,?,p";

    int count = 0;      // commas seen in the current chunk
    int beginIndex = 0; // first character of the current chunk
    int endIndex = 0;   // index of the character being examined
    for (char ch : string.toCharArray()) {
        if (ch == ',') {
            count++;
        }
        if (count == 4) {
            // endIndex is on the fourth comma: print the chunk without it
            System.out.println(string.substring(beginIndex, endIndex));
            beginIndex = endIndex + 1;
            count = 0;
        }
        endIndex++;
    }

    // print whatever is left after the last complete chunk
    if (beginIndex < endIndex) {
        System.out.println(string.substring(beginIndex, endIndex));
    }

output:

    NNP,PERSON,true,?
    IN,O,false,pobj
    NNP,ORGANIZATION,true,?
    p
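
The snippet prints each chunk rather than collecting it. A minimal sketch of the list-collecting variant the answer mentions could look like the following; the class name `CommaChunks` and the method name `splitEveryFourthComma` are hypothetical and not part of the original post.

```java
import java.util.ArrayList;
import java.util.List;

class CommaChunks {
    // Hypothetical helper: cuts s at every fourth comma and drops that comma.
    static List<String> splitEveryFourthComma(String s) {
        List<String> chunks = new ArrayList<>();
        int commas = 0; // commas seen in the current chunk
        int begin = 0;  // index of the first character of the current chunk
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == ',' && ++commas == 4) {
                chunks.add(s.substring(begin, i));
                begin = i + 1;
                commas = 0;
            }
        }
        if (begin < s.length()) {
            // leftover that contains fewer than four commas
            chunks.add(s.substring(begin));
        }
        return chunks;
    }

    public static void main(String[] args) {
        String string = "NNP,PERSON,true,?,IN,O,false,pobj,NNP,ORGANIZATION,true,?,p";
        for (String chunk : splitEveryFourthComma(string)) {
            System.out.println(chunk);
        }
    }
}
```

Returning a List keeps the caller free to print, index, or convert the chunks as needed.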
Braj

If you really have to use split you can use something like

String[] array = string.split("(?<=\\G[^,]{1,100},[^,]{1,100},[^,]{1,100},[^,]{1,100}),");

The idea is explained in my previous answer on a similar but simpler topic.

Demo:

String string = "NNP,PERSON,true,?,IN,O,false,pobj,NNP,ORGANIZATION,true,?,p";
String[] array = string.split("(?<=\\G[^,]{1,100},[^,]{1,100},[^,]{1,100},[^,]{1,100}),");
for (String s : array)
    System.out.println(s);

output:

NNP,PERSON,true,?
IN,O,false,pobj
NNP,ORGANIZATION,true,?
p

But if you don't have to use split and still want a regex, I encourage you to use the Pattern and Matcher classes with a simple regex that finds the parts you are interested in, rather than a complicated regex that finds the parts you want to get rid of. I mean something that matches

  1. any xx,xxx,xxx,xxx part where x is not ,
  2. any xx or xx,xx or xxx,xxx,xxx part placed at the end of the string (to catch the rest of the data left unmatched by the regex from point 1.)

So

Pattern p = Pattern.compile("[^,]+(,[^,]+){3}|[^,]+(,[^,]+){0,2}$");

should do the trick.
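
A short sketch of how that pattern could be driven with Matcher.find() is shown here. The class name `GroupMatcher` and the method name `findGroups` are made up for illustration; only the regex itself comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupMatcher {
    // Regex from the answer: four comma-separated fields,
    // or a shorter run of one to three fields at the end of the input.
    private static final Pattern GROUP =
            Pattern.compile("[^,]+(,[^,]+){3}|[^,]+(,[^,]+){0,2}$");

    // Hypothetical helper: collects every match in order of appearance.
    static List<String> findGroups(String s) {
        List<String> groups = new ArrayList<>();
        Matcher m = GROUP.matcher(s);
        while (m.find()) {
            groups.add(m.group());
        }
        return groups;
    }

    public static void main(String[] args) {
        String string = "NNP,PERSON,true,?,IN,O,false,pobj,NNP,ORGANIZATION,true,?,p";
        for (String group : findGroups(string)) {
            System.out.println(group);
        }
    }
}
```

Because every field is matched with `[^,]+`, this sketch assumes there are no empty fields (no two commas in a row); input with empty fields would be grouped differently.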


Another solution, probably the fastest (and quite easy to write), is to create your own parser that iterates over the characters of the string, stores them in a buffer, and counts how many commas have occurred so far. At every fourth comma it writes the buffer's content to an array (or better, a dynamic collection such as a list) and clears the buffer. Such a parser can look like this:

// requires: import java.util.ArrayList; import java.util.List;
public static List<String> parse(String s){
    List<String> tokens = new ArrayList<>();
    StringBuilder sb = new StringBuilder();
    int commaCounter = 0;

    for (char ch: s.toCharArray()){
        if (ch==',' && ++commaCounter == 4){
            // fourth comma: store the buffered group and start a new one
            tokens.add(sb.toString());
            sb.setLength(0);
            commaCounter = 0;
        }else{
            sb.append(ch);
        }
    }
    // store the remaining, possibly shorter, group
    if (sb.length()>0)
        tokens.add(sb.toString());

    return tokens;
}

You can convert the List to an array later if you need to, but I would stay with the List.
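
If an array really is required, the conversion is a single call to List.toArray. The sketch below uses a hard-coded list as a stand-in for the parser's result; the class name `TokensToArray` and the helper `toArray` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class TokensToArray {
    // Hypothetical helper: copies the list into a new String[] of the same size.
    static String[] toArray(List<String> tokens) {
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Stand-in for the list a parser like the one above would return.
        List<String> tokens = Arrays.asList(
                "NNP,PERSON,true,?", "IN,O,false,pobj", "NNP,ORGANIZATION,true,?", "p");
        String[] a = toArray(tokens);
        for (String s : a) {
            System.out.println(s);
        }
    }
}
```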

Community
Pshemo

Edited. Try this:

String str = "NNP,PERSON,true,?,IN,O,false,pobj,NNP,ORGANIZATION,true,?,p";
String[] arr = str.split(",");
ArrayList<String> result = new ArrayList<String>();
StringBuilder group = new StringBuilder();
for (int i = 0; i < arr.length; i++) {
    if (i % 4 != 0) {
        // still inside a group: separate the tokens with a comma
        group.append(',');
    } else if (i > 0) {
        // four tokens collected: store the group and start a new one
        result.add(group.toString());
        group.setLength(0);
    }
    group.append(arr[i]);
}
// store the last group, which may hold fewer than four tokens
result.add(group.toString());

for (String s : result)
    System.out.println(s);

output:

    NNP,PERSON,true,?
    IN,O,false,pobj
    NNP,ORGANIZATION,true,?
    p
Mohsen Kamrani
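
// requires: import java.util.StringTokenizer;
// note: StringTokenizer skips empty tokens (e.g. between two adjacent commas)
StringTokenizer tizer = new StringTokenizer(string, ",");
int total = tizer.countTokens();   // read once, before any token is consumed
int count = total / 4;             // number of complete groups of four
int overflowCount = total % 4;     // tokens left over after the complete groups
String[] a;
if (overflowCount > 0)
    a = new String[count + 1];
else
    a = new String[count];
int x = 0;
for (; x < count; x++) {
    a[x] = tizer.nextToken() + "," + tizer.nextToken() + "," + tizer.nextToken() + "," + tizer.nextToken();
}
if (overflowCount > 0) {
    // join the leftover tokens into the last slot, without a trailing comma
    StringBuilder rest = new StringBuilder(tizer.nextToken());
    while (tizer.hasMoreTokens()) {
        rest.append(',').append(tizer.nextToken());
    }
    a[x] = rest.toString();
}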
Abu Sulaiman