1

I want to split a key on "-", and the result depends on how many values the key contains.

  • If the key contains exactly 4 values
    e.g. key = A-B-C-D, the answer should be A,B,C,D

  • If the key contains more than 4 values
    e.g. key = A-B-C-D-E-F, the answer should be A-B-C,D,E,F

Please find my attempt below:

String key = "A-B-C-D-E-F";
String[] res = key.split("(?<!^[^-]*)-");
System.out.println(Arrays.toString(res));

My output is A-B,C,D,E,F, but I expect A-B-C,D,E,F.

The number of dashes varies from key to key. Whatever the input, the split should return at most four values.

Please check and let me know how to do this.

nhahtdh
Abdul
  • `(?<!^[^_]*)_` won't work in Java because lookbehind doesn't allow quantifiers. – anubhava Apr 08 '16 at 10:44
  • 1
    @anubhava: [It works though](https://ideone.com/2W3PHE). – Wiktor Stribiżew Apr 08 '16 at 10:45
  • 1
    I'm surprised, even PCRE won't allow that. – anubhava Apr 08 '16 at 10:46
  • 1
    Java lookbehind does allow quantifiers, but only if the length of the match is bounded afaik (like a curly quantifier with an upper bound); still, I am surprised it works with star. – guido Apr 08 '16 at 10:52
  • 1
    Right, I know it works with `{0,999}`, but with `*` it is totally surprising. – anubhava Apr 08 '16 at 10:53
  • 1
    I guess we should ask @nhahtdh when he is online. – Wiktor Stribiżew Apr 08 '16 at 10:53
  • 1
    Even [the reference](http://www.regular-expressions.info/lookaround.html#limitbehind) confirms @anubhava's suspicion... – HamZa Apr 08 '16 at 10:54
  • 1
    After pinging [nhahtdh](http://chat.stackoverflow.com/transcript/message/29834592#29834592) he could point me in the right direction. [This answer seems to explain our confusion...](http://stackoverflow.com/a/1537370) – HamZa Apr 08 '16 at 11:01
  • 4
    @WiktorStribiżew: This is a known issue in Java: https://stackoverflow.com/questions/1536915/regex-look-behind-without-obvious-maximum-length-in-java Java officially does not support variable length look-behind, but due to the way it implements `*` and the way it checks the length of the pattern in the look-behind, some cases are let through. – nhahtdh Apr 08 '16 at 11:01
  • Is there any other way to do it? – Abdul Apr 08 '16 at 11:03
  • Thanks for your support buddy :) – Abdul Apr 08 '16 at 11:06
  • You cannot use the `split` like that in this case, since the lookbehind will become of unknown width. – Wiktor Stribiżew Apr 08 '16 at 11:17
  • I will try to find another way to do it using regex, or I will use Java code to achieve it :) – Abdul Apr 08 '16 at 11:20
  • 1
    @Abdul: I hope you will review the previous questions and give credit to people who answered you. – Wiktor Stribiżew Apr 08 '16 at 14:01
  • @Wiktor Stribiżew: I will give them credit for their effort to support me ... – Abdul Apr 08 '16 at 14:03
  • 1
    @Wiktor Stribiżew :- I checked and accepted their answers :) – Abdul Apr 08 '16 at 14:32

3 Answers

2

Use

// Requires java.util.ArrayList, java.util.Arrays and java.util.List
//String key = "A-B-C-D";   // => [A, B, C, D]
String key = "A-B-C-D-E-F"; // => [A-B-C, D, E, F]
int max = 4;                // maximum number of values to return
String[] res = key.split("-");
if (res.length > max) {
    // Number of leading elements to merge so that exactly max values remain
    int keep = res.length - (max - 1);
    String first = String.join("-", Arrays.asList(res).subList(0, keep));
    List<String> lst = new ArrayList<>();
    lst.add(first);
    lst.addAll(Arrays.asList(res).subList(keep, res.length));
    res = lst.toArray(new String[0]);
}
System.out.println(Arrays.toString(res));

See the IDEONE demo

Basically, I suggest splitting first and then checking how many elements there are. If there are too many, join as many of the leading elements as needed into a single value, and then append the remaining elements to it.
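The split-then-merge idea can also be wrapped in a small helper. This is only a sketch: the class name `KeySplitter`, the method name `splitKeepingTail`, and the `maxParts` parameter are made up for illustration, and it assumes `maxParts` is at least 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class KeySplitter {
    // Hypothetical helper: split on '-' and, if there are too many parts,
    // merge the leading ones so that exactly maxParts values remain.
    static String[] splitKeepingTail(String key, int maxParts) {
        String[] parts = key.split("-");
        if (parts.length <= maxParts) {
            return parts; // nothing to merge
        }
        // Number of leading parts that are joined back into one value
        int keep = parts.length - (maxParts - 1);
        List<String> result = new ArrayList<>();
        result.add(String.join("-", Arrays.asList(parts).subList(0, keep)));
        result.addAll(Arrays.asList(parts).subList(keep, parts.length));
        return result.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitKeepingTail("A-B-C-D", 4)));     // [A, B, C, D]
        System.out.println(Arrays.toString(splitKeepingTail("A-B-C-D-E-F", 4))); // [A-B-C, D, E, F]
    }
}
```

Because the number of merged parts is computed from the input length, the same call works for keys with any number of dashes.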

Wiktor Stribiżew
0

You can also do this with a regex replacement:

public static void main(String[] args) {
    String input[] = { "A-B-C-D", "A-B-C-D-E-F-E-G", "AAAA-BBB-CCC-DD-EE", "BB-CC-DD-EE" };

    for (String str : input) {
        String output = str.replaceAll("(.*)-([\\w]+?)-([\\w]+?)-([\\w]+?)$", "$1 , $2 , $3 , $4");

        System.out.println("[" + str + "]\t\t\t=> [" + output + "]");
    }
}

OUTPUT:

[A-B-C-D]               => [A , B , C , D]
[A-B-C-D-E-F-E-G]       => [A-B-C-D-E , F , E , G]
[AAAA-BBB-CCC-DD-EE]    => [AAAA-BBB , CCC , DD , EE]
[BB-CC-DD-EE]           => [BB , CC , DD , EE]
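If an array is needed rather than a formatted string, the same capture groups can be read through a `Matcher`. The following is a rough sketch, not part of the original answer: the class and method names are invented, the pattern is a simplified but equivalent form of the one above, and returning the whole key as a single element when nothing matches is an assumption made for this example.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastFourGroups {
    // Greedy prefix, followed by the last three dash-separated words
    private static final Pattern LAST_FOUR =
            Pattern.compile("(.*)-(\\w+)-(\\w+)-(\\w+)");

    // Hypothetical helper: returns the four captured groups as an array.
    // Assumption: a key that does not match is returned as a single element.
    static String[] lastFour(String key) {
        Matcher m = LAST_FOUR.matcher(key);
        if (!m.matches()) {
            return new String[] { key };
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(lastFour("A-B-C-D")));         // [A, B, C, D]
        System.out.println(Arrays.toString(lastFour("A-B-C-D-E-F-E-G"))); // [A-B-C-D-E, F, E, G]
    }
}
```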
Mahendra
0

Since you want at most four values after splitting, and the splitting starts from the end of the string, you can split on the following regex:

key.split("-(?!([^-]+-){3})");

The regex splits on a dash only when it cannot find 3 more dashes ahead of it, so the string is split at its last 3 dashes. Assuming that the input string contains at least 3 dashes and does not end with a dash, the resulting array will have exactly 4 values.
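A small runnable sketch of this lookahead split is shown below. The class name, the method name, and the `maxParts` parameter are hypothetical; the parameter only generalizes the `{3}` in the regex to `maxParts - 1`, and it is assumed to be at least 1.

```java
import java.util.Arrays;

class LookaheadSplit {
    // Hypothetical helper: split on a dash unless (maxParts - 1) more
    // dash-terminated segments follow it, i.e. split only at the last
    // (maxParts - 1) dashes. For maxParts = 4 this is "-(?!([^-]+-){3})".
    static String[] splitAtLastDashes(String key, int maxParts) {
        String regex = "-(?!([^-]+-){" + (maxParts - 1) + "})";
        return key.split(regex);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitAtLastDashes("A-B-C-D", 4)));     // [A, B, C, D]
        System.out.println(Arrays.toString(splitAtLastDashes("A-B-C-D-E-F", 4))); // [A-B-C, D, E, F]
    }
}
```

Only a lookahead is involved here, so the approach does not depend on the variable-length lookbehind behavior discussed in the comments on the question.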

nhahtdh