
I know how to remove duplicate characters from a String while keeping the first occurrences, without a regex:

String method(String s){
  String result = "";
  for(char c : s.toCharArray()){
    result += result.contains(c+"")
     ? ""
     : c;
  }
  return result;
}

// Example input: "Type unique chars!"
// Output:        "Type uniqchars!"

I know how to remove duplicate characters from a String while keeping the last occurrences, with a regex:

String method(String s){
  return s.replaceAll("(.)(?=.*\\1)", "");
}

// Example input: "Type unique chars!"
// Output:        "Typnique chars!"
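To see why this regex keeps the last occurrences, it can help to list what it actually matches: `(.)` captures one character, and the lookahead `(?=.*\\1)` only succeeds if that same character appears again later. The sketch below is illustrative only; the class name `LookaheadDemo` and the helper `removed` are hypothetical and not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // Hypothetical helper: returns "index:char" for every character
    // that replaceAll("(.)(?=.*\\1)", "") would delete.
    static List<String> removed(String s) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile("(.)(?=.*\\1)").matcher(s);
        while (m.find()) {
            // group(1) is the captured character; it matched only because
            // the lookahead found the same character further to the right.
            hits.add(m.start() + ":" + m.group(1));
        }
        return hits;
    }

    public static void main(String[] args) {
        // prints [3:e, 4: , 5:u]
        System.out.println(removed("Type unique chars!"));
    }
}
```

Only the earlier `e`, space, and `u` are matched, because each of them occurs again later; the final occurrence of every character has nothing after it to satisfy the lookahead, so it survives.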

As for my question: Is it possible, with a regex, to remove duplicated characters from a String, but keep the first occurrences instead of the last?


As for why I'm asking: I came across this codegolf answer using the following function (based on the first example above):

String f(char[]s){String t="";for(char c:s)t+=t.contains(c+"")?"":c;return t;}

and I was wondering whether this could be done in fewer bytes with a regex and a String input. But even if it turns out longer, I'm curious whether it's possible at all to remove duplicate characters from a String with a regex while keeping the first occurrence of each character.

Kevin Cruijssen
  • I can only suggest reversing a string, [`String g(StringBuilder s){return new StringBuilder(s.reverse().toString().replaceAll("(?s)(.)(?=.*\\1)", "")).reverse().toString();}`](https://ideone.com/9B7vIj). – Wiktor Stribiżew Mar 23 '17 at 12:03
  • @WiktorStribiżew Hmm, that's a smart approach. Start with the reversed String, use the regex, and revert it back again. I guess using the for-loop with characters is shorter though, but your function is still a nice approach. With some code-golfing it's 110 bytes: [`String h(StringBuffer s){return""+new StringBuffer((s.reverse()+"").replaceAll("(.)(?=.*\\1)","")).reverse();}`](https://ideone.com/1elfz0) – Kevin Cruijssen Mar 23 '17 at 15:57

1 Answer


This is not the shortest option, and it is not a pure regex solution, but it works: reverse the string, run the regex you already have, and then reverse the result back.

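public static String g(StringBuilder s){
  // Note: s.reverse() reverses the caller's StringBuilder in place.
  return new StringBuilder(
   s.reverse().toString()                   // reverse the input
     .replaceAll("(?s)(.)(?=.*\\1)", ""))   // drop each character that occurs again later
     .reverse().toString();                 // restore the original order
}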

See the online Java demo
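For a version that can be run directly, here is a minimal sketch of the same reverse, replace, reverse idea. It takes a String rather than a StringBuilder so the caller's object is not modified; the class name `KeepFirstDemo` and the method name `keepFirst` are made up for this illustration.

```java
class KeepFirstDemo {
    // Hypothetical helper: keeps the first occurrence of each character.
    static String keepFirst(String s) {
        // Reverse a copy, so the "last occurrence" in the reversed text
        // is the first occurrence in the original text.
        String reversed = new StringBuilder(s).reverse().toString();
        // Remove every character that appears again later in the reversed text.
        String deduped = reversed.replaceAll("(?s)(.)(?=.*\\1)", "");
        // Reverse back to restore the original order.
        return new StringBuilder(deduped).reverse().toString();
    }

    public static void main(String[] args) {
        // prints Type uniqchars!
        System.out.println(keepFirst("Type unique chars!"));
    }
}
```

This gives the same result as the loop-based method from the question for the example input, at the cost of two extra reversals.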

Note that I suggest adding (?s) (the inline form of the Pattern.DOTALL flag) to the regex so that . can match any character, including line breaks (by default, . does not match line terminators).
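The effect of (?s) is easiest to see on input that contains a newline. The following sketch applies the keep-last regex with and without the flag; the class name `DotAllDemo` and the helper `keepLast` are hypothetical names used only for this example.

```java
class DotAllDemo {
    // Hypothetical helper: runs the keep-last regex, optionally with (?s).
    static String keepLast(String s, boolean dotAll) {
        String regex = (dotAll ? "(?s)" : "") + "(.)(?=.*\\1)";
        return s.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String input = "a\nba";
        // Without (?s), .* in the lookahead cannot cross the newline,
        // so the first 'a' never "sees" the second one.
        // prints a\nba
        System.out.println(keepLast(input, false).replace("\n", "\\n"));
        // With (?s), the lookahead reaches the later 'a' and the first is removed.
        // prints \nba
        System.out.println(keepLast(input, true).replace("\n", "\\n"));
    }
}
```

Without the flag, duplicates separated by a line break are silently left in place, which is why the flag matters for multi-line input.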

Wiktor Stribiżew