0

I have a helper function which finds the indexes of duplicate characters in a String. Now, what's the best way to remove these duplicates? Thanks!

uoftcode

4 Answers

2

This is the best way I know to do it. It takes a string, separates it into characters, puts them into a LinkedHashSet (no repeats, insertion order preserved), and then prints the result (or could return the string).

This is the best way out of the ones listed

String example = "thiscode";
char[] chars = example.toCharArray();
Set<Character> str = new LinkedHashSet<Character>();
for (char c : chars) {
    str.add(c);
}

StringBuilder sb = new StringBuilder();
for (Character character : str) {
    sb.append(character);
}
System.out.println(sb.toString());

Alternatively:

public static String convert(String example){
    char[] chars = example.toCharArray();
    Set<Character> str = new LinkedHashSet<Character>();
    for (char c : chars) {
        str.add(c);
    }

    StringBuilder sb = new StringBuilder();
    for (Character character : str) {
        sb.append(character);
    }
    return sb.toString();
}

Another way to do it:

    String example = "thiscode";
    StringBuilder sb = new StringBuilder();                           // start empty, not seeded with example
    for (int i = 0; i < example.length(); i++)                        // iterate through the characters
        if (!sb.toString().contains(example.charAt(i) + ""))          // check whether it is already in the StringBuilder
            sb.append(example.charAt(i));                             // if not, add it
    example = sb.toString();                                          // take the result
    System.out.println(example);

Inefficient, but easy implementation

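String example = "thiscode";
String empty = "";
boolean[] alphabet = new boolean[26];   // one flag per letter; assumes the input contains only a-z / A-Z
for (char c : example.toCharArray()) {
    int index = Character.toLowerCase(c) - 'a';
    if (!alphabet[index]) {
        alphabet[index] = true;         // mark the letter as seen so later repeats are skipped
        empty += c;
    }
}
example = empty;
System.out.println(example);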

Hope this helps.

lacraig2
  • In terms of your "the best way", I have a similar answer, but there is no need to loop twice or to keep the set ordered: you can use the set.add method to check whether to append the character at the same time as adding it to the set. – Sean F Mar 17 '15 at 03:47
1

You can create a set of the characters already seen and use its add method, which returns false if the set already contains the value. That way there is no reason to loop over the elements more than once:

    String input = "somesortoftestwords";
    Set<Character> charSet = new HashSet<Character>();
    StringBuilder sb = new StringBuilder();
    for (char c : input.toCharArray()) {
        if (charSet.add(c)){
            sb.append(c);
        }
    }
    System.out.println(sb.toString());
Sean F
  • I was doing this as part of interview preparation and found HashSet won't work as it doesn't preserve the order. Use LinkedHashSet instead of HashSet to maintain the insertion order, otherwise {"bcdbdbcdbasbabccbdcbdsadas" -> "bcdas"} fails. – jaamit Feb 29 '16 at 04:07
  • @jaamit "bcdas" is the correct output. Also, as the hash set is only used to see if the character has been used, insertion order into it is completely irrelevant; the hash set is not used for output. – Sean F Feb 29 '16 at 09:19
  • I guess I should have given more details... what I meant was {"bcdbdbcdbasbabccbdcbdsadas" -> "bcdas"} was my unit test case and it failed when I used HashSet. It passed when I used LinkedHashSet: `Set<Character> set = new LinkedHashSet<>(); for (Character c : str.toCharArray()) { if (!set.contains(c)) set.add(c); } // set has unique elems - convert to output String for (Character c : set) { sb.append(c); }` – jaamit Feb 29 '16 at 16:11
  • @jaamit why are you creating a second for loop? In my example there is no need to loop over the HashSet. I don't get why you are looping a second time. Just so you can use the LinkedHashSet for output for some reason? – Sean F Mar 01 '16 at 05:07
  • I see your point... your code is crisp and clean and avoids looping twice. I missed that. Thanks – jaamit Mar 01 '16 at 05:37
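The point settled in the comments above can be checked with a small sketch. It assumes a hypothetical class name, SingleLoopDedup, and helper name, removeDuplicates; neither comes from the thread. The output order comes from the StringBuilder, which is appended to in input order, so the HashSet's own iteration order never matters. It would only matter if the set itself were iterated to build the result, as in the two-loop variant.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class and method names, used only for illustration.
class SingleLoopDedup {
    static String removeDuplicates(String input) {
        Set<Character> seen = new HashSet<>();   // membership check only, never iterated
        StringBuilder sb = new StringBuilder();
        for (char c : input.toCharArray()) {
            // add() returns true only the first time a character is seen
            if (seen.add(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The test case from the comments; prints "bcdas"
        System.out.println(removeDuplicates("bcdbdbcdbasbabccbdcbdsadas"));
    }
}
```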
0

Another Possible Solution:

  1. Convert the String into characters

    char[] charz = inputString.toCharArray();

  2. Sort the characters

    Arrays.sort(charz);

  3. Now use a loop and check for duplicates
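The three steps above could be sketched as follows. This is only one possible reading of step 3, and the class name SortedDedup and method name removeDuplicatesSorted are made up for illustration. Note that sorting means the result is in sorted order, not the original character order, so this only fits when order does not matter.

```java
import java.util.Arrays;

// Hypothetical class and method names, used only for illustration.
class SortedDedup {
    static String removeDuplicatesSorted(String inputString) {
        char[] charz = inputString.toCharArray();   // step 1: convert to characters
        Arrays.sort(charz);                         // step 2: sort them
        StringBuilder sb = new StringBuilder();
        // step 3: after sorting, equal characters are adjacent,
        // so keep a character only if it differs from the previous one
        for (int i = 0; i < charz.length; i++) {
            if (i == 0 || charz[i] != charz[i - 1]) {
                sb.append(charz[i]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints "defmorstw"
        System.out.println(removeDuplicatesSorted("somesortoftestwords"));
    }
}
```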

MJSG
0

In the comments on one of the answers here there was talk of removing chars from a StringBuilder without the changing indexes causing problems, so I wrote this. I'm not saying this is the best way to process a String; I would use the LinkedHashSet solution or something similar. (So don't vote down - or up for that matter :)

Here we loop up to the StringBuilder's length - 1, and for each char we check all chars AFTER it for duplicates and remove them. We only go to length - 1 because the last char has nothing after it to check. The length() - 1 calculation is re-evaluated on every iteration of the for loop, so it can't run past the end when chars are removed.

public String removeDuplicates(String string) {
    StringBuilder stringBuilder = new StringBuilder(string);
    for (int x=0;x<stringBuilder.length()-1;x++) {
        String character = Character.toString(stringBuilder.charAt(x));
        int i;
        while ((i = stringBuilder.indexOf(character, x+1)) != -1) {
            stringBuilder.replace(i, i+1, "");
        }
    }
    return stringBuilder.toString();
}
slipperyseal