
I am trying to create a method that either removes all consecutive duplicate characters from a string or keeps at most two of the same character in a row, depending on a parameter.

For example:

helllllllo -> helo

or

helllllllo -> hello - This keeps double letters

Currently I remove duplicates by doing:

private String removeDuplicates(String word) {
    StringBuffer buffer = new StringBuffer();
    for (int i = 0; i < word.length(); i++) {
        char letter = word.charAt(i);
        if (buffer.length() == 0 || letter != buffer.charAt(buffer.length() - 1)) {
            buffer.append(letter);
        }
    }
    return buffer.toString();
}

If I want to keep double letters, I was thinking of having a method like private String removeDuplicates(String word, boolean doubleLetter).

When doubleLetter is true it will return hello, not helo.

I'm not sure of the most efficient way to do this without duplicating a lot of code.

Decrypter
  • StringBuilder could be a little better – BlackJoker Apr 17 '13 at 07:47
  • You could also simply increment a counter in the if block, and add another if statement that only appends the letter if the counter is below a threshold. This would be a general version working with any number of duplicates (but you should make sure to reset the counter once a letter other than the last one is scanned). – Roland Ewald Apr 17 '13 at 07:48
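
The counter idea from the comment above could be sketched roughly as follows. The class name RunLimiter, the method name limitRuns, and the maxRun parameter are hypothetical names chosen for illustration; maxRun = 1 corresponds to removing all consecutive duplicates and maxRun = 2 to keeping double letters.

```java
final class RunLimiter {
    // Keeps at most maxRun copies from each run of identical adjacent characters.
    // maxRun is an assumed parameter: 1 removes duplicates, 2 keeps double letters.
    static String limitRuns(String word, int maxRun) {
        StringBuilder out = new StringBuilder();
        int runLength = 0; // length of the current run, including the current character
        for (int i = 0; i < word.length(); i++) {
            char letter = word.charAt(i);
            if (i > 0 && letter == word.charAt(i - 1)) {
                runLength++;   // same letter as before: the run continues
            } else {
                runLength = 1; // a different letter starts a new run
            }
            if (runLength <= maxRun) {
                out.append(letter);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(limitRuns("helllllllo", 1)); // helo
        System.out.println(limitRuns("helllllllo", 2)); // hello
    }
}
```

A boolean overload such as removeDuplicates(word, doubleLetter) could simply delegate to this method with maxRun set to 2 or 1, which avoids duplicating the loop.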

4 Answers


Why not just use a regex?

public class RemoveDuplicates {
    public static void main(String[] args) {
        System.out.println(new RemoveDuplicates().result("hellllo", false)); // helo
        System.out.println(new RemoveDuplicates().result("hellllo", true));  // hello
    }

    public String result(String input, boolean doubleLetter) {
        // Remove each character that is followed by one copy of itself,
        // or by two copies when double letters should be kept.
        String pattern = doubleLetter ? "(.)(?=\\1{2})" : "(.)(?=\\1)";
        return input.replaceAll(pattern, "");
    }
}

 (.)    --> matches any character and puts in group 1. 
 ?=     --> this is called a positive lookahead. 
 ?=\\1  --> positive lookahead for the first group

So overall, this regex looks for any character that is followed (positive lookahead) by itself, for example aa or bb. It is important to note that only the first character is actually part of the match, so in the word 'hello' only the first l is matched (the part (?=\1) is NOT part of the match). The first l is therefore replaced by an empty String and we are left with helo, which no longer matches the regex.

The second pattern is the same thing, but this time we look ahead for TWO occurrences of the first group, as in helllo, where only the first l is matched. On the other hand, 'hello' will not be matched.
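
As a rough sketch of how the two patterns generalize, the number of repetitions in the lookahead can be built from a parameter. The names LookaheadDemo, collapse, and keep are hypothetical and not part of the answer above; keep is assumed to be at least 1.

```java
final class LookaheadDemo {
    // Removes every character that is still followed by `keep` copies of itself,
    // so each run of identical characters is shortened to at most `keep`.
    // `keep` is an assumed parameter (>= 1): 1 gives helo, 2 gives hello.
    static String collapse(String input, int keep) {
        String pattern = "(.)(?=\\1{" + keep + "})";
        return input.replaceAll(pattern, "");
    }

    public static void main(String[] args) {
        System.out.println(collapse("hellllo", 1)); // helo
        System.out.println(collapse("hellllo", 2)); // hello
        System.out.println(collapse("hello", 2));   // hello (nothing matches)
    }
}
```

Note that . does not match line terminators by default, so runs of newlines are left as they are.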

Look here for a lot more: Regex

P.S. Feel free to accept the answer if it helped.

Eugene

Try:

    String s = "helllllllo";
    System.out.println(s.replaceAll("(\\w)\\1+", "$1"));

Output:

helo
Evgeniy Dorofeev

Try this; it should be the most efficient way [edited after comment]:

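public static String removeDuplicates(String str) {
    // Bit n of checker is set once the letter ('a' + n) has been seen.
    // Reliable only for the lowercase letters 'a' to 'z': an int has 32 bits,
    // so other characters can end up sharing a bit.
    int checker = 0;
    StringBuffer buffer = new StringBuffer();
    for (int i = 0; i < str.length(); ++i) {
        int val = str.charAt(i) - 'a';
        // Append the character only if its bit has not been set yet.
        if ((checker & (1 << val)) == 0) {
            buffer.append(str.charAt(i));
        }
        checker |= (1 << val);
    }
    return buffer.toString();
}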

I am using bits to identify uniqueness.

EDIT:

The whole logic is that once a character has been seen, its corresponding bit is set; the next time that character comes up, it is not added to the StringBuffer because the corresponding bit is already set.

Lokesh
  • the line `buffer.append(val)` must be substituted with the line `buffer.append(str.charAt(i))` (see demo: https://ideone.com/AtYH8a ) and the single quote around the 'a' are wrong, substitute them with the standard one ;) – Andrea Ligios Apr 17 '13 at 08:11
  • You're welcome. Please note that it has errors with more complex input Strings: https://ideone.com/FmiXEe should instead give https://ideone.com/HvogfI :/ Some fixing is needed – Andrea Ligios Apr 17 '13 at 08:16
  • I didn't get this. Can you elaborate, or give an example? – Lokesh Apr 17 '13 at 08:18
  • Compare the outputs; maybe you are removing all the characters present more than one time in the line, not all the characters doubled ASIDE... if so, then it's good – Andrea Ligios Apr 17 '13 at 08:21
  • I will test it further, but I feel it will work. The whole logic is that once a character has been seen, its corresponding bit is set, and the next time that character comes up it will not be added to the StringBuffer. So irrespective of the location of the character in the String, the logic should work. – Lokesh Apr 17 '13 at 08:26
  • It works exactly like that. While the other (regex) solutions will work by removing doubled characters ASIDE. Giving the source `aabbaa`, your solution will output `ab`, the regex solution will output `aba`. They're made for two slightly different tasks (no matter which is the one asked by OP, that is not clear tbh) – Andrea Ligios Apr 17 '13 at 08:30
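
The difference described in the comments above can be checked with a short sketch. Here a LinkedHashSet is swapped in for the bitmask, which is an assumption made so that characters outside 'a' to 'z' are also handled; the class and method names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class DedupComparison {
    // Keeps only the first occurrence of each character, wherever repeats appear.
    // Same idea as the bitmask answer, but using a set instead of bits.
    static String removeAllRepeats(String s) {
        Set<Character> seen = new LinkedHashSet<>();
        for (char c : s.toCharArray()) {
            seen.add(c); // insertion order is preserved, later duplicates are ignored
        }
        StringBuilder out = new StringBuilder();
        for (char c : seen) {
            out.append(c);
        }
        return out.toString();
    }

    // Collapses only adjacent repeats, as the regex answers do.
    static String collapseAdjacent(String s) {
        return s.replaceAll("(.)\\1+", "$1");
    }

    public static void main(String[] args) {
        System.out.println(removeAllRepeats("aabbaa")); // ab
        System.out.println(collapseAdjacent("aabbaa")); // aba
    }
}
```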

Taking this previous SO example as a starting point, I came up with this:

    String str1= "Heelllllllllllooooooooooo";
    
    String removedRepeated = str1.replaceAll("(\\w)\\1+", "$1");
    System.out.println(removedRepeated);
    
    String keepDouble = str1.replaceAll("(\\w)\\1{2,}", "$1");
    System.out.println(keepDouble);

It yields:

Helo

Heelo

What it does:

(\\w)\\1+ matches any word character (a letter, digit, or underscore) and places it in a regex capture group. The group is then referenced through \\1+, so the pattern matches a character followed by one or more repetitions of itself.

(\\w)\\1{2,} is the same as above, except that it matches only characters followed by two or more repetitions of themselves, that is, runs of three or more. This leaves double characters untouched.

EDIT: Re-read the question and it seems that you want to replace multiple characters by doubles. To do that, simply use this line:

String keepDouble = str1.replaceAll("(\\w)\\1+", "$1$1");
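
Putting the two replacements behind the signature from the question might look like the sketch below. The class name DuplicateRemover is hypothetical, and the choice of \\w (rather than .) is carried over from this answer, so non-word characters such as spaces are left alone.

```java
final class DuplicateRemover {
    // Collapses each run of two or more identical word characters to a single
    // character, or to a double when doubleLetter is true.
    static String removeDuplicates(String word, boolean doubleLetter) {
        String replacement = doubleLetter ? "$1$1" : "$1";
        return word.replaceAll("(\\w)\\1+", replacement);
    }

    public static void main(String[] args) {
        System.out.println(removeDuplicates("helllllllo", false)); // helo
        System.out.println(removeDuplicates("helllllllo", true));  // hello
    }
}
```

Only the replacement string differs between the two modes, so the pattern and the replaceAll call are shared.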

npinti