1

I want to print duplicate characters from a string using collections(Set) only.

I have written code but it will show the correct result if String is "ashish" but fails if the String is "ashish java" because of occurrence of char 'a' is three times.

public class DuplicateStringMethod {
    public static void duplicateString(String str) {
        char[] cArray = str.toCharArray();
        Set<Character> set = new HashSet<Character>();

        for(char c:cArray) {
            if(set.add(c)==false) {
                System.out.println(c);
            }
        }
    }

    public static void main(String[] args) {
        duplicateString("Java ashishj ");
    }
}

It will print a a s h. But I want a s h using only Set interface.

Stuart Marks
  • 127,867
  • 37
  • 205
  • 259
Ash28
  • 85
  • 1
  • 1
  • 11
  • 1
    a. Make use of Map and count the occurrence. b. Use another set where data is populated for the first time and you look into only when there is no character in other set c. Make use of char array O(1) space and O(1) time complexity – SMA Sep 24 '19 at 13:43
  • 1
    `if(set.add(c)==false)` should be written as `if(!set.add(c))`; just like `if(set.add(c)==true)` should be `if(set.add(c))`. – hfontanez Sep 24 '19 at 14:28

7 Answers7

1

Try that:

public static void duplicateString(String str) {
    Set<Character> firstTime = new HashSet<Character>();
    Set<Character> reported = new HashSet<Character>();

    char[] cArray = str.toCharArray();
    for(char c:cArray) {
        if (!firstTime.contains(c)) {
          firstTime.add(c);
          continue;
        }
        if (reported.contains(c)) { continue; }
        reported.add(c);
        System.out.println(c);
    }
}

Thanks to Holger's suggestion I ran some tests:

  • add: 52443260ns for 10000000 operations
  • contains: 28209745ns for 10000000 operations

Therefore the code above although not shorter is fastest.

Deian
  • 1,237
  • 15
  • 31
  • The `add` method of a `Set` will only add when the element is not already contained in the set and returns a `boolean` telling whether it added the element. Therefore, `contains` checks are obsolete. So you can simply use `for(char c: str.toCharArray()) { if(!firstTime.add(c) && reported.add(c)) System.out.println(c); }` – Holger Sep 24 '19 at 16:22
  • @Holger Yes if contains() and add() are computationally equal your suggestion is totally correct. However quick test shows that add() is twice slower from the contains() and hence the code is better off as it is. – Deian Sep 24 '19 at 19:33
  • Just want understand this condition if (!firstTime.contains(c)) { firstTime.add(c); So, the set 'FirstTime is empty at first then how can you use contains functions and check .add(c)?? Please explain me – Ash28 Sep 25 '19 at 06:29
  • @user3035566 the `!` operator means “not”. – Holger Sep 25 '19 at 07:22
  • @Deian I have strong doubts about the validity of your (undocumented) test. Since you are going to add the element anyway when not contained, only the tests for already contained elements matters. Then, it doesn’t look like you considered all points of [How do I write a correct micro-benchmark in Java?](https://stackoverflow.com/q/504103/2711488). And even if `add` was always almost two times slower than `contains`, for the question’s example string `"ashish java"`, your code performs 15 `contains` checks and 10 `add` operations, compared to the 15 `add` operations of my solution. Do the math… – Holger Sep 25 '19 at 07:39
1

Check this program

public static void duplicateString(String str) {

        char[] cArray = str.replaceAll("\\s+", "").toCharArray();

        Set<Character> set = new HashSet<Character>();
        Set<Character> alreadyExistingSet = new HashSet<Character>();

        for (char c : cArray) {
            if (set.add(c) == false && alreadyExistingSet.add(c) == true) {
                System.out.print(c);
            }
        }
    }
  • Can you explain me this line of code: if (set.add(c) == false && alreadyExistingSet.add(c) == true) { System.out.print(c); So at first, 'a' will be taken and check condition set.add(a) == false which is not satisfying and alreadyExistingSet.add(c) == true which is satisfying so it will check one by one all characters from string "Java ashishj" but, what will happen when it checks character 's' which is second time. – Ash28 Sep 25 '19 at 07:10
  • Don’t use `…== false` or `…== true`. For the first, use `!`, for the second, just remove the obsolete `== true`. – Holger Sep 25 '19 at 07:56
1

All you need to do is use the add() method of the Set class to tell you if the thing being inserted (added) is already part of the set. When the function returns false it means the current "thing" being added is a duplicate. Then, add that to a Set of duplicates. This way, items duplicated more than once, will only show once in the new set. Lastly, to conserve the order, you may want to use a LinkedHashSet to store duplicates.

public class TestDups {

    public static void main (String[] args) {
        String str = "Java ashishj ";
        Set<Byte> myset = new TreeSet<>();
        Set<Character> dups = new LinkedHashSet<>();
        for (byte c: str.getBytes() ) {
            if (!myset.add(c)) {
                dups.add((char)c);
            }
        }

        dups.stream().forEach(System.out::print);
    }
}

The output of the code above is "ash ". Notice the white space at the end since the original String contains two spaces (between words and at the end).

hfontanez
  • 5,774
  • 2
  • 25
  • 37
1

It's not entirely clear to me what's required by "using only the Set interface" but I'll assume that this means that the duplicate characters are to be returned in a Set. There are several ways of doing this. The first is the straightforward loop over the chars of the input string. It takes advantage of the feature of Set.add which returns true if the set was modified and false if it wasn't; that means that an add operation that returns false is a duplicate.

static Set<Character> dups0(String input) {
    Set<Character> dups = new HashSet<>();
    Set<Character> seen = new HashSet<>();
    for (char ch : input.toCharArray()) {
        if (! seen.add(ch)) {
            dups.add(ch);
        }
    }
    return dups;
}

There's a streamy way to do this, which is essentially the same thing expressed in stream form:

static Set<Character> dups1(String input) {
     Set<Character> seen = new HashSet<>();
     return input.chars()
                 .mapToObj(ch -> (char)ch)
                 .filter(ch -> !seen.add(ch))
                 .collect(toSet());
}

Some people might find this distasteful, as its filter operation performs side effects. In addition, if this is run in parallel, the result would need to be something like a ConcurrentHashMap.newKeySet.

An alternative is to generate a frequency table of characters and remove all the entries that occur just once:

static Set<Character> dups2(String input) {
     Map<Character, Long> map = input.chars()
                                     .mapToObj(i -> (char)i)
                                     .collect(groupingBy(ch -> ch, HashMap::new, counting()));
     map.values().removeIf(v -> v == 1);
     return map.keySet();
}

Note that this uses a collections bulk mutation operation on the values-collection view of the map. To ensure that the map is mutable, I've used the three-arg overload of groupingBy to specify the map's implementation type.

If you don't like mutation, there's a pure-streams way to do this:

static Set<Character> dups3(String input) {
    Map<Character, Long> map = input.chars()
                                    .mapToObj(i -> (char)i)
                                    .collect(groupingBy(ch -> ch, counting()));
    return map.entrySet().stream()
              .filter(entry -> entry.getValue() > 1)
              .map(Map.Entry::getKey)
              .collect(toSet());
}
Stuart Marks
  • 127,867
  • 37
  • 205
  • 259
1

Use one more set to store the duplicate element and print the element. Try like this:

public static void duplicateString(String str) {
        str=str.replaceAll(" ","");
        char[] cArray = str.toCharArray();
        Set<Character> set = new HashSet<Character>();
        Set<Character> set1 = new HashSet<Character>();
        for(char c:cArray) {
            if(set.add(c)==false) {
                if(set1.add(c) == true)
                    System.out.println(c);
            }
        }
    }
Madhubala
  • 46
  • 1
  • 7
0
public static void main(String[] args) {
        String s = "Javaashishj";
        char[] cArray = s.toCharArray();
        Set<Character> set = new HashSet<Character>();
        for (char c : cArray) {
            if (!set.contains(c)) {
                set.add(c);
                System.out.println(c);
            }
        }
    }
0

You can make use of String.split() here. This method splits the string up into an array of strings based on the regular expression provided. We will use a "" because we want to split the string after every single character, and then input the results into a stream.

public static void duplicateString( String str ) {

    // collect all characters in a map, whose String value is the character and whose key value is the count of occurrences in the string
    Map<String,Long> charCountMap = Arrays.stream( str.split( "" ) )
            .filter( charInString -> !charInString.equals( " " ) ) // don't want spaces
            .collect( Collectors.groupingBy( Function.identity(), Collectors.counting() ) );

    charCountMap.entrySet()
            .stream()
            .filter( entrySet -> entrySet.getValue() > 1 ) // filter out occurrences that are one or less
            .forEach( entrySet -> System.out.println( String.format( "Char %s appeared %d times", entrySet.getKey(), entrySet.getValue() ) ) );
}
RobOhRob
  • 585
  • 7
  • 17