3

This is what I have tried and somehow I get the feeling that this is not right or this is not the best performing application, so is there a better way to do the searching and fetching the duplicate values from a Map or as a matter of fact any collection. And a better way to traverse through a collection.

public class SearchDuplicates{
    public static void main(String[] args) {
        Map<Integer, String> directory=new HashMap<Integer, String>();
        Map<Integer, String> repeatedEntries=new HashMap<Integer, String>();

        // adding data
        directory.put(1,"john");
        directory.put(2,"michael");
        directory.put(3,"mike");
        directory.put(4,"anna");
        directory.put(5,"julie");
        directory.put(6,"simon");
        directory.put(7,"tim");
        directory.put(8,"ashley");
        directory.put(9,"john");
        directory.put(10,"michael");
        directory.put(11,"mike");
        directory.put(12,"anna");
        directory.put(13,"julie");
        directory.put(14,"simon");
        directory.put(15,"tim");
        directory.put(16,"ashley");

        for(int i=1;i<=directory.size();i++) {
           String result=directory.get(i);
           for(int j=1;j<=directory.size();j++) {
              if(j!=i && result==directory.get(j) &&j<i) {
                 repeatedEntries.put(j, result);
              }
           }
           System.out.println(result);
        }
        for(Entry<Integer, String> entry : repeatedEntries.entrySet()) {
           System.out.println("repeated "+entry.getValue());   
        }
   }
}

Any help would be appreciated. Thanks in advance

Olimpiu POP
  • 5,001
  • 4
  • 34
  • 49
Nick Div
  • 5,338
  • 12
  • 65
  • 127
  • 2
    not related to your question, but you're comparing String with == operator in this line `if(j!=i && result==directory.get(j) &&j – Leo Feb 14 '14 at 07:20
  • possible duplicate [here](http://stackoverflow.com/questions/7414667/identify-duplicates-in-a-list) – DarkHorse Feb 14 '14 at 07:24
  • @Leo Oh yes, thanks, completely forgot. – Nick Div Feb 14 '14 at 07:33
  • If you don't want duplicates in your collection at all, use `java.util.Set`. – Smutje Feb 14 '14 at 07:12
  • That's not what OP asked. The problem is to _find_ the duplicate entries, not avoid them. – Ted Hopp Feb 14 '14 at 07:13
  • Sorry, did not read properly - iterate the `values()` of the Map, copy the `values()`, remove the current iteration object from the `values()`-copy and check whether it is still, `contains()` in it. – Smutje Feb 14 '14 at 07:17

4 Answers4

6

You can use a Set to determine whether entries are duplicate. Also, repeatedEntries might as well be a Set, since the keys are meaningless:

Map<Integer, String> directory=new HashMap<Integer, String>();
Set<String> repeatedEntries=new HashSet<String>();
Set<String> seen = new HashSet<String>();

// ... initialize directory, then:

for(int j=1;j<=directory.size();j++){
    String val = directory.get(j);
    if (!seen.add(val)) {
        // if add failed, then val was already seen
        repeatedEntries.add(val);
    }
}

At the cost of extra memory, this does the job in linear time (instead of quadratic time of your current algorithm).

EDIT: Here's a version of the loop that doesn't rely on the keys being consecutive integers starting at 1:

for (String val : directory.values()) {
    if (!seen.add(val)) {
        // if add failed, then val was already seen
        repeatedEntries.add(val);
    }
}

That will detect duplicate values for any Map, regardless of the keys.

Ted Hopp
  • 232,168
  • 48
  • 399
  • 521
  • Excellent thank you. I didn't know adding a value to a set returns a boolean. Got to learn something new. – Nick Div Feb 14 '14 at 07:22
  • A question about this. The add method of the set compare the elements first with its hash. If it is equals compare the object. This is not a loop? It is a quadratic time algorithm too right? – Reik Val Feb 14 '14 at 07:29
  • @ReikVal - That's how a sane `HashSet` implementation should work internally (except for `null` values, if the `Set` implementation supports them). However, the only part that's specified in [the docs](http://docs.oracle.com/javase/7/docs/api/java/util/Set.html#add%28E%29) is that values are compared using `equals()`. – Ted Hopp Feb 14 '14 at 07:34
  • @ReikVal - From [the docs for `HashSet`](http://docs.oracle.com/javase/7/docs/api/java/util/HashSet.html): _"This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets."_ So no, it is not a quadratic time algorithm. The body of the outer loop executes in constant time (up to a factor of 2), so it's a linear time algorithm. – Ted Hopp Feb 14 '14 at 09:04
1

You can use this to found word count

    Map<String, Integer> repeatedEntries = new HashMap<String, Integer>();
    for (String w : directory.values()) {
        Integer n = repeatedEntries.get(w);
        n = (n == null) ? 1 : ++n;
        repeatedEntries.put(w, n);
    }

and this to print the stats

    for (Entry<String, Integer> e : repeatedEntries.entrySet()) {
        System.out.println(e);
    }
Evgeniy Dorofeev
  • 133,369
  • 30
  • 199
  • 275
  • Your answer seems to work, could you please elaborate what exactly "n = (n == null) ? 1 : ++n;" this does? – Nick Div Feb 14 '14 at 07:21
  • @kapilchhattani - It assigns 1 to `n` if `n` is `null`; otherwise it assigns `the value of ++n` to `n`. The effect is to treat a new key `w` as having an initial count of 0. – Ted Hopp Feb 14 '14 at 07:23
  • @TedHopp I don't exactly know if this would work according to what I am looking for but I will give this a try. By the way your code works great. – Nick Div Feb 14 '14 at 07:31
1

List, Vector have a method contains(Object o) which return Boolean value based either this object is exist in collection or not.

Sabbir
  • 126
  • 7
1

You can use Collection.frequency to find all possible duplicates in any collection using

Collections.frequency(list, "a")

Here is a proper example

Most generic method to find

Set<String> uniqueSet = new HashSet<String>(list);
    for (String temp : uniqueSet) {
        System.out.println(temp + ": " + Collections.frequency(list, temp));
    }

References from above link itself

DarkHorse
  • 2,740
  • 19
  • 28