117

I have two lists (not Java lists; think of them as two columns).

For example:

**List 1**            **List 2**
  milan                 hafil
  dingo                 iga
  iga                   dingo
  elpha                 binga
  hafil                 mike
  meat                  dingo
  milan
  elpha
  meat
  iga                   
  neeta.peeta    

I'd like a method that returns how many elements are the same. For this example it should be 3. It should also return the values that are common to both lists and the values that differ.

Should I use a HashMap? If yes, which method would give me this result?

Please help.

P.S: It is not a school assignment :) So if you just guide me it will be enough

user238384
  • 2,396
  • 10
  • 35
  • 36

11 Answers

179

EDIT

Here are two versions: one using ArrayList and the other using HashSet.

Compare them and create your own version from this, until you get what you need.

This should be enough to cover the "P.S: It is not a school assignment :) So if you just guide me it will be enough" part of your question.

Continuing with the original answer:

You may use a java.util.Collection and/or java.util.ArrayList for that.

The retainAll method does the following:

"Retains only the elements in this collection that are contained in the specified collection"

See this sample:

import java.util.Collection;
import java.util.ArrayList;
import java.util.Arrays;

public class Repeated {
    public static void main( String  [] args ) {
        Collection<String> listOne = new ArrayList<String>(Arrays.asList("milan","dingo", "elpha", "hafil", "meat", "iga", "neeta.peeta"));
        Collection<String> listTwo = new ArrayList<String>(Arrays.asList("hafil", "iga", "binga", "mike", "dingo"));

        listOne.retainAll( listTwo ); // keeps only the elements also present in listTwo
        System.out.println( listOne ); // prints [dingo, hafil, iga]
    }
}

EDIT

For the second part (the similar and the different values) you may use the removeAll method:

"Removes all of this collection's elements that are also contained in the specified collection."

This second version also gives you the similar values and handles repeated values (by discarding them).

This time the Collection could be a Set instead of a List (the difference is that a Set doesn't allow repeated values).

import java.util.Collection;
import java.util.HashSet;
import java.util.Arrays;

class Repeated {
      public static void main( String  [] args ) {

          Collection<String> listOne = Arrays.asList("milan","iga",
                                                    "dingo","iga",
                                                    "elpha","iga",
                                                    "hafil","iga",
                                                    "meat","iga", 
                                                    "neeta.peeta","iga");

          Collection<String> listTwo = Arrays.asList("hafil",
                                                     "iga",
                                                     "binga", 
                                                     "mike", 
                                                     "dingo","dingo","dingo");

          Collection<String> similar = new HashSet<String>( listOne );
          Collection<String> different = new HashSet<String>();
          different.addAll( listOne );
          different.addAll( listTwo );

          similar.retainAll( listTwo );
          different.removeAll( similar );

          System.out.printf("One:%s%nTwo:%s%nSimilar:%s%nDifferent:%s%n", listOne, listTwo, similar, different);
      }
}

Output:

$ java Repeated
One:[milan, iga, dingo, iga, elpha, iga, hafil, iga, meat, iga, neeta.peeta, iga]
Two:[hafil, iga, binga, mike, dingo, dingo, dingo]
Similar:[dingo, iga, hafil]
Different:[mike, binga, milan, meat, elpha, neeta.peeta]

If it doesn't do exactly what you need, it gives you a good start so you can take it from here.

Question for the reader: How would you include all the repeated values?

OscarRyz
  • 196,001
  • 113
  • 385
  • 569
  • @Oscar, My exact thought, but I was not sure if we could have modified the contents of `listOne`, but +1 anyways! – Anthony Forloney May 04 '10 at 00:53
  • @polygenelubricants what do you mean by *raw types* not generics? Why not? – OscarRyz May 04 '10 at 00:58
  • Oscar, did you see my updated question? Does it support repeated values? – user238384 May 04 '10 at 00:59
  • @Oscar: http://java.sun.com/docs/books/jls/third_edition/html/typesValues.html#4.8 "The use of raw types in code written after the introduction of genericity into the Java programming language is strongly discouraged. It is possible that future versions of the Java programming language will disallow the use of raw types." – polygenelubricants May 04 '10 at 01:04
  • @agazerboy, just now, I'm updating the answer to also include the "similar" values – OscarRyz May 04 '10 at 01:04
  • @polygenelubricants answer updated to handle duplicates and raw types. BTW, the *..future version of Java...* is never going to happen. ;) – OscarRyz May 04 '10 at 01:13
  • Hi Oscar, thanks for your help. I think your first version works well. In your second version, if there are repeated values it skips them and shows the result as 3 similar values, while iga appears 3 times in list one. So the total should be 5 or more for similar values. I hope my point is clear, what do you say? – user238384 May 04 '10 at 01:14
  • So, do you mean that repeated values should appear? iga is repeated 6 times in the first list, should it show those 6 times? That's easy, just replace `HashSet` with `ArrayList` and you're done. BTW, **I think that with this help you can figure out the rest.** – OscarRyz May 04 '10 at 01:15
  • With that change ( s/HashSet/ArrayList ) you should get this output: http://pastebin.com/kxTWNeUx – OscarRyz May 04 '10 at 01:22
  • Oscar, thanks for your help. It is a very good start for solving a BIG PROBLEM :) – user238384 May 04 '10 at 01:31
44

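You can try the intersection() and subtract() methods from Apache Commons Collections' CollectionUtils.

The intersection() method gives you a collection containing the common elements, and the subtract() method gives you the elements of one collection that are left after removing those of the other (the uncommon ones).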

They should also take care of repeated elements.
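
If adding Apache Commons Collections as a dependency is not an option, the duplicate-aware behaviour can be approximated in plain Java. The following is only a sketch and not the library's code: the class `MultisetOps` and its helpers `counts`, `intersection` and `subtract` are hypothetical names, and it assumes that a common element should appear as many times as it can be paired up between the two lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MultisetOps {

    // Counts how often each value occurs in the list.
    static Map<String, Integer> counts(List<String> list) {
        Map<String, Integer> result = new HashMap<>();
        for (String s : list) {
            result.merge(s, 1, Integer::sum);
        }
        return result;
    }

    // Elements of a that can be paired with an equal element of b (duplicates kept).
    static List<String> intersection(List<String> a, List<String> b) {
        Map<String, Integer> remaining = counts(b);
        List<String> common = new ArrayList<>();
        for (String s : a) {
            Integer left = remaining.get(s);
            if (left != null && left > 0) {
                common.add(s);
                remaining.put(s, left - 1);
            }
        }
        return common;
    }

    // Elements of a left over once each element of b has cancelled one equal element of a.
    static List<String> subtract(List<String> a, List<String> b) {
        Map<String, Integer> remaining = counts(b);
        List<String> rest = new ArrayList<>();
        for (String s : a) {
            Integer left = remaining.get(s);
            if (left != null && left > 0) {
                remaining.put(s, left - 1);
            } else {
                rest.add(s);
            }
        }
        return rest;
    }

    public static void main(String[] args) {
        List<String> one = Arrays.asList("milan", "iga", "dingo", "iga", "hafil");
        List<String> two = Arrays.asList("hafil", "iga", "binga", "dingo", "dingo");

        System.out.println(intersection(one, two)); // prints [iga, dingo, hafil]
        System.out.println(subtract(one, two));     // prints [milan, iga]
        System.out.println(subtract(two, one));     // prints [binga, dingo]
    }
}
```

The input lists are never modified, and each helper makes one pass over each list, so the expected cost is linear in the combined size.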

Holger
  • 285,553
  • 42
  • 434
  • 765
Mihir Mathuria
  • 6,479
  • 1
  • 22
  • 15
15

If you are looking for a handy way to test the equality of two collections, you can use org.apache.commons.collections.CollectionUtils.isEqualCollection, which compares two collections regardless of the ordering.

snowfox
  • 1,978
  • 1
  • 21
  • 21
12

Of all the approaches, I find org.apache.commons.collections.CollectionUtils#isEqualCollection the best. Here are the reasons:

  • I don't have to declare any additional list/set myself
  • I am not mutating the input lists
  • It's very efficient. It checks equality in O(N) time.

If it's not possible to have apache.commons.collections as a dependency, I would recommend implementing the algorithm it follows to check the equality of the lists, because of its efficiency.
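
One way to get an order-insensitive comparison in expected linear time without the dependency is to count occurrences in a hash map. This is a rough sketch of that counting idea, not the library's source; the class `UnorderedEquals` and the method `sameElements` are hypothetical names, and it assumes the elements have consistent equals/hashCode implementations.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UnorderedEquals {

    // True when both lists hold the same elements the same number of times, in any order.
    static <T> boolean sameElements(List<T> a, List<T> b) {
        if (a.size() != b.size()) {
            return false;
        }
        // Count every element of a.
        Map<T, Integer> counts = new HashMap<>();
        for (T item : a) {
            counts.merge(item, 1, Integer::sum);
        }
        // Cancel one count for every element of b; a missing count means a mismatch.
        for (T item : b) {
            Integer left = counts.get(item);
            if (left == null) {
                return false;
            }
            if (left == 1) {
                counts.remove(item);
            } else {
                counts.put(item, left - 1);
            }
        }
        return counts.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(sameElements(
                Arrays.asList("one", "two", "six"),
                Arrays.asList("six", "one", "two"))); // prints true
        System.out.println(sameElements(
                Arrays.asList("a", "a", "b"),
                Arrays.asList("a", "b", "b")));       // prints false
    }
}
```

Unlike a check based only on containsAll and size, this also tells apart lists such as [a, a, b] and [a, b, b], and it leaves both inputs untouched.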

shakhawat
  • 2,639
  • 1
  • 20
  • 36
11

Are these really lists (ordered, with duplicates), or are they sets (unordered, no duplicates)?

Because if it's the latter, then you can use, say, a java.util.HashSet<E> and do this in expected linear time using the convenient retainAll.

    List<String> list1 = Arrays.asList(
        "milan", "milan", "iga", "dingo", "milan"
    );
    List<String> list2 = Arrays.asList(
        "hafil", "milan", "dingo", "meat"
    );

    // intersection as set
    Set<String> intersect = new HashSet<String>(list1);
    intersect.retainAll(list2);
    System.out.println(intersect.size()); // prints "2"
    System.out.println(intersect); // prints "[milan, dingo]"

    // intersection/union as list
    List<String> intersectList = new ArrayList<String>();
    intersectList.addAll(list1);
    intersectList.addAll(list2);
    intersectList.retainAll(intersect);
    System.out.println(intersectList);
    // prints "[milan, milan, dingo, milan, milan, dingo]"

    // original lists are structurally unmodified
    System.out.println(list1); // prints "[milan, milan, iga, dingo, milan]"
    System.out.println(list2); // prints "[hafil, milan, dingo, meat]"
polygenelubricants
  • 376,812
  • 128
  • 561
  • 623
  • Well, I really don't know which data structure it should be. It has duplicates. You can see the updated question now – user238384 May 04 '10 at 00:57
  • Will it remove the repeated values from the data set? Because I don't want to lose any value :( – user238384 May 04 '10 at 00:59
  • @agazerboy: I've tried to address both questions. Feel free to ask for more clarifications. – polygenelubricants May 04 '10 at 01:02
  • Thanks poly. I tried your program with duplicates; for example, I added "iga" twice to the first list, but it still returns 3 as the answer, while it should be 4 now because list 1 has 4 similar values. If I add one entry multiple times it should still work. What do you say? Any other data structure? – user238384 May 04 '10 at 01:07
7

Using Java 8's removeIf:

public int getSimilarItems(){
    List<String> one = Arrays.asList("milan", "dingo", "elpha", "hafil", "meat", "iga", "neeta.peeta");
    List<String> two = new ArrayList<>(Arrays.asList("hafil", "iga", "binga", "mike", "dingo")); //Cannot remove directly from array backed collection
    int initial = two.size();

    two.removeIf(one::contains);
    return initial - two.size();
}
Asanka Siriwardena
  • 871
  • 13
  • 18
6

Simple solution:

    List<String> list = new ArrayList<String>(Arrays.asList("a", "b", "d", "c"));
    List<String> list2 = new ArrayList<String>(Arrays.asList("b", "f", "c"));

    list.retainAll(list2);
    list2.removeAll(list);
    System.out.println("similar " + list);
    System.out.println("different " + list2);

Output:

similar [b, c]
different [f]
Vijay
  • 4,694
  • 1
  • 30
  • 38
1

I found a very basic example of list comparison at List Compare. This example verifies the size first and then checks whether each element of one list is present in the other.

Manoj Kumar
  • 111
  • 3
1

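Assuming hash1 and hash2 are maps (for example HashMap<String, Integer>) whose keys are the values of the two lists:

List< String > sames = new ArrayList< String >();
List< String > diffs = new ArrayList< String >();

for( String key : hash1.keySet() )
{
   if( hash2.containsKey( key ) ) 
   {
      sames.add( key );  // key occurs in both maps
   }
   else
   {
      diffs.add( key );  // key occurs only in hash1
   }
}

// sames.size() is the number of similar elements.
// Note: keys that occur only in hash2 are not collected by this loop.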
Stefan Kendall
  • 66,414
  • 68
  • 253
  • 406
  • He wants the list of identical keys, not how many keys are identical. I think. – Rosdi Kasim May 04 '10 at 00:42
  • Thanks stefan for your help. Yeah Rosdi is correct and you as well. I need total number of similar values and similar values as well. – user238384 May 04 '10 at 00:44
  • Use libraries; this method is very amateur. Look at this one: `List list = List.of(1,2,3,4,5); List target = List.of(1,2,3,40,50); List result = list.stream().filter(v -> !target.contains(v)).collect(Collectors.toList()); System.out.println(result);` It prints [4, 5], the items in the list not found in the target. – AAI INGENIERIA Apr 26 '23 at 19:05
0
public static boolean compareList(List<?> ls1, List<?> ls2){
    // same size and every element of ls2 present in ls1 (duplicate counts are not compared)
    return ls1.size() == ls2.size() && ls1.containsAll(ls2);
}

public static void main(String[] args) {

    ArrayList<String> one = new ArrayList<String>();
    one.add("one");
    one.add("two");
    one.add("six");

    ArrayList<String> two = new ArrayList<String>();
    two.add("one");
    two.add("six");
    two.add("two");

    System.out.println("Output1 :: " + compareList(one, two));

    two.add("ten");

    System.out.println("Output2 :: " + compareList(one, two));
  }
0
protected <T> boolean equals(List<T> list1, List<T> list2) {
  
        if (list1 == list2) {
            return true;
        }
 
        if (list1 == null || list2 == null || list1.size() != list2.size()) {
            return false;
        }
        // to prevent wrong results on {a,a,a} and {a,b,c},
        // check list1 against list2 and then list2 against list1
        return list1.stream().allMatch(val -> list2.contains(val))
                && list2.stream().allMatch(val -> list1.contains(val));
    }