General test
The answers offer several solutions, so I decided to measure which one is the fastest.
Solutions
- HashSet based by Óscar López
- Stream based by Bilesh Ganguly
- Foreach based by Ruchira Gayan Ranaweera
- HashMap based by ikarayel
What we have
- Two String arrays that have 50% of their elements in common.
- Every element within an array is unique, so there are no duplicates.
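The exact input arrays are not shown, so here is a minimal sketch of how data matching the description above could be built. The class `TestData`, the helper `makeArray`, and the `"item"` naming are my own assumptions, not the arrays actually used for the measurements.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class TestData {
    // Hypothetical helper: builds `size` unique strings, numbered from `offset`.
    static String[] makeArray(int size, int offset) {
        String[] result = new String[size];
        for (int i = 0; i < size; i++) {
            result[i] = "item" + (offset + i);
        }
        return result;
    }

    public static void main(String[] args) {
        int size = 20;
        String[] arr1 = makeArray(size, 0);        // item0 .. item19
        String[] arr2 = makeArray(size, size / 2); // item10 .. item29, half overlaps arr1

        // Sanity check: count how many elements the two arrays share.
        Set<String> common = new HashSet<>(Arrays.asList(arr1));
        common.retainAll(Arrays.asList(arr2));
        System.out.println(common.size() + " of " + size + " elements are shared");
    }
}
```

Changing `size` to 500 or 10 000 gives inputs of the same shape as the larger tests.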
Testing code
    public static void startTest(String name, Runnable test){
        long start = System.nanoTime();
        test.run();
        long end = System.nanoTime();
        // Convert nanoseconds to milliseconds
        System.out.println(name + ": " + (end - start) / 1000000. + " ms");
    }
With use:
    startTest("HashMap", () -> intersectHashMap(arr1, arr2));
    startTest("HashSet", () -> intersectHashSet(arr1, arr2));
    startTest("Foreach", () -> intersectForeach(arr1, arr2));
    startTest("Stream ", () -> intersectStream(arr1, arr2));
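Note that a single timed run also includes one-off costs such as class loading and JIT compilation, which can weigh heavily on small inputs. As a rough sketch (not the harness used for the numbers in this answer), each method could be run several times and the fastest run reported. The class `RepeatedTimer` and the method `bestOfMillis` are hypothetical names of mine.

```java
class RepeatedTimer {
    // Hypothetical variant of startTest: runs the task `runs` times
    // and returns the fastest single run in milliseconds.
    static double bestOfMillis(int runs, Runnable test) {
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            test.run();
            long elapsed = System.nanoTime() - start;
            best = Math.min(best, elapsed);
        }
        return best / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Placeholder workload; in the real test this would be e.g.
        // () -> intersectHashMap(arr1, arr2)
        Runnable work = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1_000; i++) {
                sb.append(i);
            }
        };
        System.out.println("Example: " + bestOfMillis(10, work) + " ms");
    }
}
```

For serious measurements a dedicated benchmarking tool such as JMH is usually a better choice than hand-written timing.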
Solutions code:
HashSet
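    // Needs: import java.util.Arrays; import java.util.HashSet;
    public static String[] intersectHashSet(String[] arr1, String[] arr2){
        HashSet<String> set = new HashSet<>(Arrays.asList(arr1));
        // Keep only the elements that are also present in arr2
        set.retainAll(Arrays.asList(arr2));
        return set.toArray(new String[0]);
    }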
Stream
    // Needs: import java.util.Arrays;
    public static String[] intersectStream(String[] arr1, String[] arr2){
        return Arrays.stream(arr1)
                .distinct()
                // contains() scans arr2 linearly for every element of arr1
                .filter(x -> Arrays.asList(arr2).contains(x))
                .toArray(String[]::new);
    }
Foreach
    // Needs: import java.util.ArrayList;
    public static String[] intersectForeach(String[] arr1, String[] arr2){
        ArrayList<String> result = new ArrayList<>();
        // Compare every element of arr1 with every element of arr2
        for(int i = 0; i < arr1.length; i++){
            for(int r = 0; r < arr2.length; r++){
                if(arr1[i].equals(arr2[r]))
                    result.add(arr1[i]);
            }
        }
        return result.toArray(new String[0]);
    }
HashMap
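    // Needs: import java.util.ArrayList; import java.util.HashMap;
    public static String[] intersectHashMap(String[] arr1, String[] arr2){
        // Index the elements of arr1; the value 1 is only a placeholder
        HashMap<String, Integer> map = new HashMap<>();
        for (int i = 0; i < arr1.length; i++)
            map.put(arr1[i], 1);
        // Collect the elements of arr2 that were seen in arr1
        ArrayList<String> result = new ArrayList<>();
        for(int i = 0; i < arr2.length; i++)
            if(map.containsKey(arr2[i]))
                result.add(arr2[i]);
        return result.toArray(new String[0]);
    }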
Testing process
Let's see what happens if we give the methods arrays of 20 elements:
HashMap: 0.105 ms
HashSet: 0.2185 ms
Foreach: 0.041 ms
Stream : 7.3629 ms
As we can see, the Foreach method is the fastest, while the Stream method is almost 180 times slower.
Let's continue the test with 500 elements:
HashMap: 0.7147 ms
HashSet: 4.882 ms
Foreach: 7.8314 ms
Stream : 10.6681 ms
In this case, the results have changed dramatically: the HashMap method is now the fastest.
Next test with 10 000 elements:
HashMap: 4.875 ms
HashSet: 316.2864 ms
Foreach: 505.6547 ms
Stream : 292.6572 ms
The HashMap method is still the fastest, and the Foreach method has become the slowest.
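A possible explanation for the HashSet numbers: as far as I can tell, `HashSet` inherits `retainAll` from `AbstractCollection`, which walks the set and calls `contains` on the argument. Since the argument above is a `List`, every check is a linear scan of arr2, much like in the Foreach and Stream methods. Below is a sketch of a variant that passes a `HashSet` instead. The class `HashSetLookup` and the method `intersectHashSetLookup` are hypothetical, and this variant was not part of the measurements above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class HashSetLookup {
    // Hypothetical variant, not one of the benchmarked methods.
    static String[] intersectHashSetLookup(String[] arr1, String[] arr2) {
        Set<String> set = new HashSet<>(Arrays.asList(arr1));
        // Passing a HashSet makes each contains() check a hash lookup
        // instead of a scan through a list.
        set.retainAll(new HashSet<>(Arrays.asList(arr2)));
        return set.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] arr1 = {"a", "b", "c", "d"};
        String[] arr2 = {"c", "d", "e", "f"};
        String[] common = intersectHashSetLookup(arr1, arr2);
        Arrays.sort(common); // HashSet does not guarantee an order
        System.out.println(Arrays.toString(common)); // prints [c, d]
    }
}
```

I would expect this to scale closer to the HashMap method, but that is an assumption until it is timed with the same harness.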
Results
If there are fewer than 50 elements, it is best to use the Foreach method, which is far ahead of the others in this category.
In this case, the ranking looks like this:
1. Foreach
2. HashMap
3. HashSet
4. Stream (better not to use in this case)
But if you need to process large amounts of data, the best option is the HashMap based method.
In that case, the ranking looks like this:
1. HashMap
2. HashSet
3. Stream
4. Foreach
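To put the two recommendations together, here is a small sketch that picks the method by input size. The class `IntersectionChooser`, the method `intersect`, and the constant `SMALL_INPUT_THRESHOLD` are names I made up for illustration. The cut-off of 50 is just the rule of thumb from above, not a precisely measured crossover point.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;

class IntersectionChooser {
    // Assumed cut-off, taken from the "fewer than 50 elements" rule of thumb.
    static final int SMALL_INPUT_THRESHOLD = 50;

    // Hypothetical dispatcher: Foreach for small inputs, HashMap otherwise.
    static String[] intersect(String[] arr1, String[] arr2) {
        if (arr1.length < SMALL_INPUT_THRESHOLD && arr2.length < SMALL_INPUT_THRESHOLD) {
            return intersectForeach(arr1, arr2);
        }
        return intersectHashMap(arr1, arr2);
    }

    // Same logic as the Foreach method from this answer.
    static String[] intersectForeach(String[] arr1, String[] arr2) {
        ArrayList<String> result = new ArrayList<>();
        for (int i = 0; i < arr1.length; i++) {
            for (int r = 0; r < arr2.length; r++) {
                if (arr1[i].equals(arr2[r])) {
                    result.add(arr1[i]);
                }
            }
        }
        return result.toArray(new String[0]);
    }

    // Same logic as the HashMap method from this answer.
    static String[] intersectHashMap(String[] arr1, String[] arr2) {
        HashMap<String, Integer> map = new HashMap<>();
        for (int i = 0; i < arr1.length; i++) {
            map.put(arr1[i], 1);
        }
        ArrayList<String> result = new ArrayList<>();
        for (int i = 0; i < arr2.length; i++) {
            if (map.containsKey(arr2[i])) {
                result.add(arr2[i]);
            }
        }
        return result.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] arr1 = {"a", "b", "c"};
        String[] arr2 = {"c", "b", "x"};
        // Small input, so the Foreach branch is taken (result follows arr1 order)
        System.out.println(Arrays.toString(intersect(arr1, arr2))); // prints [b, c]
    }
}
```

Keep in mind that the two branches return the common elements in a different order: Foreach follows arr1, HashMap follows arr2.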