Intersection of two strings in Java

Question

Need a Java function to find intersection of two strings. i.e. characters common to the strings.

Example:

String s1 = new String("Sychelless");
String s2 = new String("Sydney");

score 25 · Accepted Answer · answered Dec 15 '10 at 09:29

25

Using HashSet<Character>:

HashSet<Character> h1 = new HashSet<Character>(), h2 = new HashSet<Character>();
for(int i = 0; i < s1.length(); i++)                                            
{
  h1.add(s1.charAt(i));
}
for(int i = 0; i < s2.length(); i++)
{
  h2.add(s2.charAt(i));
}
h1.retainAll(h2);
Character[] res = h1.toArray(new Character[0]);

This is O(m + n), which is asymptotically optimal.

answered Dec 15 '10 at 09:29

Matthew Flaschen

278,309
50
514
539

what is O(m+n) and how do u arrive at that complexity and what do mean by asymptotically optimal – Deepak Dec 15 '10 at 09:34
13

`m` and `n` are the lengths of the strings. Adding, removing, and checking a HashSet item are all O(1). So it's basically m (add) + n (add) + m (one check for every element in h1) + m (up to m removals) + m (converting to character array). So that's 4m + n, which is O(m + n) (the constant drops out). O is Big O. Asymptotically optimal means that there is no algorithm with a smaller big O. – Matthew Flaschen Dec 15 '10 at 09:37
2

The "O(m + n)" stuff is used for analyzing the time and space requirements of algorithms. See: http://en.wikipedia.org/wiki/Big_O_notation – Jesper Dec 15 '10 at 13:46
@MonaJalal, it's just required so that toArray returns an array of type Character. This is a consequence of Java's somewhat incomplete generic system (it was added on later, so they couldn't do everything a new system like C# could). See https://docs.oracle.com/javase/7/docs/api/java/util/AbstractCollection.html#toArray%28T[]%29 . – Matthew Flaschen Oct 24 '15 at 23:05
1

This approach of set fails for following input "aa","aa" . – pravin kottawar May 12 '19 at 05:00

score 8 · Answer 2 · answered Dec 15 '10 at 09:29

8

Extract the characters

String.toCharArray

Put them in a Set Find the intersection

Set.retainAll

answered Dec 15 '10 at 09:29

saugata

2,823
1
27
39

score 7 · Answer 3 · edited Aug 21 '12 at 18:41

7

Most basic approach:

String wordA = "Sychelless";  
String wordB = "Sydney";  
String common = "";  

for(int i=0;i<wordA.length();i++){  
    for(int j=0;j<wordB.length();j++){  
        if(wordA.charAt(i)==wordB.charAt(j)){  
            common += wordA.charAt(i)+" ";  
            break;
        }  
    }  
}  
System.out.println("common is: "+common);

edited Aug 21 '12 at 18:41

abatishchev

98,240
88
296
433

answered Dec 15 '10 at 09:23

jmj

237,923
42
401
438

1

Output from this is `"common is: S y e e "`. I would not expect a correct solution to include duplicates (at least not if both words don't have the letters multiple times). – Rasmus Faber Dec 15 '10 at 09:39
2

This is `O(m * n)` because of the nested loop. Separately, you can easily make the String creation more efficient with a `StringBuilder`, since you know the upper bound of the size ahead of time. Rasmus is also right that the intersection should be a set (no duplicates). – Matthew Flaschen Dec 15 '10 at 09:39
The duplication can be fixed very easy adding a condition one line before appending to the "common" String like: if(!common.contains(wordA.charAt(i)+"")) – Abel Morelos Mar 12 '12 at 17:59

score 3 · Answer 4 · answered Dec 15 '10 at 09:43

More detail on saugata's response (appeared while I was writing this): -

public static void main(String[] args) {
    String s1 = "Seychelles";
    String s2 = "Sydney";
    Set<Character> ss1 = toSet(s1);
    ss1.retainAll(toSet(s2));
    System.out.println(ss1);
}

public static Set<Character> toSet(String s) {
    Set<Character> ss = new HashSet<Character>(s.length());
    for (char c : s.toCharArray())
        ss.add(Character.valueOf(c));
    return ss;
}

score 2 · Answer 5 · answered Dec 15 '10 at 09:23

2

I think the algorithm you are looking for is the problem of the longest common subsequence

answered Dec 15 '10 at 09:23

Armen Tsirunyan

130,161
59
324
434

the solution is posted by above members – Deepak Dec 15 '10 at 09:32

Sergii Shevchyk · Answer 6 · 2013-06-27T09:13:21.570

1

By means of Guava this task seems much easier:

String s1 = new String("Sychelless");
String s2 = new String("Sydney");
Set<String> setA = Sets.newHashSet(Splitter.fixedLength(1).split(s1));
Set<String> setB = Sets.newHashSet(Splitter.fixedLength(1).split(s2));
Sets.intersection(setA, setB);

edited Jun 27 '13 at 09:13

answered Mar 28 '13 at 15:36

Sergii Shevchyk

38,716
12
50
61

score 1 · Answer 7 · edited May 23 '17 at 11:46

1

Found same question here, refer this

Implementing an efficent algorithm to find the intersection of two strings

edited May 23 '17 at 11:46

Community

1
1

answered Dec 15 '10 at 09:23

Vicky

9,515
16
71
88

This is for c# though, but the algorithm is the same – Shervin Asgari Dec 15 '10 at 10:06

Bishal Jaiswal · Answer 8 · 2019-01-15T11:22:26.667

Optimized solution:

public static String twoStrings(String s1, String s2){

    HashSet<Character> stringOne =  new HashSet<Character>(), stringTwo = new HashSet<Character>();  
    int stringOneLength = s1.length();
    int stringTwoLength = s2.length();
    for(int i=0; i<stringOneLength || i<stringTwoLength; i++) {
        if(i < stringOneLength)
            stringOne.add(s1.charAt(i));
        if(i < stringTwoLength)
            stringTwo.add(s2.charAt(i));
    }
    stringOne.retainAll(stringTwo);

    return stringOne.toString();
}

Blasanka · Answer 9 · 2017-07-23T04:25:27.700

I have used TreeSet. And retainAll() in TreeSet to get matched elements.

Oracle Doc:
retainAll(Collection<?> c)
Retains only the elements in this set that are contained in the specified collection (optional operation).

String s1 = new String("Sychelless");
String s2 = new String("Sydney");

Set<Character> firstSet = new TreeSet<Character>();
for(int i = 0; i < s1.length(); i++) {
    firstSet.add(s1.charAt(i));
}

Set<Character> anotherSet = new TreeSet<Character>();
for(int i = 0; i < s2.length(); i++) {
    anotherSet.add(s2.charAt(i));
}

firstSet.retainAll(anotherSet);
System.out.println("Matched characters are " + firstSet.toString());//print common strings

//output > Matched characters are [S, e, y]

score -5 · Answer 10 · edited Aug 17 '12 at 11:30

-5

s1.contains(s2) returns true;
s1.indexOf(s2) returns 0. 
s1.indexOf("foo") returns -1

For more sophisticated cases use class Pattern.

edited Aug 17 '12 at 11:30

fglez

8,422
4
47
78

answered Dec 15 '10 at 09:23

AlexR

114,158
16
130
208

1

how it will give common chars – jmj Dec 15 '10 at 09:24
It gives substring and I am 99% sure that it is what he needs. – AlexR Dec 15 '10 at 09:29

Intersection of two strings in Java

10 Answers10

Linked

Related