How to check If a string contains a part of another string? (not the whole string)

Question

I have 2 strings:

1) John has 2 apples.

2) Cody plays xbox in John's basement.

Now these 2 strings have "John" in common

But there seems to be no programmatic way to check this. The closest I could get is how to check if a string contains a specific word: str1.toLowerCase().contains(str2.toLowerCase())

So how do I check If String 1) contains part of String 2)?

you need to more specific, note that both strings contain an `" "` as well, or a `"o"` and `"p"`, etc. — luk2302, Apr 04 '18 at 20:10
@luk2302 I never thought about that and that Does make sense. Maybe this is the reason why this method doesn't exist — user9598926, Apr 04 '18 at 20:11
@luk2302 can I separate the words using special characters and then check IF those words are contained in string 1? — user9598926, Apr 04 '18 at 20:15
`str1.toLowerCase().contains(str2.toLowerCase()) && ! str1.equalsIgnoreCase(str2)` — Robert, Apr 04 '18 at 20:15
I'm pretty sure there's no efficient way of doing this (without something that's O(n^2) complexity. As you could imagine, you'd probably have to make a matrix of the two (char by char) and do some checking for double letters in diagonals. EDIT never-mind, @Pablo says it can be done in O(m + n). — Alexander Kleinhans, Apr 04 '18 at 20:17
@Pablo Is that a duplicate? The other question is about common characters. This question is about common parts which is a different thing. — lexicore, Apr 04 '18 at 20:24
@user9598926 You have to be specific. Are you looking for the longest common substring? — lexicore, Apr 04 '18 at 20:25
@1N5818 O(m+n) is probably for a different task. As for this I think tries can help a lot. — lexicore, Apr 04 '18 at 20:26
@user9598926, This is a nice question but you have to define what do you mean by "part". — Mehdi, Apr 04 '18 at 20:30

Mohammad · Answer 1 · 2018-04-04T21:52:35.457

1

This could work

public static void main(String[] args) {
    String x = "John has 2 apples.";
    String y = "Cody plays xbox in John's basement.";
    // print words of x that matches any of y
    findMatch(Arrays.asList(x.split(" ")), y);
    // print words of y that matches any of x
    findMatch(Arrays.asList(y.split(" ")), x);

}

private static void findMatch(List<String> firstArr,String statement) {
    for (String string : firstArr) {
        if(statement.contains(string)) {
            System.out.println(string);
        }
    }
}

edited Apr 04 '18 at 21:52

answered Apr 04 '18 at 20:22

Mohammad

739
7
22

Note that in the reverse order : search any word of `y` in `x` will not work as `"John's"` is not contained in `"John has 2 apples."` – davidxxx Apr 04 '18 at 20:31
the op clarified that he wants this " how do I check If String 1) contains part of String 2)? " @davidxxx – Mohammad Apr 04 '18 at 20:34
It is an example. A code should work in multiple "normal" cases otherwise it is hardcoded, which makes no sense. So why not just output "John" ? – davidxxx Apr 04 '18 at 20:36
can both strings be split into words and checked if string 2 contains words from string 1? – user9598926 Apr 04 '18 at 21:05
thank you @davidxxx for the explanation , I've edited my code for both cases , if there is another approach to do this could you add it to answers here ? – Mohammad Apr 04 '18 at 21:54

score 0 · Answer 2 · answered Apr 04 '18 at 20:28

If you're talking about two strings, you would have to create a matrix of each with respect to eachother, i.e.

1) "abcd"

2) "cdef"

Matrix)

ac bc cc dc
ad bd cd dd
ae be ce de
af bc cf df

Then check for doubles to show up in diagnal patterns, (i.e., cc and dd ) in the above example.

This means, you will have to iterate through each string for each char of the other string, so I believe this would give you O(n^2) time complexity. For each diagonal match up, that would be a token that matches (and there could be more than one). This is not the same thing as longest common substring, as @lexicore said.

If they are really large strings, you probably wouldn't want to go through each char, but instead tokenize them (i.e. by splitting them by spaces) and creating a sorted list for each (or hash table or something) so you could itterate through each one in O(log n)ish time. I think that would give you something like O((log n)^2), but at least better than O(n^2).

Pablo · Answer 3 · 2019-07-23T08:58:32.273

Based on https://stackoverflow.com/a/4448435/3790546:

private static List<String> interection(String s1, String s2) {
    HashSet<String> h1, h2;
    h1 = toHashSet(s1);
    h2 = toHashSet(s2);
    h1.retainAll(h2);
    return h1.stream().collect(toList());
}

private static HashSet<String> toHashSet(String s) {
    return Arrays.stream(s.split("[\\s@&.'?$+-]+")).collect(Collectors.toCollection(HashSet::new));
}

public static void main (String [] args) {
    interection("John has 2 apples.", "Cody plays xbox in John's basement.").forEach(System.out::println);
}

How to check If a string contains a part of another string? (not the whole string)

3 Answers3