0

There are two strings:

   str1 = "black_red_yellow"

   str2 = "blue_red_green"

Which Python library can I use to check whether these two strings have the substring "_red_" in common? Thank you in advance.

cnherald
  • 1
    Does this need to work on arbitrary things or is str1 always a substring of str2? – supakeen Mar 16 '12 at 09:24
  • 7
    possible duplicate: http://stackoverflow.com/q/2892931/1025391 – moooeeeep Mar 16 '12 at 09:26
  • In this simple case, `str1 in str2` will evaluate to `True` – John La Rooy Mar 16 '12 at 09:26
  • is this section always delimited by "_" characters as in your example? – LucasB Mar 16 '12 at 09:27
  • 1
    hi all, sorry just updated the question, "_" is not necessary always in the string, can be any characters. The question is how to check if: str1 has something that can be found in str2. thanks. – cnherald Mar 16 '12 at 09:30
  • Are you looking for the longest common substring? Or the first common substring longer than 1 char? Or common words? The last is really easy... Or perhaps a list of substrings such that each `sub in s1 and sub in s2`? – Dima Tisnek Mar 16 '12 at 10:06
  • Yes qarma, you are right, I was actually looking for the longest common substring, so a check that finds that both strings have the substring "_red_" in common should return "true". moooeeeep's comment might help with my question. Sorry all for the ambiguity in the question. – cnherald Mar 16 '12 at 11:57
  • For the sake of simplicity, I can check only the substring (of any length) that follows a specific marker, for example "id": str1 = " black_id_red_yellow", str2 = "blue_id_red_green". Because "red" is found after "id_" in both, the algorithm should return "true". – cnherald Mar 16 '12 at 12:14

4 Answers

5

Something like this should work if you don't know the actual string you're searching for:

import difflib

str1 = "black_red_yellow"
str2 = "blue_red_green"

difference = difflib.SequenceMatcher()

difference.set_seqs(str1, str2)

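# Each match is a (start in str1, start in str2, length) triple;
# the last block is always a zero-length sentinel.
for match in difference.get_matching_blocks():
    print(str1[match[0]:match[0] + match[2]])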
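
Since the asker clarified in the comments that the longest common substring is what matters, the same `difflib.SequenceMatcher` object can also return just that one block through `find_longest_match`. The following is only a sketch of that idea: the helper name `longest_common_substring` is made up for illustration, and `autojunk=False` is passed on the assumption that you do not want difflib's heuristic for long sequences to skip frequent characters.

```python
import difflib


def longest_common_substring(a, b):
    """Return the longest block shared by a and b ("" if there is none)."""
    # autojunk=False turns off difflib's "popular element" heuristic.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    # Search the full range of both strings; the result has fields a, b, size.
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


str1 = "black_red_yellow"
str2 = "blue_red_green"

common = longest_common_substring(str1, str2)
print(common)             # _red_
print("_red_" in common)  # True
```

If the strings share nothing, the helper returns an empty string, so `bool(longest_common_substring(str1, str2))` can serve as the yes/no check.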
Joernerama
2
  1. Test for the presence of a common substring, including substrings of length 1:
    if set(str1).intersection(set(str2)): print("yes we can!")
Dima Tisnek
1

You can use difflib to compare strings in that way. However, if you know the string you're looking for, you could just do `'_red_' in str1 and '_red_' in str2`. If you don't know the string, are you looking for a match of a specific length? E.g. would 'red' match 'blue' because they both contain 'e'? The shortest, simplest way of checking for any match at all would be:

bool([a for a in str1 if a in str2])

Edit: Or, more efficiently,

any(a for a in str1 if a in str2)
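
If a single shared character should not count as a match, one possible extension is to compare fixed-length slices instead of characters. This is a rough sketch, not part of the answer above: the function name `share_substring` and its `min_len` parameter are invented for illustration.

```python
def share_substring(a, b, min_len=1):
    """Return True if a and b share a substring of at least min_len characters."""
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # Collect every slice of length min_len that occurs in a.
    chunks_a = {a[i:i + min_len] for i in range(len(a) - min_len + 1)}
    # A longer common substring always contains a common slice of length min_len,
    # so checking slices of exactly that length is enough.
    return any(b[i:i + min_len] in chunks_a for i in range(len(b) - min_len + 1))


print(share_substring("red", "blue"))      # True, both contain "e"
print(share_substring("red", "blue", 2))   # False, no shared pair of characters
print(share_substring("black_red_yellow", "blue_red_green", 5))  # True, "_red_"
```

With `min_len=1` this gives the same answer as the `any(...)` expression above.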
aquavitae
1

If you can't find anything else, then there's at least this naive implementation:

str1 = "black_red_yellow"
str2 = "blue_red_green"

if len(str1) < len(str2):
    min_str = str1
    max_str = str2
else:
    min_str = str2
    max_str = str1

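matches = []
min_len = len(min_str)
# Try every slice of the shorter string and keep those that also occur in the longer one.
for b in range(min_len):
    for e in range(min_len, b, -1):
        chunk = min_str[b:e]
        if chunk in max_str:
            matches.append(chunk)

# max() raises ValueError if matches is empty, i.e. the strings share no characters.
print(max(matches, key=len))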

prints _red_
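
The naive version tests every slice of the shorter string, which gets slow for long inputs. A common alternative is the textbook dynamic-programming approach to the longest common substring, sketched below. It is a swapped-in technique rather than a rewrite of the code above, and the function name `longest_common_substring_dp` is made up for illustration.

```python
def longest_common_substring_dp(a, b):
    """Longest common substring via dynamic programming, O(len(a) * len(b)) time."""
    best_len = 0
    best_end = 0  # exclusive end index of the best match in a
    # prev[j] = length of the common suffix of a[:i-1] and b[:j]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common suffix that ended one character earlier.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


print(longest_common_substring_dp("black_red_yellow", "blue_red_green"))  # _red_
```

Unlike the `max(matches, key=len)` call, this returns an empty string instead of raising when the two strings have nothing in common.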

stepank