
I have a list of substrings and a list of strings. I would like to find which substrings occur in each string, and then build a new list with one entry per string, where each entry contains all the substrings found in that string.

For example, let's say I have these:

substrings = ["word","test"]

strings = ["word string one", "string two test", "word and test", "no matches in this string"]

I have created the following to match the substrings with the string:

for s in strings:
    for k in substrings:
        if k in s:
            print(k)

This gives the following output:

word
test
word
test 

I have also tried the following:

matches = [x for string in strings for x in string.split() if x in substrings]
print(matches)

Output:

['word', 'test', 'word', 'test']

Neither of these results is what I am after. Since both "word" and "test" occur in the third string, I would like to get something similar to either of the following outputs:

word
test
word, test 

or

['word', 'test', 'word test']
G.pyth

2 Answers


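For the first example, you just have to print each match without a trailing newline, and then print a single newline at the end of each iteration of the outer loop (that is, once the inner loop over the substrings has finished).

For how to print without a newline, see the question "How to print without newline or space?"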
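
As a rough sketch of that idea (assuming Python 3, where `print` accepts an `end` argument; the function name `print_matches` and the `found` flag are my own additions, used so that strings with no match do not produce an empty line):

```python
substrings = ["word", "test"]
strings = ["word string one", "string two test",
           "word and test", "no matches in this string"]


def print_matches(strings, substrings):
    for s in strings:
        found = False
        for k in substrings:
            if k in s:
                # end=" " replaces the default newline with a space,
                # so matches for the same string stay on one line
                print(k, end=" ")
                found = True
        # Finish the line only if something was printed for this string
        if found:
            print()


print_matches(strings, substrings)
```

Note that each printed line ends with a trailing space, because every match is followed by `end=" "`. If that matters, collecting the matches in a list and joining them avoids it.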

Charlestone

Your code isn't giving you the result you want because you are not keeping multiple matches together in their own list.

The easiest way of achieving what you are looking for is to keep another list inside the outer loop that collects the substrings matching the current string.

substrings = ["word", "test"]

strings = ["word string one",
           "string two test",
           "word and test",
           "no matches in this string"]

result = []

for string in strings:
    # Collect every substring that occurs in the current string
    matches = []
    for substring in substrings:
        if substring in string:
            matches.append(substring)
    # Skip strings that contain none of the substrings
    if matches:
        result.append(matches)

After the loop, result should contain:

[['word'], ['test'], ['word', 'test']]

If you want to actually print these in the format you stated in your question, simply change

result.append(matches)

to

print(' '.join(matches))

This will give you:

word
test
word test
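
If you would rather end up with the flat list of strings shown in the question, the same loop can be wrapped in a function that joins each group as it goes. This is only a sketch: the function name `collect_matches` and the `sep` parameter are hypothetical names of mine, not something from the question.

```python
def collect_matches(strings, substrings, sep=" "):
    """Return one joined entry per string that contains at least one substring."""
    result = []
    for string in strings:
        # All substrings that occur somewhere in the current string
        matches = [sub for sub in substrings if sub in string]
        # Strings without any match contribute nothing
        if matches:
            result.append(sep.join(matches))
    return result


substrings = ["word", "test"]
strings = ["word string one", "string two test",
           "word and test", "no matches in this string"]

print(collect_matches(strings, substrings))            # ['word', 'test', 'word test']
print(collect_matches(strings, substrings, sep=", "))  # ['word', 'test', 'word, test']
```

Passing `sep=", "` gives the comma-separated form from the first desired output in the question.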
JamoBox
    I suppose you could also convert it to a list-comp such as `result = [res for res in ([w for w in substrings if w in words] for words in strings) if res]` if you really wanted to... – Jon Clements May 21 '17 at 10:41
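
Written out with the variable names from the question, the comprehension in that comment could look like the sketch below. The loop variable names `sub`, `s` and `found` are my own choice.

```python
substrings = ["word", "test"]
strings = ["word string one", "string two test",
           "word and test", "no matches in this string"]

# Inner generator: one list of matching substrings per string.
# Outer comprehension: drop the empty lists (strings with no match).
result = [
    found
    for found in ([sub for sub in substrings if sub in s] for s in strings)
    if found
]

print(result)  # [['word'], ['test'], ['word', 'test']]
```

It produces the same nested list as the explicit loop, so which one to use is mostly a question of readability.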