
Suppose I have a list of lists.

List1=[["Red is my favorite color."],["Blue is her favorite."], ["She is really nice."]]

Now I want to check if the word 'is' exists after a certain set of words.

I made a word list: word_list=['Red', 'Blue']

Is there a way to check that using an if statement?

If I write if 'is' in sentences: it returns all three sentences in List1, but I want it to return only the first two.

Is there a way to check if the word 'is' is positioned exactly after the words in word_list? Thank you in advance.

halfer

3 Answers

3

NB. I assumed a match at the start of the string. For a match anywhere, use re.search instead of re.match.

You can use a regex:

import re

regex = re.compile(fr'\b({"|".join(map(re.escape, word_list))})\s+is\b')
# regex: \b(Red|Blue)\s+is\b

out = [[bool(regex.match(x)) for x in l]
       for l in List1]

Output: [[True], [True], [False]]

Used input:

List1 = [['Red is my favorite color.'],
         ['Blue is her favorite.'],
         ['She is really nice.']]

word_list = ['Red', 'Blue']

If you want the sentences:

out = [[x for x in l if regex.match(x)]
       for l in List1]

Output:

[['Red is my favorite color.'],
 ['Blue is her favorite.'],
 []]

Or as a flat list:

out = [x for l in List1 for x in l if regex.match(x)]

Output:

['Red is my favorite color.',
 'Blue is her favorite.']
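
The note at the top of this answer about re.match versus re.search can be illustrated with a minimal sketch. The second sentence, `middle`, is a made-up example that is not part of the question's data; it only exists to show a keyword that does not sit at the start of the string.

```python
import re

word_list = ['Red', 'Blue']
regex = re.compile(fr'\b({"|".join(map(re.escape, word_list))})\s+is\b')

start = 'Red is my favorite color.'
middle = 'I think Blue is her favorite.'  # hypothetical sentence, not from the question

# re.match only tries the pattern at position 0 of the string
match_results = [bool(regex.match(s)) for s in (start, middle)]

# re.search scans the string and succeeds at the first position that matches
search_results = [bool(regex.search(s)) for s in (start, middle)]

print(match_results)   # [True, False]
print(search_results)  # [True, True]
```

Which one fits depends on whether "exactly after the words in word_list" is meant to apply only to the first word of each sentence.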
mozway
  • @Codingamethyst note that I assumed a match in the start of the string. For a match anywhere use `re.search` instead of `re.match` – mozway Sep 02 '22 at 00:49
  • The regex is working at the string level and is fully unaware of the container. I'm not sure which of my alternatives you are referring to but please update your question with a clear example with the expected output. – mozway Sep 02 '22 at 01:05
  • Your answer is working perfectly fine. I was wondering if there is a way to remove the empty [] lists, as I am getting 4000+ empty lists out of 5000 datasets. I have removed them using filter, but I was wondering if it is possible to do it within the regex itself. – Codingamethyst Sep 02 '22 at 01:08
  • No it's not. As said above the regex has no knowledge of the lists, only of the strings contained in the lists. You need to filter with post-processing. – mozway Sep 02 '22 at 01:11
  • Another silly question, I'm sorry, but is it possible to check more than one word list? Instead of word_list, if I had two or three lists, would it be possible to use something like an or operator to check all the lists? I can always merge them into one list, but I was wondering if it is possible to do in regex. I'm sorry again for the stupid question – Codingamethyst Sep 02 '22 at 01:22
  • Yes, you can craft your regex using several lists as input. This will not be doing it "*in regex*" *per se*. Don't see a regex as some kind of magic do-all tool; the regex *only* takes a string as input and performs a computation on it. Here I used Python to craft the regex, then used it to perform the match. – mozway Sep 02 '22 at 01:26
  • In my code the regex is `\b(Red|Blue)\s+is\b` – mozway Sep 02 '22 at 01:29
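
The two follow-up points in the comments above (building the pattern from several word lists, and dropping the empty lists afterwards) could be sketched roughly as follows. The second list `more_colors` and the fourth sentence are assumptions added for illustration; they do not appear in the question.

```python
import re
from itertools import chain

List1 = [['Red is my favorite color.'],
         ['Blue is her favorite.'],
         ['She is really nice.'],
         ['Green is fine too.']]   # hypothetical extra sentence

word_list = ['Red', 'Blue']
more_colors = ['Green']            # hypothetical second word list

# Python merges the lists; the regex itself only ever sees one alternation
words = chain(word_list, more_colors)
regex = re.compile(fr'\b({"|".join(map(re.escape, words))})\s+is\b')

# Same nested comprehension as in the answer: non-matching sublists become []
matched = [[x for x in sub if regex.match(x)] for sub in List1]

# Post-processing step: keep only the sublists that still contain something
non_empty = [sub for sub in matched if sub]

print(non_empty)
# [['Red is my favorite color.'], ['Blue is her favorite.'], ['Green is fine too.']]
```

Merging the lists before joining them with "|" is equivalent to an "or" across all the lists, which is why no special regex feature is needed.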
2

You could try this:

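List1 = [['Red is my favorite color.'], ['Blue is her favorite.'], ['She is really nice.']]
word_list = ['Red', 'Blue']
listResult = []
for phrase in List1:
    # Each inner list holds a single sentence at index 0
    for word in word_list:
        # Substring test: true when "<word> is" occurs anywhere in the sentence
        if f'{word} is' in phrase[0]:
            listResult.append(phrase[0])
            break  # stop after the first hit so a sentence is added only once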
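
A possible variant of the loop above, assuming the keyword must be the first word of the sentence and that "is" is always followed by a space: str.startswith accepts a tuple of prefixes, so a single if condition covers every word in word_list. The names `prefixes` and `result` are introduced here for illustration only.

```python
List1 = [['Red is my favorite color.'],
         ['Blue is her favorite.'],
         ['She is really nice.']]
word_list = ['Red', 'Blue']

# One prefix per keyword; the trailing space avoids matching words like "island"
prefixes = tuple(f'{word} is ' for word in word_list)

result = []
for phrase in List1:
    sentence = phrase[0]
    # startswith with a tuple is true if the sentence begins with any prefix
    if sentence.startswith(prefixes):
        result.append(sentence)

print(result)
# ['Red is my favorite color.', 'Blue is her favorite.']
```

Unlike the substring test, this would not pick up a sentence such as "He said Red is nice.", so it is only suitable when the position at the start of the sentence matters.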
PepeChuy
0

Already answered.

See re module documentation: https://docs.python.org/3/library/re.html

Previously answered Stack Overflow question: Check if string matches pattern

ben
  • 2. Actually, I sentence-tokenized a list of data; that's how I ended up with a list of lists containing single strings. – Codingamethyst Sep 02 '22 at 00:42