I have a pandas DataFrame in which one column contains lists. I want to search every list (i.e. every row) and check whether one or more of its elements contain specific substrings.

Data:

import pandas as pd

list_Series = pd.Series([["handful of tomatos", "2 peppers", " tsp salt"],
                         ["1 kg of meat", "fresh basil"]])

Search words:

search_for = ["pepper", "salt"]

Desired output for 'list_Series':

True
False

Now I want to apply a (maybe vectorized?) function that checks whether each Series element contains all the search substrings. If the Series contained only strings rather than lists, I would use pd.Series.str.contains("salt"). For a single list, I would do:

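def filterlist(liste, searchwords):
    # Count how many search words occur in at least one string of the list.
    occurs = 0
    for word in searchwords:
        for string in liste:
            if word.lower() in string.lower():
                occurs += 1
                break
    # True only if every search word was found; False otherwise.
    return occurs == len(searchwords)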

But this is very clunky and long, and I guess it is not very efficient when applied to a whole pd.Series. I also don't know how to apply it to a Series.

Thanks for the help! I'm also looking for feedback, as this is my first post on Stack Overflow. Also, would it be better to convert this Series into a DataFrame?

zabop

1 Answer

You can use a nested list comprehension to collect every list element that contains one of the search words:

result = [
    listelement
    for searchtarget in search_for
    for each_list_in_series in list_Series
    for listelement in each_list_in_series
    if searchtarget in listelement
]

result will be:

['2 peppers', ' tsp salt']

Without list comprehensions, this is equivalent to:

result = []
for searchtarget in search_for:
    for each_list_in_series in list_Series:
        for listelement in each_list_in_series:
            if searchtarget in listelement:
                result.append(listelement)
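
Note that result is a flat list of matching elements from the whole Series, not one True/False value per row. A minimal sketch of a per-row check follows, assuming the goal is the output the question asks for (every search word must occur in at least one element of the row, ignoring case). The helper name contains_all is hypothetical and not part of pandas; extra keyword arguments given to Series.apply are passed through to the function.

```python
import pandas as pd

list_Series = pd.Series([["handful of tomatos", "2 peppers", " tsp salt"],
                         ["1 kg of meat", "fresh basil"]])
search_for = ["pepper", "salt"]


def contains_all(strings, searchwords):
    """Return True if every search word occurs in at least one string."""
    # Lowercase once so the comparison is case-insensitive.
    lowered = [s.lower() for s in strings]
    return all(any(word.lower() in s for s in lowered) for word in searchwords)


# Series.apply calls the helper once per row, i.e. once per list.
mask = list_Series.apply(contains_all, searchwords=search_for)
print(mask.tolist())  # [True, False]
```

The resulting boolean Series can be used directly as a filter, for example list_Series[mask].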

A nice visual aid for nested list comprehensions, from Rahul's answer to this question:

[image not available]
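
Since the question also asks about a vectorized approach, here is one possible sketch that stays within pandas string methods. It assumes pandas 0.25 or later (for Series.explode) and that every row is a non-empty list of strings; whether it is actually faster than a plain Python loop depends on the data and has not been measured here.

```python
import pandas as pd

list_Series = pd.Series([["handful of tomatos", "2 peppers", " tsp salt"],
                         ["1 kg of meat", "fresh basil"]])
search_for = ["pepper", "salt"]

# One row per list element; the original row label is repeated in the index.
exploded = list_Series.explode()

# For each search word, flag the original rows where any element contains it.
per_word = [
    exploded.str.contains(word, case=False, regex=False).groupby(level=0).any()
    for word in search_for
]

# A row qualifies only if every search word was found somewhere in it.
mask = pd.concat(per_word, axis=1).all(axis=1)
print(mask.tolist())  # [True, False]
```

Passing regex=False treats each search word as a literal substring, which matches the behaviour of Python's in operator used above.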

zabop