I have a pandas Series that contains:

s = pd.Series(['cat, pet','dog, pet','dog','bird', 'bird, pet','tail', 'cat, tail'])

and I want to find all entries of s that contain both of ['cat', 'pet'].

I know that to find 'cat' OR 'pet', I can just filter like:

search = ['cat', 'pet']
s[s.str.contains('|'.join(search))]

But what if I want to match 'cat' AND 'pet'?

I tried:

s[s.str.contains('&'.join(search))]

But it is not working for me :/
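
For reference, here is a minimal sketch of why the `'&'.join` attempt returns nothing, using the same `s` and `search` as above. The names `and_attempt` and `or_matches` are just illustrative. In a regular expression `|` is the alternation operator, but `&` has no special meaning, so the joined pattern is searched for as the literal text `cat&pet`:

```python
import pandas as pd

s = pd.Series(['cat, pet', 'dog, pet', 'dog', 'bird', 'bird, pet', 'tail', 'cat, tail'])
search = ['cat', 'pet']

# '&' is an ordinary character in a regex, so this pattern is the literal 'cat&pet'.
pattern = '&'.join(search)
and_attempt = s[s.str.contains(pattern)]

# '|' is regex alternation, so this pattern means 'cat' OR 'pet'.
or_matches = s[s.str.contains('|'.join(search))]

print(and_attempt.empty)          # True: no entry contains the text 'cat&pet'
print(or_matches.index.tolist())  # [0, 1, 4, 6]
```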

Ralubrusto
The Dan
  • You will find what you are looking for here: https://stackoverflow.com/questions/37011734/pandas-dataframe-str-contains-and-operation – LucasG0 Feb 11 '21 at 23:48
  • Please repeat [on topic](https://stackoverflow.com/help/on-topic) and [how to ask](https://stackoverflow.com/help/how-to-ask) from the [intro tour](https://stackoverflow.com/tour). See [How much research?](https://meta.stackoverflow.com/questions/261592/how-much-research-effort-is-expected-of-stack-overflow-users). – Prune Feb 11 '21 at 23:50
  • We have to be careful here. When searching for substrings in a string or a collection of strings, we use regex rather than natural-language phrases. A regex (regular expression) is a sequence of characters, not words, that forms a search pattern. What you want can therefore be achieved by wrapping the two words in a regex: `(character1).*(character2)|(character2).*(character1)`, where `.` matches any character (except a newline by default) and `*` means zero or more occurrences of the preceding element. In your case: `s.str.contains('(cat).*(pet)|(pet).*(cat)')` – wwnde Feb 12 '21 at 00:16
  • I will reformulate the question, because df[(df['col_name'].str.contains('apple')) & (df['col_name'].str.contains('banana'))], while a valid answer, does not actually answer my question: it does not cover the case where len(list_of_strings) is variable or has a size < 100, which is my real case – The Dan Feb 12 '21 at 00:35
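
The two ideas in the comments (chaining `str.contains` masks with `&`, and building a single regex) can both be extended to a list of any length. The following is only a sketch under some assumptions: the terms are treated as plain substrings rather than regex patterns, the Series has no missing values, and the names `masks`, `all_mask`, `both`, and `both_regex` are made up for illustration.

```python
import operator
import re
from functools import reduce

import pandas as pd

s = pd.Series(['cat, pet', 'dog, pet', 'dog', 'bird', 'bird, pet', 'tail', 'cat, tail'])
search = ['cat', 'pet']  # may hold any number of terms

# Option 1: build one boolean mask per term and combine them with element-wise AND.
# regex=False treats each term as a literal substring.
masks = [s.str.contains(term, regex=False) for term in search]
# The all-True starting mask keeps reduce() well defined for an empty list.
all_mask = reduce(operator.and_, masks, pd.Series(True, index=s.index))
both = s[all_mask]

# Option 2: one regex with a lookahead per term, so the order of the terms
# inside each string does not matter. re.escape guards special characters.
pattern = ''.join(f'(?=.*{re.escape(term)})' for term in search)
both_regex = s[s.str.contains(pattern)]

print(both.tolist())        # ['cat, pet']
print(both_regex.tolist())  # ['cat, pet']
```

Both options match substrings, so 'cat' would also match a value such as 'category'. If whole-word matching is needed, the terms would have to be wrapped in word boundaries (`\b`) and matched as a regex.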

0 Answers