
When I search for a word in a DataFrame, it returns every entry containing those letters, but I only want the entries that contain that specific word. Can you help me out?

Here is an example:

import pandas as pd

d = {'col1': ['ROL', 'ROVER', 'ROL', 'ROLLER', 'ROL', 'TROLLER', 'rol', 'rolter', 'nan'],
     'col2': [1, 2, 3, 4, 5, 6, 7, 9, 10]}
df = pd.DataFrame(data=d)
# na=False already treats missing values as non-matches, so no fillna is needed
ROL = df[df['col1'].str.contains("ROL|rol", na=False)]

The output is something like this:

[image: current output]

But what I really wanted was the output without the entries that only contain the word as part of a longer one:

[image: desired output]

Sylhare
  • Does this answer your question? [Python regular expression match whole word](https://stackoverflow.com/questions/15863066/python-regular-expression-match-whole-word) – AMC Mar 26 '20 at 20:17
  • Please do not share information as images unless absolutely necessary. See: https://meta.stackoverflow.com/questions/303812/discourage-screenshots-of-code-and-or-errors, https://idownvotedbecau.se/imageofcode, https://idownvotedbecau.se/imageofanexception/. – AMC Mar 26 '20 at 20:17
  • Sorry, I believed it added value and helped explain my problem. – Tiago Emanuel Pratas Mar 27 '20 at 11:40

1 Answer


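The problem with your code is that str.contains("ROL|rol") matches any value that contains those letters anywhere in the string, not only the standalone word. For example, "ROLLER", "TROLLER" and "rolter" all contain "ROL" or "rol", so they are returned as well; only "ROVER" and "nan" are left out.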
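To see the substring behaviour for yourself, here is a minimal sketch that rebuilds the sample data from the question and lists which values the original pattern picks up. The names `mask` and `matched` are only illustrative.

```python
import pandas as pd

# Same sample data as in the question.
d = {'col1': ['ROL', 'ROVER', 'ROL', 'ROLLER', 'ROL', 'TROLLER', 'rol', 'rolter', 'nan'],
     'col2': [1, 2, 3, 4, 5, 6, 7, 9, 10]}
df = pd.DataFrame(data=d)

# str.contains performs a regex search, so the pattern may match
# anywhere inside the string, not only a standalone word.
mask = df['col1'].str.contains("ROL|rol", na=False)
matched = df.loc[mask, 'col1'].tolist()
print(matched)
# ['ROL', 'ROL', 'ROLLER', 'ROL', 'TROLLER', 'rol', 'rolter']
```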

Try this use of str.contains:

import re
ids = df.col1.str.contains('rol$|rol-|rol ', flags=re.IGNORECASE, regex=True, na=False)

And then filter:

df[ids]

gives:

Out[115]: 
       col1  col2
0       ROL     1
2   ROL- 33     3
4    ROL -2     5
6  rol nº12     7
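As a possible alternative (along the lines of the whole-word question linked in the comments), a regex word boundary `\b` can stand in for the list of explicit endings. This is only a sketch: the sample values below are hypothetical, loosely modelled on the output above, and `whole_word` and `exact` are illustrative names.

```python
import re
import pandas as pd

# Hypothetical sample values, loosely modelled on the output shown above.
df = pd.DataFrame({
    'col1': ['ROL', 'ROVER', 'ROL- 33', 'ROLLER', 'ROL -2', 'TROLLER', 'rol nº12', 'rolter'],
    'col2': range(1, 9),
})

# \b matches at a word boundary, so "rol" must not be preceded or
# followed by another letter, digit or underscore.
mask = df['col1'].str.contains(r'\brol\b', flags=re.IGNORECASE, regex=True, na=False)
whole_word = df[mask]
print(whole_word)

# If the cell must be exactly the word and nothing else,
# a case-insensitive equality test is enough.
exact = df[df['col1'].str.lower() == 'rol']
print(exact)
```

With this data, `whole_word` keeps rows 0, 2, 4 and 6, while `exact` keeps only row 0. Unlike 'rol$', the `\b` version would also reject a value such as "TROL", because the letter before "rol" means there is no word boundary there.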
broti