How to search a whole word string stored in a data frame within a list in python?

Question

I have a pandas data frame with a column 'title', and a list called my_list. I want to search for each value from df['title'] in my_list and get the list index. The value from df['title'] must match as a whole word, not as part of another word.

Regex seemed like the obvious solution. I store one value at a time from df['title'] in i and concatenate '\b' before and after the string to match the whole word only, but I get no match even though the word exists.

for i in df['title']:
    print(i)
    x = ('\b'+i+'\b')
    print(x)
    print([s for s in my_list if re.search(x, s)])
    print(len(i))
    print(len(x))

The output that I get is

ITEM
ITE
[]
4
6
ITEM_NUMBER_TYPE
ITEM_NUMBER_TYP
[]
16
18
.........and so on

I am unable to figure out why ITEM becomes ITE. When I try

x = ('\b'+i+' \b')
#added a white space before the second '\b' to get the whole word

I get the following output:

ITEM
ITEM
[]
4
7
ITEM_NUMBER_TYPE
ITEM_NUMBER_TYPE
[]
16
19
......and so on

content of my_list:

print(my_list)
Output:
['ITEM_XFORM_IND IS NOT NULL',
 'ITEM IS NOT NULL',
 'ITEM_LEVEL IS NOT NULL']

#I want for i = 'ITEM', just 'ITEM IS NOT NULL' to match, not the rest.

so my list is a list of words? – Dani Mesejo Oct 13 '19 at 15:55
@DanielMesejo yaa, each element is string (sentence). – Syed Afsahul Oct 13 '19 at 16:07

score -1 · Answer 1 · edited Oct 13 '19 at 16:21

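In a normal string literal, Python interprets '\b' as the backspace character (\x08), not as the regex word boundary, so the pattern never matches and the printed string can appear to lose its last character. Try a raw string instead: r'\b'.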
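A minimal sketch of that fix, using the sample strings from the question. The plain list `titles` stands in for df['title'], and the helper name `whole_word_matches` is made up for illustration; `re.escape` is an extra precaution in case a title contains regex metacharacters, not something the question requires.

```python
import re

# Stand-in for df['title']; iterating a pandas Series works the same way.
titles = ["ITEM", "ITEM_NUMBER_TYPE"]

# Sample list copied from the question.
my_list = [
    "ITEM_XFORM_IND IS NOT NULL",
    "ITEM IS NOT NULL",
    "ITEM_LEVEL IS NOT NULL",
]


def whole_word_matches(word, strings):
    """Return (index, string) pairs where `word` occurs as a whole word."""
    # r"\b" is two characters (backslash + b), which the regex engine reads
    # as a word boundary. Plain "\b" would be a single backspace character.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    return [(idx, s) for idx, s in enumerate(strings) if pattern.search(s)]


results = {title: whole_word_matches(title, my_list) for title in titles}

for title, matches in results.items():
    print(title, matches)
```

Because the underscore counts as a word character, there is no boundary between ITEM and _XFORM_IND, so only 'ITEM IS NOT NULL' should match for 'ITEM'. Using enumerate also returns the list index that the question asks for.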

answered Oct 13 '19 at 16:11 by Pete; edited Oct 13 '19 at 16:21 by RobC
