Check if a word is in a string in Python

Question

I'm working with Python, and I'm trying to find out whether a word is in a string.

I have found some information about identifying whether the word is in the string using .find, but is there a way to do this in an if statement? I would like to have something like the following:

if string.find(word):
    print("success")

score 447 · Accepted Answer · edited Sep 23 '21 at 05:51

What is wrong with:

if word in mystring:
    print('success')

edited Sep 23 '21 at 05:51

Martin Thoma

answered Mar 16 '11 at 01:13

fabrizioM

147

just as a caution, if you have a string "paratyphoid is bad" and you do a if "typhoid" in "paratyphoid is bad" you will get a true. – David Nelson Dec 19 '12 at 17:52
5

Anyone knows how to overcome this problem? – user2567857 Aug 19 '14 at 09:36
7

@user2567857, regular expressions -- see Hugh Bothwell's answer. – Mark Rajcok Aug 21 '14 at 19:23
@fabrizioM what can I do if I want to check if two words are in my string? – Loretta Jul 27 '15 at 10:57
4

if (word1 in mystring and word2 in mystring) – louie mcconnell Jul 11 '16 at 07:03
14

How is this the accepted answer?!! It just checks whether a sequence of characters (not a word) appear in a string – pedram bashiri Nov 19 '19 at 22:51
1

This really shouldn't be the accepted answer as it will test wrong for a number of cases. – bjornasm Mar 20 '20 at 12:57
what happens if the string is contained more than one? – Leos313 Apr 04 '20 at 19:28
1

This is not useful when you want to find the exact word in the sentence – smrf Jun 10 '20 at 07:22
@pedrambashiri a word is a sequence of characters... the question is vague and as a result the answer is vague, there is nothing wrong with this being the accepted answer. Sure people might want to find if a word is in a sentence at which point this would fail, but the person asking the question didn't clarify their requirements – Kevin May 18 '21 at 21:07
@Kevin, a sequence of characters is not a word. Yes, the question is vague. So, try to close it. Not close the mine! I want to help to code. Not to make points... – marcio May 21 '21 at 19:29
Is using "in" case sensitive? – yoyo Jan 05 '22 at 20:38
I've added a function to this thread that solves all the problems of the word being at the beginning, end, case-sensitivity or next to punctuation. – iStuart Mar 18 '22 at 09:49
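
On two of the questions raised in the comments above: `in` is case-sensitive, and several words can be checked together with `all()`. The following is only a minimal sketch; `contains_all` and its `ignore_case` parameter are hypothetical names, and it still matches substrings rather than whole words.

```python
def contains_all(text, words, ignore_case=False):
    """Return True if every item of `words` occurs in `text` as a substring."""
    if ignore_case:
        # casefold() is a more aggressive lower() intended for caseless matching
        text = text.casefold()
        words = [w.casefold() for w in words]
    return all(w in text for w in words)


sentence = "those who seek shall find"
print("Seek" in sentence)                                  # False: `in` is case-sensitive
print(contains_all(sentence, ["seek", "find"]))            # True
print(contains_all(sentence, ["Seek"], ignore_case=True))  # True
```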

score 213 · Answer 2 · answered Mar 16 '11 at 01:52

if 'seek' in 'those who seek shall find':
    print('Success!')

but keep in mind that this matches a sequence of characters, not necessarily a whole word - for example, 'word' in 'swordsmith' is True. If you only want to match whole words, you ought to use regular expressions:

import re

def findWholeWord(w):
    return re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE).search

findWholeWord('seek')('those who seek shall find')    # -> <match object>
findWholeWord('word')('swordsmith')                   # -> None

answered Mar 16 '11 at 01:52

Hugh Bothwell

9

Is there a really fast method of searching for multiple words, say a set of several thousand words, without having to construct a for loop going through each word? I have a million sentences, and a million terms to search through to see which sentence has which matching words. Currently it's taking me days to process, and I want to know if there's a faster way. – Tom Dec 27 '16 at 19:49
@Tom try to use grep instead of python regex – El Ruso Feb 03 '17 at 22:57
+1 for swordsmith – Robino Aug 11 '17 at 16:23
How do you handle exceptions, e.g. when the word is not found in the string? – FaCoffee May 04 '18 at 10:58
1

@FaCoffee: if the string is not found, the function returns None (see last example above). – Hugh Bothwell May 06 '18 at 17:15
To be on the safe side of things, you should do `.format(re.escape(w))`. If you don't have that you open yourself up to string manipulation attacks. Of course, if you can trust your input, this is a non issue. However, if your list of words comes from another source (list found on the internet, database, user input), this is super critical. – Luis Nell Feb 26 '20 at 12:07
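
The escaping suggested in the last comment could look roughly like this. It is a sketch rather than the answer author's code: `find_whole_word` is a hypothetical snake_case rename of `findWholeWord`, and the only functional change is the call to `re.escape`.

```python
import re


def find_whole_word(w):
    """Return a search function matching `w` as a whole word, ignoring case."""
    # re.escape() makes characters such as '.' or '(' match literally
    pattern = r'\b{0}\b'.format(re.escape(w))
    return re.compile(pattern, flags=re.IGNORECASE).search


print(find_whole_word('a.b')('a.b is here'))  # a match object
print(find_whole_word('a.b')('axb is here'))  # None; unescaped, '.' would also match 'x'
```

Note that `\b` only marks a boundary between a word character and a non-word character, so this sketch assumes the search word starts and ends with a letter, digit or underscore.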

score 67 · Answer 3 · edited Mar 20 '18 at 16:34

If you want to find out whether a whole word is in a space-separated list of words, simply use:

def contains_word(s, w):
    return (' ' + w + ' ') in (' ' + s + ' ')

contains_word('the quick brown fox', 'brown')  # True
contains_word('the quick brown fox', 'row')    # False

This elegant method is also the fastest. Compared to Hugh Bothwell's and daSong's approaches:

>python -m timeit -s "def contains_word(s, w): return (' ' + w + ' ') in (' ' + s + ' ')" "contains_word('the quick brown fox', 'brown')"
1000000 loops, best of 3: 0.351 usec per loop

>python -m timeit -s "import re" -s "def contains_word(s, w): return re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE).search(s)" "contains_word('the quick brown fox', 'brown')"
100000 loops, best of 3: 2.38 usec per loop

>python -m timeit -s "def contains_word(s, w): return s.startswith(w + ' ') or s.endswith(' ' + w) or s.find(' ' + w + ' ') != -1" "contains_word('the quick brown fox', 'brown')"
1000000 loops, best of 3: 1.13 usec per loop

Edit: A slight variant on this idea for Python 3.6+, equally fast:

def contains_word(s, w):
    return f' {w} ' in f' {s} '
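
To make the scope of this approach concrete, here is a small sketch of how the padded comparison behaves at the edges of the string and where it stops working. The sample sentences are made up for illustration.

```python
def contains_word(s, w):
    # Padding both strings with spaces lets the first and last words match too
    return f' {w} ' in f' {s} '


print(contains_word('the quick brown fox', 'the'))      # True: first word
print(contains_word('the quick brown fox', 'fox'))      # True: last word
print(contains_word('the quick brown fox!', 'fox'))     # False: 'fox!' is not 'fox'
print(contains_word('the quick\tbrown fox', 'brown'))   # False: tab, not a space
```

The last two cases are outside what the answer promises, since the input there is no longer a purely space-separated list of words.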

edited Mar 20 '18 at 16:34

answered Apr 11 '16 at 20:32

user200783

18

This has several problems: (1) Words at the end (2) Words at the beginning (3) words in between like `contains_word("says", "Simon says: Don't use this answer")` – Martin Thoma Aug 09 '17 at 09:53
1

@MartinThoma - As stated, this method is specifically for finding out "whether a whole word is in a space-separated list of words". In that situation, it works fine for: (1) Words at the end (2) Words at the beginning (3) words in between. Your example only fails because your list of words includes a colon. – user200783 Aug 09 '17 at 13:41
Clever thinking. Thanks! :) – Ziemo Sep 17 '19 at 13:53
This has a few problems. It assumes space is the only thing that breaks one word from another. Try finding fox in "the quick brown fox!" or "the quick brown dog, fox, and chicken". The regex answer does not have this issue, as far as I can see. Though, tokenization is a hard problem, and for best results use spaCy or NLTK. – JeffHeaton Oct 14 '19 at 17:53
2

@JeffHeaton Once again, this method is SPECIFICALLY for "If you want to find out whether a whole word is in a space-separated list of words", as the author clearly stated. – bitwitch Feb 17 '20 at 20:58
```def wordSearch(word, phrase): return word in [words.strip(',.?!') for words in phrase.split()]``` [link](https://stackoverflow.com/questions/65458683/python-check-if-a-word-is-in-a-string) – marcio Dec 27 '20 at 20:48

score 22 · Answer 4 · edited Sep 23 '21 at 05:49

You can split the string into words and check the resulting list.

if word in string.split():
    print("success")
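
If many candidate words have to be checked against the same string, the split result can be turned into a set first. This is only a sketch under that assumption; `words_present` is a hypothetical helper, and like the code above it treats anything between whitespace as a word.

```python
def words_present(text, candidates):
    """Return the subset of `candidates` that appear as whitespace-separated words."""
    # A set gives constant-time membership tests on average
    tokens = set(text.split())
    return tokens & set(candidates)


found = words_present("those who seek shall find", ["seek", "find", "sword"])
print(sorted(found))  # ['find', 'seek']
```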

edited Sep 23 '21 at 05:49

Martin Thoma

answered Dec 01 '16 at 18:26

Corvax

4

Please use the [edit] link explain how this code works and don’t just give the code, as an explanation is more likely to help future readers. – Jed Fox Dec 01 '16 at 19:55
2

This should be the actual answer for matching the whole word. – Kaushik NP Jun 16 '17 at 19:52
We should think about punctuation too. Look [here](https://stackoverflow.com/questions/65458683/python-check-if-a-word-is-in-a-string). – marcio Dec 27 '20 at 20:51

score 22 · Answer 5 · edited Sep 23 '21 at 05:50

find returns an integer representing the index of where the search item was found. If it isn't found, it returns -1.

haystack = 'asdf'

haystack.find('a') # result: 0
haystack.find('s') # result: 1
haystack.find('g') # result: -1

if haystack.find(needle) >= 0:
  print('Needle found.')
else:
  print('Needle not found.')
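
This also explains why the `if string.find(word):` from the question does not behave as hoped. A short sketch of the truthiness of the return values:

```python
haystack = 'asdf'

# find() returns 0 for a match at the start, which is falsy,
# and -1 for no match, which is truthy.
print(bool(haystack.find('a')))  # False, although 'a' is found at index 0
print(bool(haystack.find('g')))  # True, although 'g' is not found

# Comparing against -1 (or using `in`) gives the intended result.
print(haystack.find('a') != -1)  # True
print('g' in haystack)           # False
```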

edited Sep 23 '21 at 05:50

Martin Thoma

answered Mar 16 '11 at 01:13

Matt Howell

score 12 · Answer 6 · edited Apr 08 '16 at 09:23

This small function looks for every search word in the given text. If all search words are found, it returns the number of search words; otherwise it returns False.

Also supports unicode string search.

def find_words(text, search):
    """Find exact words"""
    dText   = text.split()
    dSearch = search.split()

    found_word = 0

    for text_word in dText:
        for search_word in dSearch:
            if search_word == text_word:
                found_word += 1

    if found_word == len(dSearch):
        return len(dSearch)
    else:
        return False

usage:

find_words('çelik güray ankara', 'güray ankara')
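
The same "are all search words present" check can also be sketched with sets, which avoids the counting loop and is not affected by how often a word is repeated in the text. `has_all_words` is a hypothetical name, and unlike the function above it returns a plain bool.

```python
def has_all_words(text, search):
    """Return True if every whitespace-separated word of `search` is in `text`."""
    # <= on sets is the subset test
    return set(search.split()) <= set(text.split())


print(has_all_words('çelik güray ankara', 'güray ankara'))  # True
print(has_all_words('çelik güray ankara', 'güray izmir'))   # False
```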

score 9 · Answer 7 · edited Aug 11 '16 at 13:15

If matching a sequence of characters is not sufficient and you need to match whole words, here is a simple function that gets the job done. It basically appends spaces where necessary and searches for that in the string:

def smart_find(haystack, needle):
    if haystack.startswith(needle+" "):
        return True
    if haystack.endswith(" "+needle):
        return True
    if haystack.find(" "+needle+" ") != -1:
        return True
    return False

This assumes that commas and other punctuation marks have already been stripped out.
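
One possible way to do that stripping beforehand is `str.translate`. This is a sketch, assuming only ASCII punctuation matters; `strip_punctuation` is a hypothetical helper, and it replaces punctuation with spaces so that neighbouring words do not get glued together.

```python
import string


def strip_punctuation(text):
    """Replace ASCII punctuation with spaces so words stay separated."""
    table = str.maketrans(string.punctuation, ' ' * len(string.punctuation))
    return text.translate(table)


cleaned = strip_punctuation("Simon says: don't use this, fox!")
print(cleaned.split())  # ['Simon', 'says', 'don', 't', 'use', 'this', 'fox']
```

As the output shows, apostrophes are treated as punctuation too, so contractions are split; whether that is acceptable depends on the text.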

edited Aug 11 '16 at 13:15

IanS

answered Jun 15 '12 at 07:23

daSong

This solution worked best for my case as I am using tokenized space separated strings. – Avijit Jan 04 '16 at 05:05

score 8 · Answer 8 · edited Sep 23 '21 at 05:51

Using regex is a solution, but it is too complicated for that case.

You can simply split the text into a list of words with the split(separator, num) method. It returns a list of all the words in the string, using separator as the separator. If separator is unspecified, it splits on any whitespace (optionally, you can limit the number of splits to num).

list_of_words = mystring.split()
if word in list_of_words:
    print('success')

This will not work for strings with commas etc. For example:

mystring = "One,two and three"
# will split into ["One,two", "and", "three"]

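If you also want to split on commas etc., note that str.split treats its argument as one whole separator string, not as a set of characters. To split on any of several characters, use re.split with a character class:

import re

# split on runs of whitespace and common punctuation
list_of_words = re.split(r"[ \t\n\r\f,.;!?'\"()]+", mystring)
if word in list_of_words:
    print('success')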

edited Sep 23 '21 at 05:51

Martin Thoma

answered Dec 18 '17 at 11:44

tstempko

1

This is a good solution, and similar to @Corvax, with the benefit of adding common characters to split on so that in a string like "First: there..", the word "First" could be found. Note that @tstempko isn't including ":" in the additional chars. I would :). Also, if the search is case-insensitive, consider using .lower() on both the word and string before the split. `mystring.lower().split()` and `word.lower()` I think this is also faster than the regex example. – beauk Dec 16 '19 at 14:59
I think to use something like ```split( \t\n\r\f,.;!?'\"()")``` we need to ```import re```. But it is a good solution too. – marcio Dec 27 '20 at 21:05

Martin Thoma · Answer 9 · 2017-08-09T10:20:48.360

As you are asking for a word and not for a string, I would like to present a solution which is not sensitive to prefixes / suffixes and ignores case:

#!/usr/bin/env python

import re


def is_word_in_text(word, text):
    """
    Check if a word is in a text.

    Parameters
    ----------
    word : str
    text : str

    Returns
    -------
    bool : True if word is in text, otherwise False.

    Examples
    --------
    >>> is_word_in_text("Python", "python is awesome.")
    True

    >>> is_word_in_text("Python", "camelCase is pythonic.")
    False

    >>> is_word_in_text("Python", "At the end is Python")
    True
    """
    pattern = r'(^|[^\w]){}([^\w]|$)'.format(word)
    pattern = re.compile(pattern, re.IGNORECASE)
    matches = re.search(pattern, text)
    return bool(matches)


if __name__ == '__main__':
    import doctest
    doctest.testmod()

If your words might contain regex special chars (such as +), then you need re.escape(word)
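
A variant with the escaping applied might look like the sketch below. Note that it is not identical to the function above: the `(^|[^\w])` and `([^\w]|$)` groups are swapped for the lookarounds `(?<!\w)` and `(?!\w)`, which express the same "no word character next to the match" condition without consuming the neighbouring characters.

```python
import re


def is_word_in_text(word, text):
    """Case-insensitive whole-word check that treats `word` literally."""
    # (?<!\w) and (?!\w): no word character directly before or after the match
    pattern = r'(?<!\w){}(?!\w)'.format(re.escape(word))
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


print(is_word_in_text("C++", "I write c++ daily."))         # True
print(is_word_in_text("Python", "camelCase is pythonic."))  # False
```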

score 5 · Answer 10 · edited Sep 23 '21 at 05:50

A more advanced way to check for an exact word in a long string:

import re
text = "This text was of edited by Rock"
# try this string also
# text = "This text was officially edited by Rock"
# re.search returns None when there is no match, so the else branch can run
if re.search(r"\bof\b", text):
    print("Present")
else:
    print("Absent")
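
Where `re.finditer` does help is when every occurrence matters, for example to answer the earlier comment about a word that appears more than once. A small sketch with a made-up sample sentence:

```python
import re

text = "This text was of edited by Rock, of course"

# Each match object knows where in the string it was found
positions = [m.start() for m in re.finditer(r"\bof\b", text)]
print(positions)       # [14, 33]
print(len(positions))  # 2 whole-word occurrences
```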

edited Sep 23 '21 at 05:50

Martin Thoma

answered Nov 02 '16 at 08:39

Rameez

marcio · Answer 11 · 2021-04-24T02:03:03.307

What about splitting the string and stripping punctuation from the words?

w in [ws.strip(',.?!') for ws in p.split()]

If needed, take lower/upper case into account:

w.lower() in [ws.strip(',.?!') for ws in p.lower().split()]

Maybe like this:

def wcheck(word, phrase):
    # Adjust the punctuation and the split characters to your needs
    punctuation = ',.?!'
    return word.lower() in [words.strip(punctuation) for words in phrase.lower().split()]

Sample:

print(wcheck('CAr', 'I own a caR.'))

I didn't check performance...

score 1 · Answer 12 · edited Sep 23 '21 at 05:51

You could just add a space before and after "word".

x = input("Type your word: ")  # raw_input() in Python 2
if " word " in x:
    print("Yes")
else:
    print("Nope")

This way it looks for the space before and after "word".

>>> Type your word: Swordsmith
>>> Nope
>>> Type your word:  word 
>>> Yes

edited Sep 23 '21 at 05:51

Martin Thoma

answered Feb 26 '15 at 14:23

PyGuy

5

But what if the word is at the beginning or the end of the sentence (no space) – MikeL Dec 13 '16 at 11:25

score 0 · Answer 13 · edited Sep 23 '21 at 05:51

I believe this answer is closer to what was initially asked: Find substring in string but only if whole words?

It is using a simple regex:

import re

if re.search(r"\b" + re.escape(word) + r"\b", string):
  print('success')

edited Sep 23 '21 at 05:51

Martin Thoma

answered Aug 25 '21 at 13:25

Milos Cuculovic

iStuart · Answer 14 · 2022-03-18T09:48:03.730

One of the solutions is to put a space at the beginning and end of the test word. This fails if the word is at the beginning or end of a sentence or is next to any punctuation. My solution is to write a function that replaces any punctuation in the test string with spaces, adds a space to the beginning and end of the test string and test word, and then returns the number of occurrences. This is a simple solution that removes the need for any complex regular expression.

def countWords(word, sentence):
    testWord = ' ' + word.lower() + ' '
    testSentence = ' '

    for char in sentence:
        if char.isalpha():
            testSentence = testSentence + char.lower()
        else:
            testSentence = testSentence + ' '

    testSentence = testSentence + ' '

    return testSentence.count(testWord)

To count the number of occurrences of a word in a string:

sentence = "A Frenchman ate an apple"
print(countWords('a', sentence))

returns 1

sentence = "Is Oporto a 'port' in Portugal?"
print(countWords('port', sentence))

returns 1

Use the function in an 'if' to test if the word exists in a string
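
For instance, a non-zero count is truthy, so the result can go straight into the condition. The block below is a self-contained sketch: `count_words` is a hypothetical, more compact rewrite of `countWords` that follows the same idea.

```python
def count_words(word, sentence):
    """Count whole-word, case-insensitive occurrences of `word` in `sentence`."""
    # Replace every non-letter with a space, then pad both ends
    cleaned = ''.join(c.lower() if c.isalpha() else ' ' for c in sentence)
    return f' {cleaned} '.count(f' {word.lower()} ')


sentence = "Is Oporto a 'port' in Portugal?"
if count_words('port', sentence):
    print("found")
else:
    print("not found")
```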
