get sentence from list of sentences with exact word match : Python

Question

Let's say I have a list of sentences:

sent = ["Chocolate is loved by all.", 
        "Brazil is the biggest exporter of coffee.", 
        "Tokyo is the capital of Japan.",
        "chocolate is made from cocoa."]

I want to return all sentences that have the exact full word "chocolate", i.e. ["Chocolate is loved by all.", "chocolate is made from cocoa."]. If any sentence does not have the word "chocolate", it shouldn't be returned. The word "chocolateyyy" should not be returned either.

How can I do this in Python?

What have you tried doing and where is the problem? – UnholySheep Sep 28 '18 at 11:10

DeltaMarine101 · Answer 1 · 2018-09-28T11:24:58.033

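This makes sure that the search word is matched as a full word, rather than as part of a longer word like 'chocolateyyy'. It's also case-insensitive, so 'Chocolate' matches 'chocolate' even though the first letters are capitalised differently. Note that it splits on whitespace only, so a word with punctuation attached (e.g. 'chocolate.') will not match.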

sent = ["Chocolate is loved by all.", "Brazil is the biggest exporter of coffee.",
        "Tokyo is the capital of Japan.","chocolate is made from cocoa.", "Chocolateyyy"]

search = "chocolate"

print([i for i in sent if search in i.lower().split()])

Here's a more expanded version for clarity with an explanation:

result = []
for i in sent: # Go through each string in sent
    lower = i.lower() # Make the string all lowercase
    split = lower.split(' ') # Split the string on single spaces
                     # (split() with no argument splits on any run of whitespace)
    if search in split: # if chocolate is an entire element in the split array
        result.append(i) # add it to results
print(result)
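
The split-based check above only finds the word when it is surrounded by whitespace, so a sentence ending in "chocolate." would be missed. One possible way around this, as a minimal sketch, is to strip punctuation from each token before comparing. The helper name contains_word and the extra sample sentences are made up for illustration; they are not part of the original answer.

```python
import string

sent = ["Chocolate is loved by all.",
        "I really like chocolate.",
        "Chocolateyyy is not a word."]
search = "chocolate"

def contains_word(sentence, word):
    # Hypothetical helper: lowercase, split on whitespace, then strip
    # leading/trailing punctuation from every token before comparing
    tokens = (tok.strip(string.punctuation) for tok in sentence.lower().split())
    return word.lower() in tokens

result = [s for s in sent if contains_word(s, search)]
print(result)
```

This only strips punctuation at the ends of each token, so hyphenated or apostrophised forms (e.g. "chocolate-covered") are still treated as a single, different word.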

I hope this helps :)

I'm glad I could help! Consider accepting one of the answers you received (click the tick ✓ next to your favourite answer) — DeltaMarine101, Sep 28 '18 at 11:31

score 3 · Answer 2 · answered Sep 28 '18 at 11:11

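If a plain substring check is enough, you can use the following (note that it also matches longer words containing 'chocolate', such as 'chocolateyyy'):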

filtered_sent = [i for i in sent if 'chocolate' in i.lower()]

Output

['Chocolate is loved by all.', 'chocolate is made from cocoa.']

Sociopath

score 2 · Answer 3 · answered Sep 28 '18 at 11:18

From this question, you want some of the functions in the re module. In particular:

\b Matches the empty string, but only at the beginning or end of a word.

You can therefore search for "chocolate" using re.search(r'\bchocolate\b', your_sentence, re.IGNORECASE).

The rest of the solution is just to iterate through your list of sentences and return a sublist that matches your target string.
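
The iteration described above could look roughly like this. It is a sketch, not the answerer's own code: the function name sentences_with_word is invented for illustration, and re.escape is added on the assumption that the search word should be treated as literal text.

```python
import re

def sentences_with_word(sentences, word):
    # Hypothetical helper: \b anchors the match at word boundaries, and
    # re.escape treats the search word as literal text rather than a pattern
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]

sent = ["Chocolate is loved by all.",
        "Brazil is the biggest exporter of coffee.",
        "Tokyo is the capital of Japan.",
        "chocolate is made from cocoa.",
        "Chocolateyyy"]

matches = sentences_with_word(sent, "chocolate")
print(matches)
```

Compiling the pattern once avoids rebuilding it for every sentence, which is a small convenience when the list is long.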

HenryLockwood

Good one. This one is recommended where search word is more likely to be part of the other words. – Sociopath Sep 28 '18 at 11:25

score 1 · Accepted Answer · answered Sep 28 '18 at 11:47

You can use Python's regular expression module, re:

import re

sent = ["Chocolate is loved by all.", 
        "Brazil is the biggest exporter of coffee.", 
        "Tokyo is the capital of Japan.",
        "chocolate is made from cocoa."]
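match_string = "chocolate"
# Build the pattern from match_string; re.escape guards against regex metacharacters
pattern = r"\b" + re.escape(match_string) + r"\b"
matched_sent = [s for s in sent if re.search(pattern, s, re.IGNORECASE)]
print(matched_sent)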
