I'm looking for a fast way to find predefined phrases (1-5 words each) in a text that is not large.

There can be up to 1000 phrases. I assume that a simple find() call per phrase is not a good solution.

What would you advise me to use? Thanks in advance.

Update: why I don't want to use a brute-force search:

  • I believe it is not fast enough.
  • The text can contain extra words inside a phrase. For example, the phrase may be Bank America while the text has bank of America.
  • Phrases can be slightly altered: apostrophes, an -s ending, etc.
Pavel Zimogorov

1 Answer

I'm not sure about your goal, but you can easily find predefined phrases in a text like this:

predefined_phrases = ["hello", "unicorns with a big mouth!", "Sweet donats"]
isnt_big_text = "A big mouse fly by unicorns with a big mouth! with hello wold."

for phrase in predefined_phrases:
    if phrase in isnt_big_text:
        print("Phrase '%s' found in text" % phrase)
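
If the loop above is too slow or too strict about exact wording, one possible alternative is to normalize both the phrases and the text into word tokens and look up every 1-5 word window of the text in a dictionary. The cost then depends on the length of the text rather than on the number of phrases. This is only a sketch under assumptions: `normalize`, `find_phrases`, `STOPWORDS` and the crude plural-stripping rule are hypothetical names and choices made for illustration, not part of any library.

```python
import re

# Hypothetical stopword list: these words are ignored so that
# "bank of America" in the text can match the phrase "Bank America".
STOPWORDS = {"of", "the", "a", "an"}
MAX_WORDS = 5  # the question states that phrases are 1-5 words long


def normalize(text):
    """Lowercase, drop apostrophes and stopwords, strip a trailing 's'."""
    cleaned = text.lower().replace("'", "").replace("\u2019", "")
    tokens = []
    for word in re.findall(r"[a-z0-9]+", cleaned):
        if word in STOPWORDS:
            continue
        # Crude plural/possessive handling (assumption); a real stemmer
        # would be more careful.
        if len(word) > 3 and word.endswith("s"):
            word = word[:-1]
        tokens.append(word)
    return tokens


def find_phrases(phrases, text):
    """Return the set of original phrases whose normalized form occurs in text."""
    # Map each normalized token tuple back to the original phrase(s).
    index = {}
    for phrase in phrases:
        key = tuple(normalize(phrase))
        if key:
            index.setdefault(key, []).append(phrase)

    tokens = normalize(text)
    found = set()
    # Check every window of 1..MAX_WORDS tokens with a dictionary lookup.
    for start in range(len(tokens)):
        for size in range(1, MAX_WORDS + 1):
            if start + size > len(tokens):
                break
            key = tuple(tokens[start:start + size])
            if key in index:
                found.update(index[key])
    return found


phrases = ["Bank America", "unicorns with a big mouth!", "Sweet donats", "hello"]
text = "The bank of America's unicorns with a big mouth say hello."
print(sorted(find_phrases(phrases, text)))
```

The normalization rules decide what counts as a match, so they should be tuned to the real data; the stopword list and the -s rule here are deliberately simplistic.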
valex
  • Thanks for this, but I was trying to find a solution that is not brute force and is faster – Pavel Zimogorov Apr 19 '16 at 12:15
  • There was some research here on Stack Overflow into the speed of Python search operations: http://stackoverflow.com/questions/4901523/whats-a-faster-operation-re-match-search-or-str-find . According to it, the "in" operator is the fastest. – valex Apr 19 '16 at 12:43
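
Whether the "in" operator is fast enough for about 1000 phrases is easiest to settle by measuring it on your own data. A minimal sketch using the standard `timeit` module follows; the phrase list is repeated to simulate roughly 1000 phrases, which is an assumed workload, and `scan_with_in` and `scan_with_find` are hypothetical helper names.

```python
import timeit

# Assumed workload: the three sample phrases repeated to reach ~1000 entries.
phrases = ["hello", "unicorns with a big mouth!", "Sweet donats"] * 334
text = "A big mouse fly by unicorns with a big mouth! with hello wold."


def scan_with_in():
    """Brute-force scan using the 'in' operator."""
    return [p for p in phrases if p in text]


def scan_with_find():
    """Same scan using str.find(), which returns -1 when nothing is found."""
    return [p for p in phrases if text.find(p) != -1]


# Time 200 full scans of all phrases with each approach.
in_seconds = timeit.timeit(scan_with_in, number=200)
find_seconds = timeit.timeit(scan_with_find, number=200)
print("in:   %.4f s for 200 scans" % in_seconds)
print("find: %.4f s for 200 scans" % find_seconds)
```

The absolute numbers depend on the machine, the Python version and the real text, so they are worth re-running rather than taking from someone else's benchmark.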