I am trying to automatically detect the individual words in a concatenated string in Python.

Before:

["Absoluteadvantage", "Absorptioncosting", "Accreditedinvestor"]

After:

["Absolute Advantage", "Absorption Costing", "Accredited Investor"]

I understand that the accuracy will never be perfect, but I am looking for a method to separate these strings of text into individual words. I've tried nltk's word_tokenize method to split them, to no avail.

Santa
  • The marked duplicate has an awesome answer that explains how to cut a whole text into words. But in this question it's not a whole text, just a pair of words, which is a lot easier and doesn't require such sophisticated tools. – Stef Feb 06 '23 at 10:18
  • If `english_words` is a Python `set` containing all the words in the English language, then to cut a string `s` into a pair of words, just find `i` such that `s[:i] in english_words and s[i:] in english_words`. – Stef Feb 06 '23 at 10:20
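
The approach from the second comment can be sketched roughly as follows. This is only a minimal illustration, not a tested solution: the function name `split_pair` is made up here, and the tiny `english_words` set is a stand-in for a real word list that you would have to load yourself (for example from a dictionary file). Matching is done in lowercase, and the first split point that works is returned.

```python
def split_pair(s, english_words):
    """Split s into two dictionary words, or return None if no split works.

    `english_words` is assumed to be a set of lowercase words.
    """
    lowered = s.lower()
    # Try every split point that leaves both halves non-empty.
    for i in range(1, len(lowered)):
        head, tail = lowered[:i], lowered[i:]
        if head in english_words and tail in english_words:
            return f"{head.capitalize()} {tail.capitalize()}"
    return None


# Hypothetical stand-in for a full English word list.
english_words = {
    "absolute", "advantage",
    "absorption", "costing",
    "accredited", "investor",
}

terms = ["Absoluteadvantage", "Absorptioncosting", "Accreditedinvestor"]
result = [split_pair(t, english_words) for t in terms]
print(result)
```

With a full dictionary some strings may have more than one valid split, so you would probably need a tie-breaking rule (such as preferring more common words), and strings containing words missing from the list will come back as `None`.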

0 Answers