I have some large text files that were extracted from PDFs, so the text itself is correct English. However, many words in these files have been joined together: "informationotherwise", "havebeen", "reportthatexplains". Every spell checker I have tried (e.g. LanguageTool, Sublime, MS Word) flags these errors, but I have not managed to detect and fix them in Python.
I tried pyspellchecker and TextBlob to check and correct these words, but, alas, to no avail.
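See for example this pyspellchecker code, which prints None for all three words.
from spellchecker import SpellChecker
spell = SpellChecker()
# All three joined words are reported as unknown
misspelled = spell.unknown(["informationotherwise", "havebeen", "reportthatexplains"])
for word in misspelled:
    print(spell.correction(word))   # best single correction
    print(spell.candidates(word))   # all candidate corrections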
TextBlob returns the word unchanged as well:
from textblob import TextBlob
t = "havebeen"
TextBlob(t).correct().string
>>> 'havebeen'
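To make the desired behaviour concrete, here is a rough sketch of the output I am after. It is not a real solution: `WORDS` is a tiny hand-made vocabulary and `split_joined` is a name I made up for illustration, so it only works for the three examples above and says nothing about how to do this on large files with a real dictionary.

```python
from functools import lru_cache

# Hypothetical toy vocabulary, for illustration only.
# A real solution would need a full English word list.
WORDS = {"information", "otherwise", "have", "been", "report", "that", "explains"}


def split_joined(text):
    """Split text into the fewest known words; return None if no split exists."""

    @lru_cache(maxsize=None)
    def best(start):
        # Reaching the end of the string means the split so far is valid.
        if start == len(text):
            return ()
        options = []
        for end in range(start + 1, len(text) + 1):
            piece = text[start:end]
            if piece in WORDS:
                rest = best(end)
                if rest is not None:
                    options.append((piece,) + rest)
        # Prefer the split that uses the fewest words.
        return min(options, key=len) if options else None

    result = best(0)
    return list(result) if result is not None else None


for joined in ["informationotherwise", "havebeen", "reportthatexplains"]:
    print(joined, "->", split_joined(joined))
```

So for "havebeen" I would like to get back something like ['have', 'been'], and for a string that cannot be split into known words, an indication that no split was found.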
Any suggestions?