I have a string in which the words are run together without spaces:
"specificationsinaccordancewithqualityaccreditedstandards"
It needs to be split into separate words, such as:
"specifications in accordance with quality accredited standards"
I have tried nltk's word_tokenize, but it was not able to split the string.
Context: I am parsing a PDF document into a text file, and this is the text I get back from the PDF converter. To convert the PDF to text I am using PDFMiner in Python.
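To make the desired behaviour concrete, here is a minimal sketch of dictionary-based splitting. It is not a finished solution: the word list `VOCAB` and the function `segment` are hypothetical names I made up for illustration, and the vocabulary is hand-picked to cover only the example string. A real word list would be much larger and would probably need word frequencies to choose between ambiguous splits.

```python
from typing import List, Optional, Set

# Hypothetical, hand-picked vocabulary covering only the example string.
VOCAB = {
    "specifications", "in", "accordance", "with",
    "quality", "accredited", "standards",
}


def segment(text: str, vocab: Set[str] = VOCAB) -> Optional[List[str]]:
    """Split text into vocabulary words, preferring the fewest words.

    Returns None when no complete split exists.
    """
    # best[i] is the shortest word list covering text[:i], or None.
    best: List[Optional[List[str]]] = [None] * (len(text) + 1)
    best[0] = []
    max_len = max(map(len, vocab), default=0)

    for end in range(1, len(text) + 1):
        # Only look back as far as the longest vocabulary word.
        for start in range(max(0, end - max_len), end):
            prefix = best[start]
            word = text[start:end]
            if prefix is not None and word in vocab:
                candidate = prefix + [word]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate

    return best[len(text)]


joined = "specificationsinaccordancewithqualityaccreditedstandards"
words = segment(joined)
print(" ".join(words) if words else "no segmentation found")
# prints: specifications in accordance with quality accredited standards
```

With this toy vocabulary the split is unambiguous. With a full dictionary, short words such as "a" or "in" would create many competing splits, which is why I assume some kind of scoring would be needed.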