Edit 2: **The question has changed and is now a bit trickier. I am leaving my answer to the previous problem below, but it does not address the current one.**
CURRENT PROBLEM
Inn ovative b usines s m odels and financi ng m e chanisms for pv de ploym ent in em ergi ng regio ns
I advise using a dictionary of real words; there is an SO thread on this.
You would then take your sentence (here, "Inn ovative b usines s m odels and financi ng m e chanisms for pv de ploym ent in em ergi ng regio ns") and split it on spaces (apparently the only separator the fragments have in common).
Here is a pseudo-code solution:

    iterate through the list of fragments:
        keep the currWord index
        while no realWord is found:
            check currWord in the dictionary
            if currWord is not a realWord:
                join the nextWord to currWord
            else:
                join currWord to the final sentence

By doing this and keeping track of the currWord index, you can log where a problem occurs and add new rules for your word splitting. You can tell there is a problem when a certain threshold is reached (for instance, a candidate word that is 30 characters long).
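The pseudo-code above could be sketched in Python roughly as follows. This is only an illustration: the function name `rejoin_fragments`, the `max_len` parameter, and the tiny hand-written dictionary are assumptions made for the example, not part of the question.

```python
def rejoin_fragments(text, dictionary, max_len=30):
    """Greedily merge space-separated fragments until they form a dictionary word.

    Returns (words, problems): the rebuilt words, and the fragment indices
    where no dictionary word was found within max_len characters.
    """
    fragments = text.split()
    words, problems = [], []
    i = 0
    while i < len(fragments):
        candidate = fragments[i]
        j = i
        # Keep joining the next fragment until the candidate is a real word,
        # the fragments run out, or the length threshold would be exceeded.
        while (candidate.lower() not in dictionary
               and j + 1 < len(fragments)
               and len(candidate) + len(fragments[j + 1]) <= max_len):
            j += 1
            candidate += fragments[j]
        if candidate.lower() in dictionary:
            words.append(candidate)
            i = j + 1
        else:
            # No real word found: log the index and keep the fragment as-is.
            problems.append(i)
            words.append(fragments[i])
            i += 1
    return words, problems


# Toy dictionary (an assumption); a real word list would be much larger.
toy_dictionary = {
    "innovative", "business", "models", "and", "financing", "mechanisms",
    "for", "pv", "deployment", "in", "emerging", "regions",
}
raw = ("Inn ovative b usines s m odels and financi ng m e chanisms "
       "for pv de ploym ent in em ergi ng regio ns")
words, problems = rejoin_fragments(raw, toy_dictionary)
print(" ".join(words))
# Innovative business models and financing mechanisms for pv deployment in emerging regions
print(problems)
# []
```

Note that this accepts the first (shortest) match. With a full dictionary, short fragments such as "Inn" or "de" are words in their own right, so the loop would stop too early on them; that is the kind of case where the logged indices and extra splitting rules come in.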
LAST PROBLEM
Edit: You're right @Adelin, I should have commented.
If I may, here is a simpler program whose logic is easy to follow, in case you dislike using regex for simple, uniform cases:

    def raw_char_to_sentence(seq):
        """Split "seq" on single spaces and rebuild the sentence.

        As words are separated by two spaces, the split yields an empty
        string at each word boundary; the characters in between are joined
        back into words.
        """
        char_list = seq.split(' ')
        sentence = ''
        word = ''
        for c in char_list:
            # Add the single character to the current word.
            word += c
            if c == '':
                # The word is over: add it to the sentence and reset it.
                sentence += (word + ' ')
                word = ''
        # The last word has no empty string after it, so add it here.
        # rstrip() removes any trailing space left by the loop.
        return (sentence + word).rstrip()

    # Words are separated by two spaces, characters by one.
    temp = "H o w  d o  s m a l l h o l d e r  f a r m e r s f i t  i n t o  t h e  b i g  p i c t u r e  o f  w o r l d  f o o d  p r o d u c t i o n"
    print(raw_char_to_sentence(temp))
    # outputs: How do smallholder farmersfit into the big picture of world food production
    # ("farmers" and "fit" stay merged because only one space separates them in the input)
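For comparison, the same idea can be written more compactly, still without regex. This is a sketch that assumes words are always separated by exactly two spaces; the name `double_space_to_sentence` is made up for the example.

```python
def double_space_to_sentence(seq):
    # Split on the two-space word separator, then drop the single spaces
    # between the characters of each word.
    return ' '.join(word.replace(' ', '') for word in seq.split('  '))


print(double_space_to_sentence("H o w  d o  f a r m e r s"))
# How do farmers
```

The explicit loop in `raw_char_to_sentence` is easier to step through; this version trades that for brevity.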