I am writing a script that introduces misspellings into a sentence. I am using Python's re module to replace the original word with the misspelling. The script looks like this:
# replacing original word by error
pattern = re.compile(r'%s' % original_word)
replace_by = r'\1' + err
modified_sentence = re.sub(pattern, replace_by, sentence, count=1)
The problem is that this replaces the match even when original_word is part of another word. For example, if I had
original_word = 'in'
err = 'il'
sentence = 'eating food in'
it would replace the occurrence of 'in' inside 'eating', like this:
> 'eatilg food in'
I checked the re documentation, but I couldn't find an example of how to include regex options in a pattern. For example, if my pattern is:
regex_pattern = '\b%s\b' % original_word
this should solve the problem, since \b represents a word boundary, but it doesn't seem to work.
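A possible cause, offered as a sketch rather than a confirmed diagnosis: in an ordinary (non-raw) string literal, Python turns '\b' into a backspace character before the regex engine ever sees it, so the pattern never contains a word boundary. The example below assumes that is what is happening and uses a raw string instead. The re.escape call is an added precaution that is not in the original script.

```python
import re

original_word = 'in'
err = 'il'
sentence = 'eating food in'

# In a normal string literal '\b' is the backspace character (0x08),
# not the regex word-boundary sequence.
print(repr('\b'))   # '\x08'
print(repr(r'\b'))  # '\\b'

# Raw string: \b reaches the regex engine as a word boundary.
# re.escape (an assumption, not in the original) protects against
# regex metacharacters appearing in the word.
pattern = re.compile(r'\b%s\b' % re.escape(original_word))
modified_sentence = pattern.sub(err, sentence, count=1)
print(modified_sentence)  # eating food il
```

With the raw-string pattern, the 'in' inside 'eating' is skipped because it has word characters on both sides, and only the standalone word is replaced.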
I tried to work around it by doing:
pattern = re.compile(r'([^\w])%s' % original_word)
but that does not work either. For example:
original_word = 'to'
err = 'vo'
sentence = 'I will go tomorrow to the'
the result is:
> I will go vomorrow to the
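The workaround appears to fail because it only checks the character before the word and never checks what follows it, so ' to' at the start of ' tomorrow' still matches. A minimal sketch of one alternative, assuming lookarounds are acceptable here: require a non-word character (or the string edge) on both sides without consuming it. The function name misspell is hypothetical and used only for illustration.

```python
import re

def misspell(sentence, original_word, err):
    """Replace the first whole-word occurrence of original_word with err.

    Hypothetical helper for illustration; not part of the original script.
    """
    # (?<!\w) : not preceded by a word character (also true at string start)
    # (?!\w)  : not followed by a word character (also true at string end)
    pattern = re.compile(r'(?<!\w)%s(?!\w)' % re.escape(original_word))
    # A function replacement inserts err literally, so backslashes or
    # group references in err are not interpreted.
    return pattern.sub(lambda match: err, sentence, count=1)

print(misspell('I will go tomorrow to the', 'to', 'vo'))
# I will go tomorrow vo the
```

Because the lookarounds match without consuming any characters, no captured group has to be put back in the replacement, and a word at the very start or end of the sentence is handled as well.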
Thank you, any help is appreciated.