I am trying to find a string within a long text extracted from a PDF file, get the string's position in the text, and then return the 100 words before the string and the 100 words after it. The problem is that the extraction is not perfect, so I run into cases like this:
The query string is "test text"
The text may look like:
This is atest textwith a problem
As you can see, the word "test" is joined with the letter "a", and the word "text" is joined with the word "with".
So the only function that works for me is __contains__, which doesn't return the position of the word.
Any ideas on how to find all the occurrences of a word in such a text, along with their positions?
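To make the goal concrete, here is a minimal sketch of one possible approach, assuming Python (since __contains__ is mentioned) and the standard re module. The function name find_with_context and the parameter n_words are hypothetical, made up for illustration. It matches the query as a plain substring, so neighbouring letters glued onto it do not matter, and it also assumes the space inside the query itself may be missing or doubled, which is why the query words are joined with \s*.

```python
import re

def find_with_context(text, query, n_words=100):
    """Return (start, end, words_before, words_after) for every match of query.

    Hypothetical helper: the name and signature are illustrative only.
    """
    parts = query.split()
    if not parts:
        return []
    # Escape each query word and allow any amount of whitespace (even none)
    # between them, so "test text" also matches "testtext".
    pattern = r"\s*".join(re.escape(p) for p in parts)

    results = []
    for m in re.finditer(pattern, text):
        # Words are whatever whitespace-separated chunks surround the match;
        # a fragment glued to the match (like "a" or "with") counts as a word.
        before = text[:m.start()].split()
        after = text[m.end():].split()
        before = before[max(len(before) - n_words, 0):]
        after = after[:n_words]
        results.append((m.start(), m.end(), " ".join(before), " ".join(after)))
    return results

sample = "This is atest textwith a problem"
for start, end, before, after in find_with_context(sample, "test text", n_words=2):
    print(start, end, repr(before), repr(after))
```

The start and end values are character offsets into the text, the same kind of index that str.find returns. Re-splitting the text around every match is simple but slow for very long documents with many hits; passing flags=re.IGNORECASE to re.finditer would make the search case-insensitive if that is needed.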
Thank you very much