I’ve indexed the text of PDF files in my database, but sometimes the text is not clean and there are stray spaces inside words:
var text = 'C or P ora te go V ernan C e report M ANA g EMENT bO A r D AND s u PE r V is O r y bO A r D C OMM i TTEE s The Management Board has not currently established any committees.';
I want to make a front-end search engine for my users, but I need to know the START and END position of each match (based on the original text, with its spaces).
I can find the start position with a regex. For example, if I do:
text.toLowerCase().search(/m ?a ?n ?a ?g ?e ?m ?e ?n ?t/);
I find the word “Management” starting at index 36. Now I want to know the end position of the match (I don’t know how many spaces are inside the word, so I don’t know how many characters it spans in the original text), and I want the search to return multiple matches, giving me the start/end position of each result.
Can you help me with that? Again, it’s very important for me to get the start/end position of each match relative to the original text, so removing the spaces and then searching is not a good solution for me.
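To make the goal concrete, here is a minimal sketch of the kind of result I’m after, using a global regex and String.prototype.matchAll. The helper name findSpaced and the { start, end, match } result shape are hypothetical, and it assumes the stray characters are plain spaces (any number of them between letters):

```javascript
// Hypothetical helper: the name and result shape are illustrative only.
function findSpaced(text, word) {
  // Escape regex metacharacters in each letter, then allow any number
  // of plain spaces between consecutive letters.
  const pattern = Array.from(word)
    .map((ch) => ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join(' *');
  const regex = new RegExp(pattern, 'gi'); // g: all matches, i: ignore case
  const results = [];
  for (const m of text.matchAll(regex)) {
    // m.index is the start in the original text; end is exclusive.
    results.push({ start: m.index, end: m.index + m[0].length, match: m[0] });
  }
  return results;
}

const sample = 'C or P ora te go V ernan C e report M ANA g EMENT bO A r D AND s u PE r V is O r y bO A r D C OMM i TTEE s The Management Board has not currently established any committees.';
console.log(findSpaced(sample, 'management'));
```

With this, the first result should start at 36 and end at 49 (exclusive), so sample.slice(start, end) gives back the spaced-out text as it appears in the original.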
I’m also curious to know if I can do that without a regex.
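One regex-free idea, sketched here under assumptions: build a copy of the text with the spaces removed, and keep an array that maps every character of that copy back to its index in the original. A plain indexOf on the compact copy can then be translated back to original positions. The names findSpacedNoRegex, compact and map are hypothetical, and the sketch assumes lowercasing a character does not change its length (true for ASCII):

```javascript
// Hypothetical helper: searches a space-free copy of the text and maps
// the hit back to positions in the original text.
function findSpacedNoRegex(text, word) {
  let compact = ''; // lowercased text without spaces
  const map = [];   // map[i] = index in `text` of compact[i]
  for (let i = 0; i < text.length; i++) {
    if (text[i] !== ' ') {
      compact += text[i].toLowerCase();
      map.push(i);
    }
  }
  const needle = word.toLowerCase().split(' ').join('');
  const results = [];
  if (needle.length === 0) return results;
  let from = 0;
  while (true) {
    const pos = compact.indexOf(needle, from);
    if (pos === -1) break;
    const start = map[pos];
    const end = map[pos + needle.length - 1] + 1; // exclusive end
    results.push({ start, end, match: text.slice(start, end) });
    from = pos + needle.length; // continue after this match (no overlaps)
  }
  return results;
}

const doc = 'C or P ora te go V ernan C e report M ANA g EMENT bO A r D AND s u PE r V is O r y bO A r D C OMM i TTEE s The Management Board has not currently established any committees.';
console.log(findSpacedNoRegex(doc, 'management'));
```

Like the optional-space regex, this can also match across a real word boundary, because every space is ignored during the search.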
Thank you!