Using a general-purpose programming language like Java, what is the most efficient way to search through a ~20 page document and replace each of a set of 5000+ strings with its predetermined replacement string? The program should not replace any text that has already been replaced. What data structure would be optimal for storing the 5000+ strings and their replacements: two arrays, a dictionary, or something else?
Here are some of the options that I have considered so far:
Iterate through the entire .txt document once per string using string.replace. The problem is that the algorithm must make an extra pass over the entire .txt document for each string stored.
Iterate through the .txt document once, replacing strings as necessary and building a new string by appending the replacements. This seems more efficient, but each step would still require checking the entire set of 5000+ strings for a match.
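To make the two options concrete, here is a minimal sketch in Java. The class and method names are placeholders made up for illustration, the strings are held in a Map purely as an assumption, and the second method assumes that the longest matching string should win when several match at the same position, which may not be the right rule for every input.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical names; a sketch of the two options above, not a tuned implementation.
class ReplaceOptionsSketch {

    // Option 1: one full pass over the text per search string.
    // Each String.replace call scans the whole text and allocates a new String.
    static String perStringPasses(String text, Map<String, String> replacements) {
        String result = text;
        for (Map.Entry<String, String> e : replacements.entrySet()) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }

    // Option 2: one pass over the text, appending to a new buffer.
    // At every position the whole set of search strings is still checked.
    static String singlePass(String text, Map<String, String> replacements) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            String bestKey = null;
            for (String key : replacements.keySet()) {
                // Assumption: prefer the longest search string that matches at position i.
                if (!key.isEmpty() && text.startsWith(key, i)
                        && (bestKey == null || key.length() > bestKey.length())) {
                    bestKey = key;
                }
            }
            if (bestKey != null) {
                // Append the replacement and skip past the matched text,
                // so the replacement itself is never scanned again.
                out.append(replacements.get(bestKey));
                i += bestKey.length();
            } else {
                out.append(text.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> replacements = new LinkedHashMap<>();
        replacements.put("cat", "dog");
        replacements.put("dog", "bird");

        System.out.println(perStringPasses("cat and dog", replacements));
        System.out.println(singlePass("cat and dog", replacements));
    }
}
```

With this sample map, the first method turns "cat and dog" into "bird and bird", because the second pass rewrites the output of the first, which breaks the requirement that replaced text is left alone. The second method gives "dog and bird", but its inner loop is the per-position check of all 5000+ strings described above.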
Is there a more optimized means of solving this problem, or is one of the above attempts already optimal?
Also, would it be possible to run this algorithm more efficiently in a lower-level language like C?