Based on your comment that you're searching for entire sentences:
Build an index of prefixes.
Sort the file. Next, make one pass over it to compute the prefix length needed to narrow a search to, say, 1000 sentences. That is, how many leading characters do you need to get within about 1000 sentences of any given sentence?
For example, "The" is probably a common starting word in English. But "The quick" is probably enough to get close to anything like "The quick brown fox ... etc.", because "q" is a low-frequency letter.
One way to do this would be to put all prefixes up to a certain length (say, 40) into a collections.Counter. Find the maximum count at each length, and pick the shortest length for which that maximum is <= 1000. There may be other ways. ;-)
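Here is a rough sketch of that counting pass, assuming the sentences fit in an iterable of strings. The function name choose_prefix_length and its parameters are made up for illustration, and counting every prefix of every sentence can use a lot of memory on a big file, so treat it as a starting point rather than a tuned implementation.

```python
from collections import Counter


def choose_prefix_length(sentences, max_len=40, bucket_limit=1000):
    """Return the shortest prefix length whose largest bucket is <= bucket_limit.

    Hypothetical helper: falls back to max_len if no length qualifies.
    """
    counts = Counter()
    for s in sentences:
        for n in range(1, max_len + 1):
            # Key on (length, prefix) so different lengths never collide.
            counts[(n, s[:n])] += 1

    # Largest bucket seen at each prefix length.
    worst = Counter()
    for (n, _prefix), c in counts.items():
        worst[n] = max(worst[n], c)

    for n in range(1, max_len + 1):
        if worst[n] <= bucket_limit:
            return n
    return max_len
```

With a toy list such as "The quick brown fox", "The quiet man", "The lazy dog", "A cat" and bucket_limit=2, the first length whose biggest bucket holds at most two sentences is 5 ("The q").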
Now, process the file a second time. Build a separate index file consisting of the prefix length (in the file header) followed by prefixes and offsets: the sentences that start with prefix K begin at offset V in the sorted file. Because the file is sorted, the index will also be sorted.
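A minimal sketch of that second pass might look like the following. The on-disk layout is my own assumption, not anything standard: the first line holds the prefix length, and each later line is "offset, TAB, prefix". It also assumes one sentence per line, a file sorted in byte order, and it works on raw bytes so the offsets can be passed straight to seek(). build_index is a hypothetical name.

```python
def build_index(sorted_path, index_path, prefix_len):
    """Write one 'offset<TAB>prefix' entry per distinct prefix.

    Assumed layout (hypothetical): header line with the prefix length,
    then one entry for the first sentence of each prefix group.
    """
    with open(sorted_path, "rb") as src, open(index_path, "wb") as idx:
        idx.write(b"%d\n" % prefix_len)
        last_prefix = None
        offset = 0  # byte offset of the current line in the sorted file
        for line in src:
            prefix = line.rstrip(b"\r\n")[:prefix_len]
            if prefix != last_prefix:
                # First sentence with this prefix: record where it starts.
                idx.write(b"%d\t%s\n" % (offset, prefix))
                last_prefix = prefix
            offset += len(line)
```

Writing the offset first means the prefix can contain tabs without confusing a reader that splits each entry on the first tab only.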
Your program can read the index into memory, open the file, and start processing searches. For each search, take the first prefix-length characters of the sentence, look that prefix up in the index, seek to the file offset, and scan forward for a match.
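The lookup side could be sketched roughly like this. It assumes a hypothetical index layout (first line is the prefix length, every later line is "offset, TAB, prefix"), one sentence per line in a byte-sorted file, and bytes rather than str throughout. load_index and find_sentence are illustrative names, and a plain dict stands in for whatever in-memory structure you prefer.

```python
def load_index(index_path):
    """Read the assumed index layout into (prefix_len, {prefix: offset})."""
    with open(index_path, "rb") as idx:
        prefix_len = int(idx.readline())
        offsets = {}
        for line in idx:
            # Split on the first tab only; the prefix itself may contain tabs.
            offset, prefix = line.rstrip(b"\n").split(b"\t", 1)
            offsets[prefix] = int(offset)
    return prefix_len, offsets


def find_sentence(data_file, prefix_len, offsets, sentence):
    """Return the byte offset of `sentence` in the sorted file, or None."""
    prefix = sentence[:prefix_len]
    start = offsets.get(prefix)
    if start is None:
        return None  # no sentence in the file starts with this prefix
    data_file.seek(start)
    pos = start
    for line in data_file:
        text = line.rstrip(b"\r\n")
        if text == sentence:
            return pos
        if text[:prefix_len] != prefix:
            break  # scanned past the group that shares this prefix
        pos += len(line)
    return None
```

Open the sorted file once in binary mode and reuse the handle for every query; each search then costs one dict lookup, one seek, and a scan over at most one prefix group.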