Generally the best method is to figure out where your bottleneck is, and fix it. You'll usually find that you spend 90% of your time in a small portion of your code, and that's where you want to focus your efforts.
Once you've figured out what's taking a lot of time, focus on improving your algorithms. For example, removing duplicates from an ArrayList is an O(n²) operation if you use the most obvious algorithm, but it can be reduced to O(n) if you leverage the right data structure.
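For instance, here's a minimal sketch (not from the original question) contrasting the obvious nested-scan de-duplication with a LinkedHashSet-based version:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class Dedup {
    // Obvious approach: for each word, scan the result list for a duplicate -> O(n²)
    static List<String> dedupSlow(List<String> words) {
        List<String> result = new ArrayList<>();
        for (String word : words) {
            if (!result.contains(word)) {   // contains() is itself O(n)
                result.add(word);
            }
        }
        return result;
    }

    // Better data structure: a LinkedHashSet rejects duplicates in O(1) per word -> O(n) overall,
    // while still preserving the original word order
    static List<String> dedupFast(List<String> words) {
        return new ArrayList<>(new LinkedHashSet<>(words));
    }
}
```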
Once you've figured out which portions of your code are taking the most time, and you can't figure out how best to fix it, I'd suggest narrowing down your question and posting another question here on StackOverflow.
Edit
As @oxbow_lakes so snidely put it, not all performance bottlenecks are to be found in the code's big-O characteristics. I certainly had no intention to imply that they were. Since the question was about "general methods" for optimizing, I tried to stick to general ideas rather than talking about this specific program. But here's how you can apply my advice to this specific program:
- See where your bottleneck is. There are a number of ways to profile your code, ranging from high-end, expensive profiling software to really hacky approaches (a minimal timing sketch follows the algorithm list below). Chances are, any of these methods will indicate that your program spends 99% of its time waiting for a response from Google.
- Focus on algorithms. Right now your algorithm is (roughly):
  - Parse the XML
  - Create a list of words
  - For each word
    - Ping Google for a spell check
  - Return results
Since most of your time is spent in the "ping Google" phase, an obvious way to fix this would be to avoid doing that step more times than necessary. For example:
- Parse the XML
- Create a list of words
- Send list of words to spelling service.
- Parse results from spelling service.
- Return results
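Sketched in Java with a hypothetical SpellingService interface (the real API isn't specified in the question), the difference between the two shapes looks roughly like this:

```java
import java.util.List;

// Hypothetical interface for whatever remote spell-check service is being used.
interface SpellingService {
    boolean isCorrect(String word);                    // one round trip per word
    List<String> findMisspelled(List<String> words);   // one round trip for the whole list
}

public class SpellCheckClient {

    // Original shape: one network round trip per word -> dominated by latency
    static void checkOneByOne(SpellingService service, List<String> words) {
        for (String word : words) {
            if (!service.isCorrect(word)) {
                System.out.println("Misspelled: " + word);
            }
        }
    }

    // Improved shape: a single round trip carrying the whole word list
    static void checkBatched(SpellingService service, List<String> words) {
        for (String word : service.findMisspelled(words)) {
            System.out.println("Misspelled: " + word);
        }
    }
}
```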
Of course, in this case, the biggest speed boost would probably come from using a spell checker that runs on the same machine, but that isn't always an option. For example, TinyMCE runs as a JavaScript program within the browser, and it can't afford to download the entire dictionary as part of the web page. So it packages up all the words into a distinct list and performs a single AJAX request to get a list of those words that aren't in the dictionary.