I'm sure someone will tell you to go back and fix the generator of the file. If that's possible, it certainly would be the best thing to do.
It sounds like you're planning to do this more or less by hand: looking for patterns of defects and fixing them up. For that, I'd use Notepad++, simply because I know it, it handles really big files, and it has good search/replace features, including regular expressions. It has room for improvement, though; in particular, its regular expression dialect is a bit weak if you're a regexpert.
Anything that tries to understand the XML well enough to do more than syntax highlighting is likely to be slow when dealing with a file like this.
The XML support in IntelliJ is shockingly bad, performance-wise, given its overall excellence.
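If the editor's regex dialect gets in the way, the same pattern-by-pattern fix-up can be scripted so the file is streamed line by line and never parsed as XML. Here is a minimal Python sketch of that idea. The defect it targets (a bare `&` that does not start an entity reference) is only an assumed example, and the names `BARE_AMP`, `fix_line`, and `fix_file` are made up for illustration; swap in whatever patterns you actually find in your file.

```python
import re

# Assumed example defect: an "&" that does not begin an entity reference
# such as &amp; or &#38; or &#x26;. Replace with your own patterns.
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def fix_line(line: str) -> str:
    """Apply the regex fix to a single line of text."""
    return BARE_AMP.sub("&amp;", line)


def fix_file(src_path: str, dst_path: str) -> int:
    """Stream src_path to dst_path one line at a time; return lines changed."""
    changed = 0
    # newline="" keeps the original line endings untouched;
    # surrogateescape lets undecodable bytes pass through unchanged.
    with open(src_path, encoding="utf-8", errors="surrogateescape", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", errors="surrogateescape", newline="") as dst:
        for line in src:
            fixed = fix_line(line)
            if fixed != line:
                changed += 1
            dst.write(fixed)
    return changed
```

Calling `fix_file("broken.xml", "fixed.xml")` (both file names are placeholders) writes a corrected copy and leaves the original alone, so you can diff the two before trusting the result. This only works for defects that fit on a single line; anything spanning lines needs a different approach.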