I have a very large file, over 100GB (many billions of lines), and I would like to run a two-level sort on it as quickly as possible on a Unix system with limited memory. This will be one step in a large Perl script, so I'd like to use Perl if possible.
So, how can I do this? My data looks like this:
A 129
B 192
A 388
D 148
D 911
A 117
...but for billions of lines. I need to sort first by letter, and then by number within each letter. Would it be easier to use the Unix sort, like...
sort -k1,2 myfile
Or can I do this all in Perl somehow? My system will have something like 16GB of RAM, but the file is about 100GB.
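To make the question concrete, here is a rough sketch of the kind of call I have in mind, tried on a tiny sample. As far as I can tell, -k1,2 defines a single key running from field 1 through field 2 and compares it as text, so I split it into two keys with the second one numeric. The -S and -T options assume GNU sort; the file names, the 64M buffer and the scratch directory are placeholders I made up for the test, not settings I know to be right for the real file.

```shell
#!/bin/sh
# Scratch directory for this small test (placeholder location).
work=$(mktemp -d)

# Tiny sample in the same format as my data: letter, space, number.
printf '%s\n' 'A 129' 'B 192' 'A 388' 'D 148' 'D 911' 'A 117' > "$work/sample.txt"

# -k1,1    first key: field 1 only (the letter)
# -k2,2n   second key: field 2 only, compared numerically
# -S 64M   cap on the in-memory buffer; sort spills to temp files beyond it
# -T DIR   where those temp files go (needs free space on the order of the input)
# -o FILE  write the result to FILE
# LC_ALL=C plain byte-wise comparison instead of locale collation
LC_ALL=C sort -k1,1 -k2,2n -S 64M -T "$work" -o "$work/sorted.txt" "$work/sample.txt"

cat "$work/sorted.txt"
```

On the sample this prints A 117, A 129, A 388, B 192, D 148, D 911, one per line. From the Perl script I would presumably just run the same command through system(), but I don't know whether that is the sensible route for 100GB.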
Thanks for any suggestions!