My aim is to build an inverted index file in Perl. I have one or more files with 10 million+ lines of the form:
document id: citing document 1, citing document 2, ...
Example:
document 56: document 12, document 45
document 117: document 12, document 22, document 99
and I want to create another file in the form:
document 12: document 117, document 56
...
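To make the target concrete, here is a minimal sketch of the transformation on the two example lines, holding the whole index in memory as a hash of arrays. The name `build_index` and the splitting on commas or semicolons are my own assumptions for illustration, and I do not know whether this would fit in memory for the real files.

```perl
use strict;
use warnings;

# Hypothetical helper: takes input lines, returns a hash reference mapping
# each document on the right-hand side to the documents that list it.
sub build_index {
    my @lines = @_;
    my %index;
    for my $line (@lines) {
        # Split "document 56: document 12, document 45" at the first colon.
        my ($source, $rest) = split /\s*:\s*/, $line, 2;
        next unless defined $rest;
        # Assumption: entries are separated by commas or semicolons.
        for my $target (split /\s*[,;]\s*/, $rest) {
            next if $target eq '';
            push @{ $index{$target} }, $source;
        }
    }
    return \%index;
}

my @lines = (
    'document 56: document 12, document 45',
    'document 117: document 12, document 22, document 99',
);

my $index = build_index(@lines);

# Print one line per document, with plain string sorting of the entries.
for my $doc (sort keys %$index) {
    print "$doc: ", join(', ', sort @{ $index->{$doc} }), "\n";
}
```

For the sample input, the first line printed is `document 12: document 117, document 56`, which is the output format I am after.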
Currently I am reading the source file(s) line by line and, for each citation, updating the index file (which has one line per document). But updating the index file for every citation (as described in the Perl FAQ entry "In Perl, how do I change, delete, or insert a line in a file, or append to the beginning of a file?") is very slow. Is there an alternative, more efficient approach? Thanks.