My suggestion is to read the file once and keep track of the position of each record as you go. Store those positions in a map so you can look them up later.
The first way to do this is to use your file as a DataInput, via the RandomAccessFile#readLine method.
RandomAccessFile raf = new RandomAccessFile("filename.txt", "r");
Map<String, Long> index = new HashMap<>();
Now, how is your data stored? If it is stored line by line, and the encoding conforms to the DataInput standards, then you can use:
long start = raf.getFilePointer();
String line = raf.readLine();
String key = extractKeyFromLine(line);
index.put(key, start);
Now, any time you need to go back and get the data:
long position = index.get(key);
raf.seek(position);
String line = raf.readLine();
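One detail worth guarding against: `index.get(key)` returns null for a key that was never indexed, and unboxing that into a `long` throws a NullPointerException. Below is a minimal sketch of a guarded lookup, assuming the index and the RandomAccessFile were built as shown above. The names `IndexedLookup` and `readLineFor` are made up for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Map;

final class IndexedLookup {
    /** Returns the line stored for key, or null if the key was never indexed. */
    static String readLineFor(RandomAccessFile raf, Map<String, Long> index, String key)
            throws IOException {
        // Keep the boxed Long so a missing key shows up as null instead of an NPE.
        Long position = index.get(key);
        if (position == null) {
            return null;
        }
        raf.seek(position);    // jump to the byte offset recorded while indexing
        return raf.readLine(); // read just that one line
    }
}
```

Returning null for an unknown key is just one option; throwing an exception or returning an Optional would work equally well.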
Here is a complete example:
package helloworld;

import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.HashMap;
import java.util.Map;

/**
 * Created by matt on 07/02/2017.
 */
public class IndexedFileAccess {
    static String getKey(String line){
        return line.split(":")[0];
    }

    public static void main(String[] args) throws IOException {
        Map<String, Long> index = new HashMap<>();
        RandomAccessFile file = new RandomAccessFile("junk.txt", "r");
        //populate index and read file.
        String s;
        do{
            long start = file.getFilePointer();
            s = file.readLine();
            if(s!=null){
                String key = getKey(s);
                index.put(key, start);
            }
        }while(s!=null);

        for(String key: index.keySet()){
            System.out.printf("key %s has a pos of %s\n", key, index.get(key));
            file.seek(index.get(key));
            System.out.println(file.readLine());
        }
        file.close();
    }
}
junk.txt contains:
dog:1, 2, 3
cat:4, 5, 6
zebra: p, z, t
Finally the output is:
key zebra has a pos of 24
zebra: p, z, t
key cat has a pos of 12
cat:4, 5, 6
key dog has a pos of 0
dog:1, 2, 3
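There are many caveats to this. For example, if you need a more robust encoding, then the first time you read the file you'll want to create a reader that can manage the encoding, and use your RandomAccessFile only as the underlying source of bytes. RandomAccessFile#readLine() maps each byte to one character, so it does not decode multi-byte encodings such as UTF-8. It also reads a single byte at a time without buffering, so it can struggle when the file or its lines are very large. In those cases you would have to devise your own strategy for extracting the key/data pair.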
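As a rough sketch of one such strategy, the index can be built by counting bytes through a buffered stream and decoding each line as UTF-8 yourself. This assumes UTF-8 text, `\n` line endings, and the same `key:data` layout as junk.txt. The class and method names (`Utf8LineIndex`, `build`, `readLineAt`) are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

final class Utf8LineIndex {
    /** Scans the file once and maps each key to the byte offset where its line starts. */
    static Map<String, Long> build(Path path) throws IOException {
        Map<String, Long> index = new HashMap<>();
        try (InputStream in = new BufferedInputStream(Files.newInputStream(path))) {
            ByteArrayOutputStream line = new ByteArrayOutputStream();
            long offset = 0;    // number of bytes consumed so far
            long lineStart = 0; // offset of the first byte of the current line
            int b;
            while ((b = in.read()) != -1) {
                offset++;
                if (b == '\n') { // byte 0x0A never occurs inside a multi-byte UTF-8 sequence
                    addEntry(index, line, lineStart);
                    line.reset();
                    lineStart = offset;
                } else {
                    line.write(b);
                }
            }
            addEntry(index, line, lineStart); // last line, if it has no trailing newline
        }
        return index;
    }

    private static void addEntry(Map<String, Long> index, ByteArrayOutputStream line, long start) {
        if (line.size() == 0) {
            return; // skip blank lines
        }
        String text = new String(line.toByteArray(), StandardCharsets.UTF_8);
        int colon = text.indexOf(':');
        index.put(colon < 0 ? text : text.substring(0, colon), start);
    }

    /** Seeks to a stored offset and decodes one line as UTF-8. */
    static String readLineAt(RandomAccessFile raf, long position) throws IOException {
        raf.seek(position);
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = raf.read()) != -1 && b != '\n') {
            line.write(b);
        }
        return new String(line.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The offsets stored here are byte offsets, which is what `RandomAccessFile#seek` expects. `readLineAt` still reads one byte at a time, so for very long records it may be worth reading into a larger buffer instead.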