1

What is the absolute fastest way to read and write strings from a file with Java?

I need to read a file of known format into a String[] — where each line is one item in the array — and then back to the file.

The reading, in particular, must be as fast as possible.

Is there a better way than just using a BufferedReader and reading line by line into an array?

Pops
Dan
  • 2
    http://stackoverflow.com/questions/326390/how-to-create-a-java-string-from-the-contents-of-a-file – OscarRyz Apr 04 '11 at 20:55
  • 2
    Reading and writing lines is not serialization. I'll correct the title. – Tom Anderson Apr 04 '11 at 20:56
  • @Tom, good idea, but I feel like we might as well go all the way and fix the body, too. @Dan, we're editing because "serialization" has a specific meaning in Java which doesn't match up with the way you were using it. – Pops Apr 04 '11 at 21:00
  • How are the characters encoded in the file? – seh Apr 04 '11 at 21:14

4 Answers

3

Consider using Google protobuf.

Amir Afghani
1

Just a crazy idea: you could write the length of each string in the file. Something like:

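import java.io.*;
import java.nio.charset.StandardCharsets;

String[] result = new String[10];   // assumes the file holds exactly 10 strings
// Each record is a 2-byte big-endian length followed by that many bytes of string data.
try (DataInputStream stream = new DataInputStream(
        new BufferedInputStream(new FileInputStream("file.bin")))) {
    byte[] buff = new byte[256];
    for (int i = 0; i < result.length; i++) {
        int n = stream.readUnsignedShort();   // string length (assuming all strings are less than 64K)
        if (buff.length < n) buff = new byte[n];
        stream.readFully(buff, 0, n);         // a plain read() may return fewer than n bytes
        result[i] = new String(buff, 0, n, StandardCharsets.UTF_8);   // UTF-8 is an assumption
    }
}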

This avoids having to check every input character for \n, as BufferedReader.readLine() must. I'm not sure, though, that it will actually be faster than readLine().
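The snippet above covers only the reading side. A minimal sketch of the full round trip might look like the following. The class name `LengthPrefixedStrings`, the 4-byte count header at the start of the file, and the choice of UTF-8 are all assumptions made for illustration; none of them come from the answer itself.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: stores strings as [int count] then, per string,
// [2-byte length][UTF-8 bytes]. The count header is an assumption.
final class LengthPrefixedStrings {

    static void write(Path file, String[] lines) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            out.writeInt(lines.length);
            for (String line : lines) {
                byte[] bytes = line.getBytes(StandardCharsets.UTF_8);
                if (bytes.length > 0xFFFF) {
                    throw new IOException("string too long for a 2-byte length prefix");
                }
                out.writeShort(bytes.length);   // low 16 bits, big-endian
                out.write(bytes);
            }
        }
    }

    static String[] read(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            String[] result = new String[in.readInt()];
            byte[] buff = new byte[256];
            for (int i = 0; i < result.length; i++) {
                int n = in.readUnsignedShort();
                if (buff.length < n) buff = new byte[n];   // grow the scratch buffer on demand
                in.readFully(buff, 0, n);                  // blocks until all n bytes are read
                result[i] = new String(buff, 0, n, StandardCharsets.UTF_8);
            }
            return result;
        }
    }
}
```

Unlike a line-based format, this layout lets strings contain newlines. Whether it beats readLine() would need to be measured on the real data.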

Goblin Alchemist
0

Use NIO and UTF-8 encoders/decoders which take advantage of your string statistics and also take advantage of JIT optimizations. I believe aalto out / in are doing this, and I am sure you can find others.
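As a rough illustration of the NIO route using only the JDK, the sketch below memory-maps the file and runs the standard UTF-8 `CharsetDecoder` over it. This is not how aalto works internally; it is a plain-JDK stand-in. The class name `NioLineReader` is hypothetical, and the code assumes a UTF-8 file smaller than 2 GB with `\n` line endings.

```java
import java.io.IOException;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: map the file, decode it in one pass, then split on '\n'.
final class NioLineReader {

    static String[] readLines(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // map() rejects sizes above Integer.MAX_VALUE, hence the 2 GB assumption.
            MappedByteBuffer bytes =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            // Throws CharacterCodingException if the input is not valid UTF-8.
            CharBuffer chars = StandardCharsets.UTF_8.newDecoder().decode(bytes);

            List<String> lines = new ArrayList<>();
            int start = 0;
            for (int i = 0; i < chars.length(); i++) {
                if (chars.charAt(i) == '\n') {
                    lines.add(chars.subSequence(start, i).toString());
                    start = i + 1;
                }
            }
            if (start < chars.length()) {   // last line without a trailing newline
                lines.add(chars.subSequence(start, chars.length()).toString());
            }
            return lines.toArray(new String[0]);
        }
    }
}
```

Windows-style `\r\n` endings would leave a trailing `\r` on each line here, so a real implementation would need to strip it.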

ThomasRS
0

Here would be my first pass, assuming that memory is not an issue (ha).

  1. Get the file size as it sits on disk (File.length).
  2. Allocate that size buffer.
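  3. Load the whole thing in one shot (InputStream.read(byte[]), repeated until the buffer is full, since a single call may return fewer bytes).
  4. Decode the bytes into a String and break that String into substrings entirely in memory.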
  5. Do Stuff (tm)
  6. Reverse above to save.
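The steps above can be sketched roughly as follows. The class name `WholeFileLines`, the UTF-8 encoding, and the use of `\n` as the separator on write are assumptions for illustration. `DataInputStream.readFully` stands in for the bare `InputStream.read(byte[])`, because a single `read` call is not guaranteed to fill the buffer.

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper following the numbered steps: size, allocate, load, split.
final class WholeFileLines {

    static String[] readLines(File file) throws IOException {
        long size = file.length();                          // step 1
        if (size > Integer.MAX_VALUE - 8) {
            throw new IOException("file too large for a single byte[]");
        }
        byte[] data = new byte[(int) size];                 // step 2
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            in.readFully(data);                             // step 3
        }
        String text = new String(data, StandardCharsets.UTF_8);
        // Step 4. split() drops trailing empty strings, so blank lines
        // at the end of the file are not preserved.
        return text.isEmpty() ? new String[0] : text.split("\r?\n");
    }

    static void writeLines(File file, String[] lines) throws IOException {
        // Step 6: join in memory, then write the bytes in one call.
        String text = String.join("\n", lines);
        Files.write(file.toPath(), text.getBytes(StandardCharsets.UTF_8));
    }
}
```

On Java 7 and later, `Files.readAllBytes(path)` would collapse steps 1 to 3 into a single call.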

Keep in mind that Java stores character data as UTF-16 internally, which means that your nice ASCII file is going to take twice its on-disk size in memory to account for the "expansion." For example, a 4,124-byte foo.txt will occupy at least 8,248 bytes in memory.

Everything else is going to be slower, because it has to go through some sort of buffering and wrapping (in particular, to cope with not having enough memory to hold the whole file).

Good luck!

Will Iverson