So I've got the following method for downloading files from Amazon S3, and for now it is working, but I anticipate that in the future I'll have to deal with considerably larger files - 2-3 gigabytes. So what performance optimizations would you recommend? Also, links regarding some GENERAL ideas about file I/O in Java, applicable not only to my case but in general, will be much appreciated.

public static void fetchFileFromS3(String filePath, String outPath) {
    int size = 5 * 1024 * 1024; // use 5 megabyte buffers
    byte[] buf = new byte[size];
    String[] result = getRealPath(filePath);
    S3Object object = Utilities.getS3Instance().getObject(new GetObjectRequest(result[0], result[1]));

    // try-with-resources closes both streams even if the copy fails
    try (BufferedInputStream bufIn = new BufferedInputStream(object.getObjectContent(), size);
         BufferedOutputStream bufOut = new BufferedOutputStream(new FileOutputStream(outPath), size)) {

        int bytesRead;
        while ((bytesRead = bufIn.read(buf)) != -1) {
            bufOut.write(buf, 0, bytesRead);
        }

        System.out.println("Finished downloading file");

    } catch (IOException ex) {
        Logger.getLogger(Utilities.class.getName()).log(Level.SEVERE, null, ex);
    }
}
  • Can you tell whether download speed or disk speed will be the bottleneck? In many cases disk speed is far less of an issue than download speed, so there wouldn't be much to do other than getting more bandwidth. – user unknown Mar 06 '12 at 21:48
  • Well, since the link is 10GE and the disk is an enormous disk array, neither of those is a bottleneck, at least when there is no contention. In this case I was more curious not to introduce bottlenecks in my code. – LordDoskias Mar 06 '12 at 23:55

1 Answer

I think looking into the new-ish Java NIO APIs makes sense, even though there's some disagreement about whether they're more efficient for large files.

For example, in the answer to this question, using chunked memory-mapping with NIO seems like it might do the trick.
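
As a rough sketch of what that could look like here, the copy loop from the question could be rewritten around NIO channels, letting FileChannel.transferFrom pull the S3 stream into the output file in fixed-size chunks instead of copying through an explicit byte[] buffer (I'm using transferFrom rather than a literal MappedByteBuffer, since the source is a network stream and not a local file). This assumes Java 7 for try-with-resources, reuses the getRealPath and Utilities.getS3Instance() helpers from your question, and the 16 MB chunk size is just a starting point to tune:

import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.util.logging.Level;
import java.util.logging.Logger;

public static void fetchFileFromS3Nio(String filePath, String outPath) {
    long chunkSize = 16L * 1024 * 1024; // assumed 16 MB chunks; worth benchmarking
    String[] result = getRealPath(filePath); // helper from the question
    S3Object object = Utilities.getS3Instance().getObject(new GetObjectRequest(result[0], result[1]));

    // Wrap the S3 stream in a channel and let the destination FileChannel
    // pull from it in chunks; try-with-resources closes both on exit.
    try (ReadableByteChannel src = Channels.newChannel(object.getObjectContent());
         FileChannel dest = new FileOutputStream(outPath).getChannel()) {

        long position = 0;
        long transferred;
        // transferFrom may move fewer bytes than requested and returns 0
        // once the source stream is exhausted, so loop until then.
        while ((transferred = dest.transferFrom(src, position, chunkSize)) > 0) {
            position += transferred;
        }
    } catch (IOException ex) {
        Logger.getLogger(Utilities.class.getName()).log(Level.SEVERE, null, ex);
    }
}

Whether this actually beats a plain BufferedInputStream copy with a large buffer depends on the JVM, the OS and the disk array, so I'd benchmark both on the real 10GE link before committing to either.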
