41

I wrote a method that takes a File and a String, and replaces the file's contents with that string.

This is what I made:

public static void Save(File file, String textToSave) {

    file.delete();
    try {
        BufferedWriter out = new BufferedWriter(new FileWriter(file));
        out.write(textToSave);
        out.close();
    } catch (IOException e) {
    }
}

However, it is painfully slow; it sometimes takes over a minute.

How can I write large files with tens of thousands to maybe up to a million characters in them?

user85421
    Deleting the file is unnecessary. You're overwriting it. – Carles Barrobés Jan 01 '11 at 23:21
    How much of the time is CPU time and how much I/O ("system") time? For large files creating the huge `textToSave` string might dominate the time. – Raedwald Jan 01 '11 at 23:34
    Not directly related to your question: You might consider restructuring the out.close() statement so that it can be done in a finally block. In case an error is thrown on write, it would still close. – Rocky Madden Jan 02 '11 at 00:36
  • Total random long shot: if you're using XFS under Linux, well, **stop doing that** unless you absolutely know it's what you want/need. – Pointy Jan 02 '11 at 01:12
    Don't ignore your IOexception, that can lead to your program failing in mysterious ways – Peter Lawrey Jan 02 '11 at 10:34
    Rather than deleting the file before writing, or overriding it directly, I would recommend writing to a temporary file, then renaming it over the old file afterwards. That means you don't risk replacing your old file with something corrupt if the IO fails halfway through. – Tom Anderson Jan 02 '11 at 12:08
    How can this question get 21 upvotes when it is clearly wrong? OP even admits that it is wrong - the actual I/O is _not_ causing the long wait. – user949300 Aug 18 '13 at 18:04
  • The code you have posted does not take over over a minute. Your problem lies elsewhere. – user207421 Mar 20 '16 at 22:13
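Tom Anderson's write-to-a-temp-file-then-rename suggestion from the comments could look roughly like this (a hypothetical sketch; the `AtomicSave` class and demo file name are made up for illustration):

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

public class AtomicSave {
    // Write to a temp file in the same directory, then move it over the
    // target, so a failed write never leaves the target half-written.
    public static void save(File file, String textToSave) throws IOException {
        File tmp = File.createTempFile(file.getName(), ".tmp",
                file.getAbsoluteFile().getParentFile());
        try (FileWriter out = new FileWriter(tmp)) {
            out.write(textToSave);
        }
        Files.move(tmp.toPath(), file.toPath(), StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        File f = new File("atomic-demo.txt");
        save(f, "safe contents");
        System.out.println(new String(Files.readAllBytes(f.toPath())));
        f.delete();
    }
}
```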

7 Answers

27

Make sure you allocate a large enough buffer:

BufferedWriter out = new BufferedWriter(new FileWriter(file), 32768);

What sort of OS are you running on? That can make a big difference too. However, taking a minute to write out a file of less-than-enormous size sounds like a system problem. On Linux or other *ix systems, you can use things like strace to see if the JVM is making lots of unnecessary system calls. (A very long time ago, Java I/O was pretty dumb and would make insane numbers of low-level write() system calls if you weren't careful, but when I say "a long time ago" I mean 1998 or so.)

edit — note that the situation of a Java program writing a simple file in a simple way, and yet being really slow, is an inherently odd one. Can you tell if the CPU is heavily loaded while the file is being written? It shouldn't be; there should be almost no CPU load from such a thing.
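As a commenter on the question suggests, the `Save` method can also be restructured so the writer is always closed; a hypothetical rework with an explicit buffer size and try-with-resources (not the OP's original code):

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

public class SaveExample {
    // Rework of the question's Save method: no delete() (FileWriter truncates
    // the file by default) and try-with-resources, so the writer is closed
    // even if write() throws.
    public static void save(File file, String textToSave) throws IOException {
        try (BufferedWriter out = new BufferedWriter(new FileWriter(file), 32768)) {
            out.write(textToSave);
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("save-demo", ".txt");
        save(f, "hello");
        System.out.println(f.length()); // length of "hello" in bytes
        f.delete();
    }
}
```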

Pointy
  • Agreed. He might even be able to know the buffer size needed in advance since he is taking the String as param: textToSave.getBytes().length – Rocky Madden Jan 02 '11 at 00:41
  • @Rocky Madden yea that's a real good point. However dumping a string through the Java IO libraries should be pretty fast almost any way you do it. – Pointy Jan 02 '11 at 01:10
  • getBytes() can be very expensive just to tune a buffer. I suggest you just make it 256K and not worry about it. – Peter Lawrey Jan 02 '11 at 10:33
-1 because if you're writing a single huge string, you don't even need a character buffer - you could pass it to the FileWriter directly, and it would process it in a single batch. It might be worth having a buffer at the byte level (using OutputStreamWriter + BufferedOutputStream + FileOutputStream), because character encoding is done with a buffer whose size you don't control, and which I believe is quite small. But not at the character level. – Tom Anderson Jan 02 '11 at 12:01
@Tom, @Pointy - I think the buffer size will have no effect (until it's larger than the text being output). Documentation of `BufferedWriter` states: “If the requested length is at least as large as the buffer, however, then this method will flush the buffer and write the characters directly to the underlying stream.” – user85421 Jan 02 '11 at 12:45
  • @Tom, would you agree that regardless of these relatively minor adjustments, that code as presented in the original question should be able to dump out something less than a megabyte or two *much* faster than what's claimed? – Pointy Jan 02 '11 at 13:00
@Pointy: yes, absolutely. I just tried running Stuart's code, and I can save a million-character string in 23-64 ms (from 100 runs: median 33 ms, 95th centile 41 ms) on a crappy netbook. – Tom Anderson Jan 03 '11 at 16:59
    Good answer. It turned out the reason it was writing so slow was actually not because of the writing method, but because I used such a long `String`. It was the computing of the `String` that took so long, and the writing didn't take as much time. My solution was to write the file in pieces, not all at once, so the `String` to write didn't become huge. Using your ideas helped, as well. –  Jan 08 '11 at 22:53
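The piecewise fix the OP describes might look roughly like this (a hypothetical sketch; `computePiece` stands in for whatever code produced the text):

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class PiecewiseWrite {
    public static void main(String[] args) throws IOException {
        // Write each piece as it is computed, instead of first concatenating
        // everything into one huge String.
        try (BufferedWriter out = new BufferedWriter(new FileWriter("pieces.txt"))) {
            for (int i = 0; i < 1000; i++) {
                out.write(computePiece(i));
            }
        }
        System.out.println(Files.readAllLines(Paths.get("pieces.txt")).size());
    }

    // Hypothetical stand-in for the expensive per-piece computation.
    private static String computePiece(int i) {
        return "line " + i + System.lineSeparator();
    }
}
```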
25

A simple test for you:

char[] chars = new char[100*1024*1024];
Arrays.fill(chars, 'A');
String text = new String(chars);
long start = System.nanoTime();
BufferedWriter bw = new BufferedWriter(new FileWriter("/tmp/a.txt"));
bw.write(text);
bw.close();
long time = System.nanoTime() - start;
System.out.println("Wrote " + chars.length*1000L/time+" MB/s.");

Prints

Wrote 135 MB/s.
Peter Lawrey
5

You could look into Java's NIO capabilities. It may support what you want to do.

Java NIO FileChannel versus FileOutputstream performance / usefulness
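A minimal sketch of writing a string with a `FileChannel` (the file name and charset are my own choices for illustration; `Path.of` and `Files.readString` need Java 11+):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class NioWrite {
    public static void main(String[] args) throws IOException {
        String textToSave = "hello nio";
        Path path = Path.of("nio-demo.txt");
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(textToSave.getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                ch.write(buf); // a single write() call may not drain the buffer
            }
        }
        System.out.println(Files.readString(path));
        Files.delete(path);
    }
}
```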

Wouter Lievens
3

Try using memory mapped files:

byte[] bytes = textToSave.getBytes();
FileChannel rwChannel = new RandomAccessFile("textfile.txt", "rw").getChannel();
// Map using the encoded byte length, not the character count:
// getBytes() can yield more bytes than textToSave.length() for non-ASCII text.
ByteBuffer wrBuf = rwChannel.map(FileChannel.MapMode.READ_WRITE, 0, bytes.length);

wrBuf.put(bytes);

rwChannel.close();
Deepak Agarwal
2

I created two approaches to writing a big file and ran the program on a Windows 7 64-bit machine with 8 GB RAM and JDK 8; the results are below.
In both cases a 180 MB file is created containing the numbers 1 to 20 million (2 crore), one per line.

The Java program's memory use grows gradually to about 600 MB.

First output

Approach = approach-1 (Using FileWriter)
Completed file writing in milli seconds = 4521 milli seconds.

Second output

Approach = approach-2 (Using FileChannel and ByteBuffer)
Completed file writing in milli seconds = 3590 milli seconds.

One observation: in approach #2 I am calculating the position (the pos variable). If I comment that out, each mapping starts at position 0, so only the last chunk is visible because the earlier chunks are overwritten, but the time drops to nearly 2000 milliseconds.

The code:

import java.io.FileWriter;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.concurrent.TimeUnit;

public class TestLargeFile {

    public static void main(String[] args) {
        writeBigFile();
    }

    private static void writeBigFile() {
        System.out.println("--------writeBigFile-----------");
        long nanoTime = System.nanoTime();
        String fn = "big-file.txt";
        boolean approach1 = false;
        System.out.println("Approach = " + (approach1 ? "approach-1" : "approach-2"));
        int numLines = 20_000_000;
        try {
            if (approach1) {
                //Approach 1 -- for 2 crore lines takes 4.5 seconds with 180 mb file size
                approach1(fn, numLines);
            } else {
                //Approach 2 -- for 2 crore lines takes nearly 2 to 2.5 seconds with 180 mb file size
                approach2(fn, numLines);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

        System.out.println("Completed file writing in milli seconds = " + TimeUnit.MILLISECONDS.convert((System.nanoTime() - nanoTime), TimeUnit.NANOSECONDS));
    }

    private static void approach2(String fn, int numLines) throws IOException {
        StringBuilder sb = new StringBuilder();
        FileChannel rwChannel = new RandomAccessFile(fn, "rw").getChannel();
        ByteBuffer wrBuf;

        int pos = 0;
        for (int i = 1; i <= numLines; i++) {
            sb.append(i).append(System.lineSeparator());
            if (i % 100000 == 0) {
                wrBuf = rwChannel.map(FileChannel.MapMode.READ_WRITE, pos, sb.length());
                pos += sb.length();
                wrBuf.put(sb.toString().getBytes());
                sb = new StringBuilder();
            }
        }
        if (sb.length() > 0) {
            wrBuf = rwChannel.map(FileChannel.MapMode.READ_WRITE, pos, sb.length());
            wrBuf.put(sb.toString().getBytes());
        }
        rwChannel.close();
    }

    private static void approach1(String fn, int numLines) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= numLines; i++) {
            sb.append(i).append(System.lineSeparator());
        }
        FileWriter fileWriter = new FileWriter(fn);
        fileWriter.write(sb.toString());
        fileWriter.flush();
        fileWriter.close();
    }
}
shaILU
0

This solution creates a 20 GB file by writing the string "ABCD...89\n" 10 × 200 million times using Java NIO. Write performance on a MacBook Pro (14-inch, 2021, M1 Pro, SSD AP1024R) is around 5.1 GB/s.

The code:

public static void main(String[] args) throws IOException {
    long number_of_lines = 1024 * 1024 * 200;
    int repeats = 10;
    byte[] buffer = "ABCD...89\n".getBytes();
    FileChannel rwChannel = FileChannel.open(Path.of("textfile.txt"), StandardOpenOption.CREATE, StandardOpenOption.WRITE);

    // prepare buffer
    ByteBuffer wrBuf = ByteBuffer.allocate(buffer.length * (int) number_of_lines);
    for (int i = 0; i < number_of_lines; i++)
        wrBuf.put(buffer);

    wrBuf.flip(); // switch the buffer from filling to draining before the first write

    long t1 = System.currentTimeMillis();

    for (int i = 0; i < repeats; i++) {
        while (wrBuf.hasRemaining()) {
            rwChannel.write(wrBuf); // a single write() may not drain the whole buffer
        }
        wrBuf.rewind();
    }

    long t2 = System.currentTimeMillis();
    rwChannel.close();

    System.out.println("Time: " + (t2 - t1));
    System.out.println("Speed: " + ((double) number_of_lines * buffer.length * repeats / (1024 * 1024)) / ((t2 - t1) / 1000.0) + " MB/s");
}
-3

In Java, the BufferedWriter is very slow: Use the native methods directly, and call them as little as possible (give them as much data per call as you can).

    try {
        FileOutputStream out = new FileOutputStream(file);
        out.write(content.getBytes());
        out.close();
    } catch (Throwable e) {
        D.error(e); // D is the author's own logging class
    }

Also, deleting the file can take a while (maybe it is being copied to the recycle bin first). Just overwrite the file, as in the code above.

Kyle Lahnakoski
    I have not had experience with BufferedWriter being "very slow" at all, and I've been writing server-side Java code for a really long time. I don't think it's what I'd use if I had some very serious mega-throughput application, maybe, but it's not that bad; how could it be? – Pointy Jan 02 '11 at 01:11
    likewise, I have never seen a call to File#delete() move a file to a recycle bin. Delete means delete. – Kevin Day Jan 02 '11 at 04:07
  • Pointy: Yes, it probably was "a long time ago" that I traced the Java file writes through the MS debugger to see the inane number of system calls it was making on my machine. – Kyle Lahnakoski Feb 27 '13 at 03:47