
I need to copy big files (several GB) into another file (the container), and I was wondering about performance and RAM use.

Reading the entire source file like the following:

    RandomAccessFile f = new RandomAccessFile(origin, "r");
    byte[] originalBytes = new byte[(int) f.length()];
    f.readFully(originalBytes);

And later on, copy everything into the container like this:

    RandomAccessFile f2 = new RandomAccessFile(dest, "rw");
    f2.seek(offset);
    f2.write(originalBytes, 0, originalBytes.length);

does everything in memory, correct? So copying big files can put pressure on the heap and result in an OutOfMemoryError? (The (int) cast on the length alone already fails for files over 2 GB.)

Is it better to read the original file in chunks instead of entirely? In that case, how should I proceed? Thank you in advance.
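I suppose I would have to do something like the following chunked loop (the 64 KB buffer size and the names are just illustrative), but I'm not sure it's the right approach:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

public class ChunkedCopy {
    // Copy 'origin' into 'container' starting at 'offset', one 64 KB buffer
    // at a time, so memory use stays constant no matter how big the source is.
    public static void copyInto(File origin, File container, long offset) throws IOException {
        byte[] buffer = new byte[64 * 1024];
        try (FileInputStream in = new FileInputStream(origin);
             RandomAccessFile out = new RandomAccessFile(container, "rw")) {
            out.seek(offset);
            int len;
            while ((len = in.read(buffer)) > 0) {
                out.write(buffer, 0, len); // write only the bytes actually read
            }
        }
    }
}
```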

EDIT:

Following mehdi maick's answer I finally found the solution: I can use RandomAccessFile as the destination, as I wanted, because RandomAccessFile has a "getChannel" method that returns a FileChannel. I can pass that channel to the following method, which copies the file (32 KB at a time) to the desired position in the destination:

    public static void copyFile(File sourceFile, FileChannel destination, long position) throws IOException {
        try (FileChannel source = new FileInputStream(sourceFile).getChannel()) {
            destination.position(position);
            long currentPosition = 0;
            while (currentPosition < sourceFile.length()) {
                currentPosition += source.transferTo(currentPosition, 32768, destination);
            }
        }
    }
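For example, the copy can be driven from a RandomAccessFile opened on the container like this (the copy loop is repeated below, with a long offset, only so the sketch compiles on its own; file names are placeholders):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;

public class ContainerWriter {
    // Same 32 KB transferTo loop as above, with a long offset so the
    // container can grow past 2 GB.
    public static void copyFile(File sourceFile, FileChannel destination, long position) throws IOException {
        try (FileChannel source = new FileInputStream(sourceFile).getChannel()) {
            destination.position(position);
            long copied = 0;
            while (copied < sourceFile.length()) {
                copied += source.transferTo(copied, 32768, destination);
            }
        }
    }

    // Open the container with RandomAccessFile, as in the question,
    // and hand its channel to copyFile.
    public static void writeAt(File source, File container, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(container, "rw")) {
            copyFile(source, raf.getChannel(), offset);
        }
    }
}
```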
navy1978
  • Why don't you use a byte buffer instead and read the original file in chunks? Performance-wise it is great. – Alexander Petrov Jan 31 '19 at 22:10
  • Read in blocks/chunks, e.g. 64k at a time, using `FileInputStream` and `FileOutputStream` – Andreas Jan 31 '19 at 22:10
  • @AlexandarPetrov can you please provide an example, considering that the destination file has to be written with RandomAccessFile ? Thank you. – navy1978 Jan 31 '19 at 22:13
  • @Andreas same thing for you ;) – navy1978 Jan 31 '19 at 22:13
  • Why does destination file **have to** be written with `RandomAccessFile`? Aren't you simply concatenating existing files into a combined file? – Andreas Jan 31 '19 at 22:14
  • @Andreas Because the container contains a header and a concatenation of different files, so I need to seek to the correct position and write from there... – navy1978 Jan 31 '19 at 22:15
  • The concatenation is not really a concatenation (one file after the other); the header of the container contains the offset where each file starts... – navy1978 Jan 31 '19 at 22:21

2 Answers

4

Try using an NIO FileChannel:


    public void copyFile(String src, String target) throws IOException {
        final String fileName = getFileName(src);
        try (FileChannel from = FileChannel.open(Paths.get(src), StandardOpenOption.READ);
                FileChannel to = FileChannel.open(Paths.get(target + "/" + fileName), StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            transfer(from, to, 0L, from.size());
        }
    }

    private String getFileName(final String src) {
        File file = new File(src);
        if (file.isFile()) {
            return file.getName();
        } else {
            throw new RuntimeException("src is not a valid file");
        }
    }

    private void transfer(final FileChannel from, final FileChannel to, long position, long size) throws IOException {
        while (position < size) {
            // Constants.TRANSFER_MAX_SIZE is the chunk size, e.g. 32 * 1024
            position += from.transferTo(position, Constants.TRANSFER_MAX_SIZE, to);
        }
    }

This creates read and write channels and transfers the data efficiently from the first to the latter.

mehdi maick
  • The container (destination file) contains a concatenation of files; I need to seek to the correct offset in the container and start to write from there. That's why I asked in the question for an example in which we use RandomAccessFile to write into the destination (the container)... Can you please adapt your example using RandomAccessFile for the destination? – navy1978 Jan 31 '19 at 23:01
  • the `FileChannel` provides a `position(long position)` method to seek to the exact required position. – mehdi maick Jan 31 '19 at 23:04
  • I hadn't seen it before... I will upvote you because I cannot test it now; if it's OK I will accept your answer... Thank you for the moment ;) – navy1978 Jan 31 '19 at 23:10
  • Thank you, I accepted your answer and edited my question with the solution I found... ;) – navy1978 Feb 01 '19 at 21:45
  • glad i could help ;) – mehdi maick Feb 01 '19 at 21:55
-1

Read in blocks/chunks, e.g. 64k at a time, using FileInputStream and FileOutputStream.
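A minimal sketch of that loop (the 64 KB block size and the names are illustrative):

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class BlockCopy {
    // Copy src to dst in 64 KB blocks. Every write passes the number of
    // bytes actually read, so the last, possibly shorter, block is handled
    // automatically.
    public static void copy(String src, String dst) throws IOException {
        byte[] buffer = new byte[64 * 1024];
        try (FileInputStream in = new FileInputStream(src);
             FileOutputStream out = new FileOutputStream(dst)) {
            int len;
            while ((len = in.read(buffer)) > 0) {
                out.write(buffer, 0, len);
            }
        }
    }
}
```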

If you need to boost performance, you might try using threads, one thread for reading and another thread for writing.

You might also be able to boost performance using direct NIO buffers.
See e.g. A simple rule of when I should use direct buffers with Java NIO for network I/O?

Andreas
  • What happens with the last 64k chunk if it only contains 32K bytes? Will it write the whole 64k (the last 32k as zeros)? – navy1978 Jan 31 '19 at 22:49
  • @navy1978 Why would it write 64k if you only read 32k into the buffer? If it did, you wrote the code wrong. – Andreas Jan 31 '19 at 22:51
  • I'm not following you; you said to read 64K bytes at a time, not me... I'm asking you what happens with the last chunk if it contains only 32K (the file length may indeed not be a multiple of 64K). Does that make sense to you? – navy1978 Jan 31 '19 at 22:53
  • @navy1978 You have a buffer, e.g. 64k in size, you ask to read bytes from the input file into the buffer, then you turn around and write the **actual** number of bytes read to the output file. Why on earth would you think that the code writing the bytes doesn't *know* if the buffer is full? – Andreas Jan 31 '19 at 23:15
  • I’m asking because I don’t know how it works: in RandomAccessFile one of the “write” methods takes an array of bytes, so I was wondering whether, passing a 64 KB array the last time, it would write the full array or only the part filled in... that was my doubt... – navy1978 Jan 31 '19 at 23:21
  • @navy1978 The [`write(...)`](https://docs.oracle.com/javase/8/docs/api/java/io/RandomAccessFile.html#i25) method is overloaded. There are 3 variants of it. – Andreas Jan 31 '19 at 23:23
  • Are you reading me? Read again you will find that I wrote one of the methods “write” do you want to argue without any reason or to help? – navy1978 Jan 31 '19 at 23:24
  • @navy1978 Are you reading *me*? There are **THREE** overloads of the method. The *second* overload takes a `len` argument specifying how many bytes from the buffer array to write. If the buffer isn't full, you use that method to write only part of the buffer. Which part of that is difficult to understand? Did you look at the 3 methods I linked to? Did you maybe think I mentioned and linked to the methods for a reason? – Andreas Jan 31 '19 at 23:29
  • I’m on the phone now and it’s more difficult to follow you, maybe it’s my fault. Do you mean that for the last chunk I have to check how many of its bytes to use? Wouldn’t it have been easier to provide an example? – navy1978 Jan 31 '19 at 23:36
  • `while ((len = read(buffer)) > 0) { write(buffer, 0, len); }` --- You write the number of bytes **actually** read, as [**I've already said**](https://stackoverflow.com/questions/54469628/write-big-files-using-randomaccessfile-class/54470004?noredirect=1#comment95749069_54470004)!!! – Andreas Jan 31 '19 at 23:38