9

I have the code below, which simply reads all the files from a folder. There are 20,000 files in this folder. The code works fine on a local folder (d:/files), but fails on a network path (//robot/files) after reading about 1,000 - 2,000 files.

Update: the folders are copies of each other.

What causes this problem, and how can I fix it?

package cef_debug;

import java.io.*;

public class Main {

    public static void main(String[] args) throws Throwable {
        String folder = args[0];
        File[] files = (new File(folder)).listFiles();
        String line;
        for (int i = 0; i < files.length; i++) {
            BufferedReader br = new BufferedReader(new FileReader(files[i]));
            while ((line = br.readLine()) != null) {
            }
            br.close();
        }
    }
}

I get the following error when reading from a network path (//robot/files):

Exception in thread "main" java.io.IOException: Too many open files
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:106)
        at java.io.FileReader.<init>(FileReader.java:55)
        at cef_debug.Main.main(Main.java:12)
Java Result: 1

Line 12 is the line:

BufferedReader br = new BufferedReader(new FileReader(files[i]));
Serg
  • Instead of just throwing away exceptions and leaving the connection in an unknown state, I would add try/catch/finally to make sure all open connections are closed. – kosa Nov 15 '12 at 16:03
  • Inside your loop you should be closing all resources. I think you are leaving the FileReader open because you are only closing the BufferedReader... you should use try/catch/finally and close those resources in the finally block. – chrislhardin Nov 15 '12 at 16:05
  • @chrislhardin according to http://stackoverflow.com/questions/1388602/do-i-need-to-close-both-filereader-and-bufferedreader this is not true. Good question, imho. – b.buchhold Nov 15 '12 at 16:08
  • @chrislhardin, here is the description of close() of BufferedReader from the documentation: "Closes the stream and releases any system resources associated with it." – Serg Nov 15 '12 at 16:12
    What happens if you change your call to `File[] files = (new File(folder)).listFiles();` to `String[] files = (new File(folder)).list();`? Does it fix your problem? – Laf Nov 15 '12 at 16:23
    Add a small delay after `br.close();` and see what happens. – Piotr Praszmo Nov 15 '12 at 16:28
  • @Laf, I checked it, and get exactly the same error on network path only. – Serg Nov 15 '12 at 16:35
  • @Banthar, I added `Thread.sleep(10);` after `br.close();`. No effect. BTW, the two servers are standing one on the other and connected with 1Gb. – Serg Nov 15 '12 at 16:40
    Try [Process Explorer](http://technet.microsoft.com/pl-pl/sysinternals/bb896653.aspx) or [Process Monitor](http://technet.microsoft.com/pl-pl/sysinternals/bb896645.aspx). Check which files are open. – Piotr Praszmo Nov 15 '12 at 16:50
  • @Serg can you add your Windows version + Java version? – eis Nov 15 '12 at 18:17
  • This looks definitely like a bug in either Java or the operating systems' network code. I'd try to debug into the `close()` and see what happens. Anyway, you should make sure you have the latest Java version installed. – Axel Nov 15 '12 at 18:26
  • @eis, on both servers the Java is jdk1.6.0_23 and the OS is Windows 7. – Serg Nov 15 '12 at 19:33
  • I was unable to reproduce this with jdk7u9. Maybe try this version? – Piotr Praszmo Nov 15 '12 at 20:48

2 Answers

3

There is a documented bug where, on some Java versions, certain ways of opening files hit a limit of 2035 open files. It is possible that you've just hit that.

From the comments:

To clarify the issue, on a win32 system there are three ways to open a file:

1: Using the Win32 API

2: Using the MFC class framework lib.

3: Using the C library API (open() and fopen())

Options 1 and 2 have practically no limitation on the number of files that can be opened. The third method is restricted (for a reason not known to me) to opening only approx. 2035 files. That is why the MS JVM is able to open a practically unlimited number of files, but the SUN JVM fails after 2035 files (my guess is that it uses the 3rd method to open files).

Now, this is an old issue that was fixed quite some time ago, but it is possible that the same function is used for network access, where the bug could still exist.

Even without closing the handles or the streams, Windows should be able to open more than 10,000 file handles and keep them open, as demonstrated by this test code from the bug comments:

import java.util.*;
import java.io.*;

// if run with "java maxfiles 10000", will create 10k files in the current folder
public class maxfiles
{
    static int count = 0;
    static List<PrintStream> files = new ArrayList<PrintStream>();

    public static void main(String args[]) throws Exception
    {
        for (int n = 0; n < Integer.parseInt(args[0]); n++) {
            File f = new File("file" + count++);
            // save the reference, so the stream is not gc'ed (and never closed)
            files.add(new PrintStream(new FileOutputStream(f)));
        }
        // all the streams are still open at this point; write to each of them
        for (PrintStream out : files) {
            out.println("foo");
            out.flush();
        }
        System.out.println("current files open: " + files.size());
    } //~main
}

You could test running it on the network share, and report a bug if it fails. You could also try a different JDK. At least in the OpenJDK source, I couldn't see any calls other than WinAPI calls, so I'd check whether the behaviour is the same.

eis
  • Also, I bet the new Java 7 file API might perform better. – djangofan Nov 15 '12 at 18:40
  • @eis, actually as you see in the code, I don't open more than one file simultaneously (and I don't wish to). For some reason I receive the error as if I do. I suspect that `close()` returns before it actually closes the file, so the `new BufferedReader()` begins before the last file is closed. – Serg Nov 15 '12 at 19:42
  • @Serg It is possible that it doesn't matter whether it's simultaneous or not. But I would suggest you test a) a different JDK and b) moving the read into a separate method that takes the file name as a parameter, so you can isolate whether it's a GC issue (it's easier for the JVM to GC the reader if it is only defined in a separate method); see the sketch below. – eis Nov 15 '12 at 19:57
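
A minimal sketch of the refactoring suggested in the comment above, keeping the same command-line usage as the original program. Each file is read in its own method, so the reader reference goes out of scope as soon as the method returns; the class name MainIsolated and the method readFile are illustrative, not from the original post.

package cef_debug;

import java.io.*;

public class MainIsolated {

    public static void main(String[] args) throws IOException {
        File[] files = new File(args[0]).listFiles();
        for (int i = 0; i < files.length; i++) {
            readFile(files[i]);
        }
    }

    // Reading is isolated here: once this method returns, the reader
    // reference is unreachable, which makes it easier for the JVM to collect it.
    private static void readFile(File file) throws IOException {
        BufferedReader br = null;
        try {
            br = new BufferedReader(new FileReader(file));
            String line;
            while ((line = br.readLine()) != null) {
                // read and discard each line, as in the original test
            }
        } finally {
            if (br != null) {
                br.close();
            }
        }
    }
}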
1

Try closing the reader in a finally block, so it is closed even if readLine() throws:

package cef_debug;

import java.io.*;

public class Main {

    public static void main(String[] args) throws Throwable {
        String folder = args[0];
        File[] files = (new File(folder)).listFiles();
        String line;
        for (int i = 0; i < files.length; i++) {
            BufferedReader br = null;
            try {
                br = new BufferedReader(new FileReader(files[i]));
                while ((line = br.readLine()) != null) {
                }
            } finally {
                try {
                    if (br != null) {
                        br.close();
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
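
As a side note, if moving to Java 7 or later is an option (the question uses jdk1.6.0_23), the same loop can be written with try-with-resources, which closes the reader automatically even when an exception is thrown. This is a sketch of that variant, not part of the original answer:

package cef_debug;

import java.io.*;

public class Main {

    public static void main(String[] args) throws IOException {
        File[] files = new File(args[0]).listFiles();
        for (File file : files) {
            // try-with-resources closes br automatically, even if readLine() throws
            try (BufferedReader br = new BufferedReader(new FileReader(file))) {
                while (br.readLine() != null) {
                    // discard the line; we only care about opening and closing files
                }
            }
        }
    }
}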
codeghost