
I'm writing a genetic algorithm that needs to read and write a lot of files. The fitness test for the GA involves invoking a program called gradif, which takes a file as input and produces a file as output.

Everything works except when I make the population size and/or the total number of generations of the genetic algorithm too large. Then, after a certain number of generations, I start getting this: java.io.FileNotFoundException: testfiles/GradifOut29 (Too many open files). (I get it repeatedly for many different files; the index 29 was just the one that came up first the last time I ran it.) It's strange because I'm not getting the error after the first or second generation, but only after a significant number of generations, which would suggest that each generation opens more files that it doesn't close. But as far as I can tell I'm closing all of the files.

The way the code is set up is the main() function is in the Population class, and the Population class contains an array of Individuals. Here's my code:

Initial creation of input files (they're random access so that I could reuse the same file across multiple generations)

files = new RandomAccessFile[popSize];

for(int i=0; i<popSize; i++){
    files[i] = new RandomAccessFile("testfiles/GradifIn"+i, "rw");
}

At the end of the entire program:

for(int i=0; i<individuals.length; i++){
    files[i].close();
}

Inside the Individual's fitness test:

FileInputStream fin = new FileInputStream("testfiles/GradifIn"+index);
FileOutputStream fout = new FileOutputStream("testfiles/GradifOut"+index);
Process process = Runtime.getRuntime().exec ("./gradif");
OutputStream stdin = process.getOutputStream();
InputStream stdout = process.getInputStream();

Then, later....

try{
    fin.close();
    fout.close();
    stdin.close();
    stdout.close();
    process.getErrorStream().close();
}catch (IOException ioe){
    ioe.printStackTrace();
}

Afterwards, I append an 'END' to the files to make parsing them easier.

FileWriter writer = new FileWriter("testfiles/GradifOut"+index, true);
writer.write("END");
try{
   writer.close();
}catch(IOException ioe){
   ioe.printStackTrace();
}

My redirection of stdin and stdout for gradif is from this answer. I tried using the try{close()}catch{} syntax to see if there was a problem with closing any of the files (there wasn't), and I got that from this answer.
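
The copy loops themselves aren't shown above; roughly, they follow the usual byte-pumping pattern (this is just a sketch with the details assumed, not the exact code):

byte[] buf = new byte[4096];
int n;
// push the input file into gradif's stdin
while ((n = fin.read(buf)) != -1) {
    stdin.write(buf, 0, n);
}
stdin.close();   // signal end-of-input to gradif
// pull gradif's stdout into the output file
while ((n = stdout.read(buf)) != -1) {
    fout.write(buf, 0, n);
}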

It should also be noted that the Individuals' fitness tests run concurrently.

UPDATE: I've actually been able to narrow it down to the exec() call. In my most recent run, I first ran into trouble at generation 733 (with a population size of 100). Why are the earlier generations fine? I don't understand why, if nothing is leaking, the algorithm can get through the earlier generations but then fail on later ones. And if there is leaking, then where is it coming from?

UPDATE2: In trying to figure out what's going on here, I would like to be able to see (preferably in real-time) how many files the JVM has open at any given point. Is there an easy way to do that?
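
One Linux-specific way to poll this from inside the JVM is the com.sun.management MX bean (a sketch; it assumes a Sun/Oracle-style JDK, since com.sun.management is not a standard API, and the FdMonitor/logOpenFds names are just for illustration):

import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

public final class FdMonitor {
    // Print how many file descriptors this JVM process currently has open,
    // e.g. once per generation of the GA.
    public static void logOpenFds() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            System.out.println("open fds: " + unix.getOpenFileDescriptorCount()
                    + " of max " + unix.getMaxFileDescriptorCount());
        }
    }
}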

  • @Thierry I did that as best I could by just repeatedly running it in the terminal. But is there some way I could set up something that would show the file count of lsof in real time without me having to keep mashing the 'up, enter' keys? – MattS Jul 09 '12 at 21:20
  • Yes, that's the watch command: "watch: execute a program periodically, showing output fullscreen." The default interval is 2 seconds, but you can change it. – Thierry Jul 10 '12 at 09:47

4 Answers


Perhaps it is a good idea to put all your actions inside the loop, opening and closing each file only while it is needed:

for (int i = 0; i < popSize; i++) {
    RandomAccessFile in  = new RandomAccessFile("testfiles/GradifIn"  + i, "rw");
    RandomAccessFile out = new RandomAccessFile("testfiles/GradifOut" + i, "rw");
    try {
        // read from the input file, write to the output file
    } finally {
        in.close();    // closed again before the next iteration,
        out.close();   // so only one pair of files is open at a time
    }
}

Try to close the error stream too:

process.getErrorStream().close();

EDIT: In fact you should read it too, since a full buffer on the error stream will block the child process.

Look at a StreamGobbler implementation here: Need sample Java code to run a shellscript
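
A minimal sketch of that idea (the class below is illustrative, not the linked implementation): a background thread drains the stream so the child process can never block on a full pipe buffer.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

class StreamGobbler extends Thread {
    private final InputStream in;

    StreamGobbler(InputStream in) { this.in = in; }

    @Override
    public void run() {
        try (BufferedReader r = new BufferedReader(new InputStreamReader(in))) {
            while (r.readLine() != null) {
                // discard (or log) each line so the pipe never fills up
            }
        } catch (IOException ignored) {
            // stream is closed when the process exits
        }
    }
}

Start one on process.getErrorStream() right after the exec() call (and on getInputStream() too if you are not reading it yourself), then waitFor() the process.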

EDIT 2: Is there a population size small enough that, no matter the generation count, you do not encounter the issue? If that is the case, you might not be leaking open files/streams after all.

In that case, you have two solutions:

  • Either rewrite your algorithm so it does not keep all the population's files open at the same time
  • Or increase the maximum number of allowed open files. See here for some ways of doing that

You seem to be running on Linux (or some Unix-like operating system). You can use something like the "lsof" command to figure out which files your application has open when you get the error.

  • I've looked at that. The list of open files looks like what it should be. Again, this wouldn't be an issue except that the error comes after a number of generations (like, 50 generations in). That would imply that every generation it's opening more files and not closing the previous ones, but that didn't seem to be the case. – MattS Jul 05 '12 at 03:30
  • @MattS - have you looked through the system to see if some _other_ process is consuming all the open files? – jtahlborn Jul 05 '12 at 13:02
  • There were a bunch of other files open, but this is running on a server. – MattS Jul 05 '12 at 18:24

If you're sure you're closing all files, etc., maybe try upping the ulimit. I had an issue once where a Java program kept bumping into the ulimit ceiling, and increasing it resolved my issue. I think it may require a server restart since it's a kernel param.

  • But that doesn't address the issue that if I can make it past even just the first generation, then I should be able to make it past any number of generations. – MattS Jul 06 '12 at 21:42