I have a large text file with 20 million lines of text. When I read the file using the following program, it works just fine, and in fact I can read much larger files with no memory problems.
public static void main(String[] args) throws IOException {
File tempFile = new File("temp.dat");
String tempLine = null;
BufferedReader br = null;
int lineCount = 0;
try {
br = new BufferedReader(new FileReader(tempFile));
while ((tempLine = br.readLine()) != null) {
lineCount += 1;
}
} catch (Exception e) {
System.out.println("br error: " +e.getMessage());
} finally {
br.close();
System.out.println(lineCount + " lines read from file");
}
}
However if I need to append some records to this file before reading it, the BufferedReader consumes a huge amount of memory (I have just used Windows task manager to monitor this, not very scientific I know but it demonstrates the problem). The amended program is below, which is the same as the first one, except I am appending a single record to the file first.
public static void main(String[] args) throws IOException {
File tempFile = new File("temp.dat");
PrintWriter pw = null;
try {
pw = new PrintWriter(new BufferedWriter(new FileWriter(tempFile, true)));
pw.println(" ");
} catch (Exception e) {
System.out.println("pw error: " + e.getMessage());
} finally {
pw.close();
}
String tempLine = null;
BufferedReader br = null;
int lineCount = 0;
try {
br = new BufferedReader(new FileReader(tempFile));
while ((tempLine = br.readLine()) != null) {
lineCount += 1;
}
} catch (Exception e) {
System.out.println("br error: " +e.getMessage());
} finally {
br.close();
System.out.println(lineCount + " lines read from file");
}
}
A screenshot of Windows task manager, where the large bump in the line shows the memory consumption when I run the second version of the program.
So I was able to read this file without running out of memory. But I have much larger files with more than 50 million records, which encounter an out of memory exception when I run this program against them? Can someone explain why the first version of the program works fine on files of any size, but the second program behaves so differently and ends in failure? I am running on Windows 7 with:
java version "1.7.0_05"
Java(TM) SE Runtime Environment (build 1.7.0_05-b05)
Java HotSpot(TM) Client VM (build 23.1-b03, mixed mode, sharing)