I want to read a huge CSV file in Java. It contains 75,000,000 lines. The problem is that even though I am using the maximum `-Xms` and `-Xmx` limits, I am getting `java.lang.OutOfMemoryError: GC overhead limit exceeded`, and the error points to this line as the cause:
String[][] matrix = new String[counterRow][counterCol];
I did some tests and saw that I can read 15,000,000 lines without a problem. Therefore I started using this sort of code:
String csvFile = "myfile.csv";
List<String[]> rowList = new ArrayList<>();
String line = "";
String cvsSplitBy = ",";
BufferedReader br = null;

try {
    int counterRow = 0, counterCol = 12, id = 0;
    br = new BufferedReader(new FileReader(csvFile));
    while ((line = br.readLine()) != null) {
        String[] object = line.split(cvsSplitBy);
        rowList.add(object);
        counterRow++;
        if (counterRow % 15000000 == 0) {
            String[][] matrix = new String[counterRow][counterCol];
            // .. do processes ..
            SaveAsCSV(matrix, id);
            counterRow = 0;
            id++;
            rowList.clear();
        }
    }
}
...
Here, it writes the first 15,000,000 lines just fine, but on the second pass it gives the same error again, even though counterRow is 15,000,000.
In summary, I need to read a CSV file that contains 75,000,000 rows (approx. 5 GB) in Java and save a new CSV file or files after doing some processing on its records.
How can I solve this problem?
Thanks
EDIT: I am also calling rowList.clear(), guys; I forgot to mention it here. Sorry.
EDIT 2: My friends, I don't need to put the whole file in memory. How can I read it part by part? That is actually what I tried to do with if (counterRow % 15000000 == 0). What is the correct way to do it?
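To show the structure I have in mind, here is a simplified, self-contained sketch of the chunked reading I am attempting. SaveAsCSV and the actual per-chunk processing are placeholders for my own code, and in this sketch I pass the row list directly instead of allocating a String[][] matrix, just to keep it short; I am not sure this is the right way, which is exactly my question.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ChunkedCsvReader {

    private static final int CHUNK_SIZE = 15_000_000; // rows kept in memory per chunk

    public static void main(String[] args) throws IOException {
        String csvFile = "myfile.csv";
        List<String[]> rowList = new ArrayList<>();
        int id = 0;

        try (BufferedReader br = new BufferedReader(new FileReader(csvFile))) {
            String line;
            while ((line = br.readLine()) != null) {
                rowList.add(line.split(","));
                if (rowList.size() == CHUNK_SIZE) {
                    // .. do processes on this chunk ..
                    SaveAsCSV(rowList, id);   // placeholder for my own save method
                    rowList.clear();          // free the chunk before reading more lines
                    id++;
                }
            }
            // handle the last, possibly smaller chunk
            if (!rowList.isEmpty()) {
                SaveAsCSV(rowList, id);
            }
        }
    }

    // placeholder: writes one processed chunk to its own output file
    private static void SaveAsCSV(List<String[]> rows, int id) {
        // .. do processes / write rows to a part file for this id ..
    }
}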