I have a CSV file containing customer info, one customer per row.

The CSV file is about 170,000 lines long.

The app first parses the whole file line by line and saves each line as a Customer object in an ArrayList, so the list also holds on the order of 170k entries.

The code looks like this:

```java
final class CustomerInfoLineProcessor implements LineProcessor<CustomerInfo> {
    ...
    @Override
    public boolean processLine(final String line) {
        parseLine(line);
        return true;
    }

    private void parseLine(final String line) {
        try {
            if (!line.trim().isEmpty()) {
                // do job
            }
        } catch (final RuntimeException e) {
            handleLineError(e.getClass().getName() + ": " + e.getMessage(), e, lineStatus);
        }
    }
    ...
}
```

Intermittently, the parsing ended abnormally partway through the file. No errors or runtime exceptions were thrown, and the process as a whole did not stop: the app carried on with its later jobs using whatever was already in the ArrayList.

At first, I thought there might be some invisible characters hidden somewhere in the file that caused the parsing to quit early. I ruled that out after the same app processed the same file without any problem on my test machine.

My second guess was that the memory setting -Xmx256m was too small, so I changed it to an even smaller one, -Xmx128m. The app immediately threw an OutOfMemoryError and terminated. That suggests memory is not the issue with -Xmx256m, since running out of memory produces a visible error rather than a silent early stop.

Are there any other possible causes I have not thought of yet?

mikej1688
  • Possible duplicate of [How much data can a List can hold at the maximum?](https://stackoverflow.com/questions/3767979/how-much-data-can-a-list-can-hold-at-the-maximum) – fantaghirocco Apr 09 '19 at 15:16
  • Maybe post the code that stores the customer into the ArrayList, if you think that this is where it's failing? – Jerome Apr 09 '19 at 15:19
  • @fantaghirocco: *Definitely* not a dupe. – Makoto Apr 09 '19 at 15:20
  • So I suppose one piece of information we're missing is how the file itself is opened and read in. This looks to be the "meat" of the code which handles the parsing, but if you're opening and loading the entire file into memory at once, that might explain why you're getting an `OutOfMemoryError` if your memory is set too low. – Makoto Apr 09 '19 at 15:21
  • This is an existing app. The file is read in like this: `try (final Reader reader = Files.newReader(customerFile, Charsets.UTF_8)) { customerSync.executeSync(affiliateId, reader); }` Here `Files` is the one from Google's common package, not the one in Java 8. – mikej1688 Apr 09 '19 at 15:39
  • Have you tried a BufferedReader? See this example that reads a CSV file: http://www.java67.com/2015/08/how-to-load-data-from-csv-file-in-java.html – Alvaro Gili Apr 09 '19 at 15:19
  • You say intermittently. Does it consistently fail on the same CSV file? If not, it can't be anything about the contents of the file. – David Conrad Apr 09 '19 at 15:40
  • No, sometimes it failed and other times it did not. That is what confused me. – mikej1688 Apr 09 '19 at 15:42
  • I don't like the current handling of the file: it parses each line and saves the result into a list, which makes the list huge, and then syncs the data in the list with the database one by one. Done this way, it uses a large chunk of memory. – mikej1688 Apr 09 '19 at 15:46
  • I need to find a way to stream the file data and handle it piece by piece. However, we use Java 7, not Java 8. – mikej1688 Apr 09 '19 at 15:47
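
For the piece-by-piece handling mentioned in the last comments, here is a minimal sketch that stays within Java 7 (no lambdas or streams). It reads lines through a `BufferedReader` and hands them on in fixed-size batches, so only one batch is held in memory at a time. The names `BatchedLineReader` and `BatchHandler` are hypothetical and not part of the app; the handler stands in for whatever parses the lines and syncs them with the database.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

final class BatchedLineReader {

    /** Hypothetical callback: stands in for the code that parses and syncs a batch. */
    interface BatchHandler {
        void handle(List<String> batch);
    }

    /**
     * Reads non-empty lines from the source and passes them to the handler
     * in batches of at most batchSize. Returns the number of lines handed over.
     */
    static int process(Reader source, int batchSize, BatchHandler handler) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        List<String> batch = new ArrayList<String>(batchSize);
        int total = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines, as the original parseLine does
            }
            batch.add(line);
            total++;
            if (batch.size() == batchSize) {
                handler.handle(batch);
                batch = new ArrayList<String>(batchSize); // start a fresh batch
            }
        }
        if (!batch.isEmpty()) {
            handler.handle(batch); // flush the final, possibly smaller batch
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Tiny in-memory input instead of the real customer file.
        Reader input = new StringReader("1,Ann\n\n2,Bob\n3,Cy\n");
        int total = process(input, 2, new BatchHandler() {
            @Override
            public void handle(List<String> batch) {
                System.out.println("batch: " + batch);
            }
        });
        System.out.println("lines handled: " + total);
    }
}
```

Batching bounds memory use, but on its own it would not explain or fix a parse that stops early without an error.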

1 Answer

Here is the problem we found:

* The client's app FTPs the CSV file to a specified folder on our side every morning.
* The file_sync app then starts parsing the CSV file.
* Sometimes the FTP transfer of the CSV file was not yet complete when the file_sync app was kicked off, so the app parsed only the part of the file that had arrived so far. That caused the problem.

Thus the solution is to make sure the CSV file is no longer being written or held open by another process before starting the file_sync app.
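
One way to approximate that check in Java 7 is to wait until the file has stopped changing. The sketch below polls the file's length and last-modified time and reports it as ready only after both have stayed the same for several consecutive polls. `UploadGuard`, `waitUntilStable`, and the timing values are hypothetical names and numbers chosen for illustration; this is a heuristic, not what the file_sync app actually does.

```java
import java.io.File;

final class UploadGuard {

    /**
     * Returns true once the file exists and its length and last-modified time
     * have been unchanged for requiredStablePolls consecutive polls.
     * Returns false on timeout or if the waiting thread is interrupted.
     */
    static boolean waitUntilStable(File file, long pollMillis,
                                   int requiredStablePolls, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        long lastLength = -1L;
        long lastModified = -1L;
        int stablePolls = 0;
        while (System.currentTimeMillis() < deadline) {
            long length = file.length();
            long modified = file.lastModified();
            if (file.isFile() && length == lastLength && modified == lastModified) {
                stablePolls++;
                if (stablePolls >= requiredStablePolls) {
                    return true;
                }
            } else {
                // File is missing or still changing: restart the count.
                stablePolls = 0;
                lastLength = length;
                lastModified = modified;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.err.println("usage: UploadGuard <csv-file>");
            return;
        }
        // Assumed values: poll every 2 s, require 3 unchanged polls, give up after 60 s.
        boolean ready = waitUntilStable(new File(args[0]), 2000L, 3, 60000L);
        System.out.println(ready ? "file looks complete, safe to parse"
                                 : "file missing or still changing");
    }
}
```

A stalled transfer can look stable for a while, so this remains a best-effort guard. If the sending side can be changed, having it upload under a temporary name and rename the file once the transfer finishes, or send a separate marker file afterwards, is a more reliable signal.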

mikej1688