univocity-parsers: CSV-to-bean parsing stops as soon as an error occurs

Question

I'm using univocity-parsers to convert the rows of a CSV file into Java objects.

While processing the file, if the parser encounters a problem in any column of a row, parsing stops at that row and an exception is thrown. I need it to continue to the end of the file, skipping only the rows that have errors, but I didn't find any utility class for this in the API.

My bean class:

    import com.univocity.parsers.annotations.NullString;
    import com.univocity.parsers.annotations.Parsed;
    import com.univocity.parsers.annotations.Trim;

    public class ItemcodeBean {

        @Trim
        @NullString(nulls = { " ", "" })
        @Parsed(field = "ItemCode")
        private String itemCode;

        @Trim
        @NullString(nulls = { " ", "" })
        @Parsed(field = "PartNumber")
        private String partNumber;

        @Trim
        @NullString(nulls = { " ", "" })
        @Parsed(field = "ModelNumber")
        private String modelNumber;

    }

My main class:

    import java.io.File;

    import com.univocity.parsers.common.DataProcessingException;
    import com.univocity.parsers.common.processor.BeanListProcessor;
    import com.univocity.parsers.csv.CsvParser;
    import com.univocity.parsers.csv.CsvParserSettings;

    public class TestClass {

        private BeanListProcessor<ItemcodeBean> rowProcessor;
        private CsvParser parser;

        public static void main(String[] args) {
            TestClass testClass = new TestClass();
            testClass.init();
            try {
                ItemcodeBean itemcodeBean;
                // Pull one bean at a time until the end of the file.
                while ((itemcodeBean = testClass.getRowData()) != null) {
                    System.out.println(itemcodeBean.toString());
                }
            } catch (Throwable ex) {
                // The first exception lands here, outside the loop, so the loop ends.
                System.out.println(ex.getLocalizedMessage());
            }
        }

        private BeanListProcessor<ItemcodeBean> init() {
            // BeanListProcessor converts each parsed row to an instance of a given class,
            // then stores each instance in a list.
            this.rowProcessor = new BeanListProcessor<ItemcodeBean>(ItemcodeBean.class);

            CsvParserSettings parserSettings = new CsvParserSettings();
            parserSettings.setProcessor(rowProcessor);
            parserSettings.setHeaderExtractionEnabled(true);
            // Skip leading whitespace
            parserSettings.setIgnoreLeadingWhitespaces(true);
            // Skip trailing whitespace
            parserSettings.setIgnoreTrailingWhitespaces(true);
            // Skip empty lines
            parserSettings.setSkipEmptyLines(true);

            File file = new File("C:\\Users\\abhishyam.c\\Downloads\\Itemcode_Template.csv");
            this.parser = new CsvParser(parserSettings);
            //parser.parse(file);
            parser.beginParsing(file);
            return rowProcessor;
        }

        private ItemcodeBean getRowData() throws Throwable {
            try {
                // Each call returns at most one bean, so a plain if is enough here.
                String[] row = parser.parseNext();
                if (row != null) {
                    return rowProcessor.createBean(row, parser.getContext());
                }
            } catch (DataProcessingException e) {
                // Rethrown with the column name as the message.
                throw new DataProcessingException(e.getColumnName(), e);
            }
            // parser.stopParsing();
            return null;
        }
    }

score 4 · Accepted Answer · edited Aug 26 '21 at 08:55

Just use an error handler and it will keep going unless you throw the exception yourself:

    // Set a RowProcessorErrorHandler to log the error. The parser will keep running.
    settings.setProcessorErrorHandler(new RowProcessorErrorHandler() {
        @Override
        public void handleError(DataProcessingException error, Object[] inputRow, ParsingContext context) {
            System.out.println("Error processing row: " + Arrays.toString(inputRow));
            // getColumnIndex() can return -1, so guard the array access.
            int index = error.getColumnIndex();
            if (index >= 0 && index < inputRow.length) {
                System.out.println("Error details: column '" + error.getColumnName() + "' (index " + index + ") has value '" + inputRow[index] + "'");
            }
        }
    });
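The behaviour described here, where the handler is called for a failing row and processing continues unless the handler itself throws, can be pictured with a plain-Java sketch. This is not univocity's implementation and it uses no library classes: `RowErrorHandler`, `Item`, `toItem` and `convertAll` are hypothetical names made up for illustration, and the numeric `quantity` field is an assumption added only so that a conversion can fail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SkipBadRowsSketch {

    /** Hypothetical callback; only loosely modelled on the idea of an error handler. */
    interface RowErrorHandler {
        void handleError(RuntimeException error, String[] inputRow);
    }

    /** Hypothetical bean with one numeric field, so that conversion can fail. */
    static final class Item {
        final String itemCode;
        final int quantity;

        Item(String itemCode, int quantity) {
            this.itemCode = itemCode;
            this.quantity = quantity;
        }
    }

    /** Converts one row; Integer.parseInt throws NumberFormatException on bad input. */
    static Item toItem(String[] row) {
        return new Item(row[0].trim(), Integer.parseInt(row[1].trim()));
    }

    /**
     * Converts every row. A failing row is passed to the handler and skipped;
     * the loop only stops early if the handler itself throws.
     */
    static List<Item> convertAll(List<String[]> rows, RowErrorHandler handler) {
        List<Item> items = new ArrayList<>();
        for (String[] row : rows) {
            try {
                items.add(toItem(row));
            } catch (RuntimeException e) {
                handler.handleError(e, row);
            }
        }
        return items;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"A1", "10"},
                new String[] {"B2", "oops"},   // bad quantity: this row is skipped
                new String[] {"C3", "30"});

        List<String> errors = new ArrayList<>();
        List<Item> items = convertAll(rows,
                (error, row) -> errors.add(Arrays.toString(row) + ": " + error.getMessage()));

        System.out.println(items.size() + " items, " + errors.size() + " skipped");
        // prints "2 items, 1 skipped"
    }
}
```

A handler that rethrows the exception ends the loop at the first bad row, which mirrors the "unless you throw the exception yourself" caveat.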

UPDATE: You can prevent the row from being discarded by using a RetryableErrorHandler instead. This is a special implementation added in version 2.3.0 that lets you call setDefaultValue() to assign a value to the problematic column, and keepRecord() to prevent the record from being discarded.

Example:

    settings.setProcessorErrorHandler(new RetryableErrorHandler<ParsingContext>() {
        @Override
        public void handleError(DataProcessingException error, Object[] inputRow, ParsingContext context) {
            // If there's an error in the first column, assign 50 and proceed with the record.
            if (error.getColumnIndex() == 0) {
                setDefaultValue(50);
            } else { // Otherwise keep the record anyway. Null will be used instead.
                keepRecord();
            }
        }
    });

Note that if error.getColumnIndex() returns -1, there's nothing that can be done to save the record, and it will be skipped regardless. You can use this to log the error details.

edited Aug 26 '21 at 08:55

kometen

answered Aug 01 '16 at 07:36

Jeronimo Backes

Thanks a lot for the immediate reply. I want to know why processing of a row stops as soon as an error is found in a column. Why not process the row to the end and return all the errors in that row? – Abhishyam Aug 01 '16 at 10:36
Simply because it's easier and faster to handle the error. The use case for processing the problematic row in its entirety and storing each error of each column in the exception object is pretty limited. – Jeronimo Backes Aug 01 '16 at 11:22
@JeronimoBackes: Thanks for the information. This worked for me too, but I'm facing another issue: while logging the error, it also re-inserts the previous bean into the list. `CsvParserSettings parserSettings = getCsvParserSettings(); CsvRoutines routines = new CsvRoutines(parserSettings); for (Info bean : routines.iterate(Info.class, getReader(file.getAbsolutePath()))) { infoList.add(bean); log.info("ID:" + bean.getId() + ", Name:" + bean.getName()); }` How can I control this? Is there any handler reference to check whether there was an exception while adding it to the list? – Maverick Aug 19 '16 at 13:15
This is in fact a bug, thanks for letting me know. I've just fixed it in the latest SNAPSHOT build of version 2.2.1. We will release the final 2.2.1 version with this correction (and huge performance improvements as well) in a couple of days. – Jeronimo Backes Aug 20 '16 at 07:44
@JeronimoBackes: Thanks. Please let me know when it is available. I'll also be checking for the updates as I need this Error handling functionality soon. :) – Maverick Aug 21 '16 at 17:43
Just released! We hope you'll be impressed by how much faster it became. – Jeronimo Backes Aug 22 '16 at 03:42
@JeronimoBackes: Thanks for your prompt response. Can you please show me how it's implemented? – Maverick Aug 22 '16 at 14:45
@JeronimoBackes: I figured it out. I need help with one more thing. Is it possible to parse the failing row and set `null` for the failing column instead of skipping the entire row? It would be highly appreciated if this could also be handled. – Maverick Aug 22 '16 at 15:01
This is currently not supported. However the exception should give you the row that could not be processed. – Jeronimo Backes Aug 23 '16 at 03:16
As of version 2.3.0 this is now supported with the RetryableErrorHandler. I updated my answer to include an example. Cheers! – Jeronimo Backes Jan 13 '17 at 01:44
