What have people used to catch, log, and report multiple data validation errors at once in Python?

I'm building an application in Python 3 that first validates input data and then processes it. Reporting errors in the first step is part of the intended functionality of the program, so I don't want my validator to give up on the first exception. In particular, the data are tabular and I want to be able to return -- rather than raise -- an exception for each line of the table that does not validate.

A forum discussion from a couple of years ago contemplates multiple solutions, including the following, which seems the cleanest to me:

```python
errors = []
for item in data:
    try:
        process(item)
    except ValidationError as e:
        errors.append(e)
if errors:
    raise MultipleValidationErrors(errors)
```

where the MultipleValidationErrors class would have an appropriate __str__ method to list useful information about all the ValidationErrors in it.
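One possible shape for such a class, as a minimal sketch: `ValidationError` is assumed here to be a plain `Exception` subclass, and `process`, `process_all`, and the acceptance rule are hypothetical stand-ins for the real validator.

```python
class ValidationError(Exception):
    """Assumed here to be an ordinary Exception subclass."""


class MultipleValidationErrors(Exception):
    """Bundles several ValidationErrors into a single raisable exception."""

    def __init__(self, errors):
        self.errors = list(errors)
        super().__init__(f"{len(self.errors)} validation error(s)")

    def __str__(self):
        # One line per collected error, in the order they were caught.
        return "\n".join(str(e) for e in self.errors)


def process(item):
    # Hypothetical validator: only non-negative integers are accepted.
    if not isinstance(item, int) or item < 0:
        raise ValidationError(f"invalid item: {item!r}")


def process_all(data):
    # Same loop as in the question, wrapped in a function.
    errors = []
    for item in data:
        try:
            process(item)
        except ValidationError as e:
            errors.append(e)
    if errors:
        raise MultipleValidationErrors(errors)
```

A caller can then catch `MultipleValidationErrors` once and either print it or walk its `errors` list.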

Others recommend using the traceback module, but since the exceptions I want to catch are data validation errors rather than program errors, that seems inappropriate. Getting the logging module involved might be appropriate, though.

wkschwartz
  • Maybe [this question](http://stackoverflow.com/questions/6470428/catch-multiple-exceptions-in-one-line-except-block) would help – inspectorG4dget Mar 26 '12 at 16:10
  • @inspectorG4dget: that question is about catching multiple types of exceptions; this one is about catching multiple instances of the same exception type. – Fred Foo Mar 26 '12 at 16:12
  • The code you list will work, if it does what you want, or you could use the traceback module to provide even more information, as you mention. Alternatively you could save the exception object in a data structure for later use, or do many other things. It all depends on what your requirements for responding to the exceptions are, which are not clear from your question. – Aaron Watters Mar 26 '12 at 16:16
  • @Aaron-Watters My goal is that once I've processed all the incoming data, I can print all of the errors, one line of error per line of incoming data. (More specifically, since I'm writing the library and user interface separately, my library would return the data structure of errors to the user interface which could do whatever it wants. A CLI would print the errors to `stdout`, whereas a GUI would provide some interactive table interface.) – wkschwartz Mar 26 '12 at 17:36
  • Too bad the forum post didn't actually discuss multiple solutions. Instead it was the OP solution and then ratholing on tabs vs. spaces. – Randy Mar 09 '17 at 15:45
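The goal described in the comments above, where the library returns a data structure of errors and leaves presentation to the user interface, might be sketched as follows. `validate_row` and `validate_table` are hypothetical names, `ValidationError` is assumed to be a plain `Exception` subclass, and the two-column rule is invented purely for illustration.

```python
class ValidationError(Exception):
    """Assumed to be an ordinary Exception subclass."""


def validate_row(row):
    # Made-up rule: each row needs two columns, the second one numeric.
    if len(row) != 2:
        raise ValidationError(f"expected 2 columns, got {len(row)}")
    if not row[1].isdigit():
        raise ValidationError(f"column 2 is not a number: {row[1]!r}")


def validate_table(rows):
    """Validate every row; return (line_number, exception) pairs, never raise."""
    errors = []
    for line_number, row in enumerate(rows, start=1):
        try:
            validate_row(row)
        except ValidationError as exc:
            errors.append((line_number, exc))
    return errors


# A CLI front end could print one line of error per bad line of input.
table = [["a", "1"], ["b", "x"], ["c"]]
for line_number, exc in validate_table(table):
    print(f"line {line_number}: {exc}")
```

Because the exceptions are returned rather than raised, a GUI could take the same list and render it however it likes.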

2 Answers

I've used this idiom (collecting the exceptions in a list and raising one combined exception at the end) in both C++ and Python. It's the cleanest solution that I know of when what you want is an exception, rather than a log message. The downside is that the combined exception takes space linear in the number of errors, which can be problematic when processing large datasets with many errors.
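If that linear growth is a concern, one option is to cap how many errors are kept and only count the rest. This is a sketch of that idea, not something the idiom itself prescribes; `collect_errors`, `max_errors`, and the `check` callback are hypothetical names, and `ValidationError` is assumed to be a plain `Exception` subclass.

```python
class ValidationError(Exception):
    """Assumed to be an ordinary Exception subclass."""


def collect_errors(items, check, max_errors=100):
    """Run check(item) on every item, keeping at most max_errors exceptions.

    Returns (kept_errors, total_error_count), so memory stays bounded by
    max_errors while the caller still learns how many items failed.
    """
    kept = []
    total = 0
    for item in items:
        try:
            check(item)
        except ValidationError as exc:
            total += 1
            if len(kept) < max_errors:
                kept.append(exc)
    return kept, total


def check_positive(n):
    # Hypothetical rule used only for this example.
    if n <= 0:
        raise ValidationError(f"not positive: {n}")


kept, total = collect_errors(range(-5, 5), check_positive, max_errors=3)
print(f"showing {len(kept)} of {total} errors")
```

The trade-off is that errors beyond the cap are counted but their details are lost.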

Fred Foo
  • This answer seems to be missing its context when read standalone. I don't know what idiom you are referring to. – ThorSummoner Jan 08 '15 at 18:24
  • @ThorSummoner I think larsmans is referring to how the OP collects the different exceptions in a list, and then raises a single exception that contains the list. – Pedro Dec 22 '15 at 16:03

I follow the list-of-errors approach but wrap the list in an exception class like this:

```python
class Multi_Error(Exception):
    def __init__(self, errors: list[Exception]) -> None:
        self.errors = errors
        super().__init__(self.errors)

    def __str__(self) -> str:
        return "\n".join([str(x) for x in self.errors])
```

Sean D