I have a few nested records that I need to validate, and I wonder what an idiomatic Haskell way to do it would be.

To simplify:

data Record = Record {
  recordItemsA :: [ItemA],
  recordItemB :: ItemB
} deriving (Show)

data ItemA = ItemA {
  itemAItemsC :: [ItemC]
} deriving (Show)

Requirements are:

  • Collect and return all validation errors
  • Some validations may be across items, e.g. ItemsA against ItemB
  • Strings are sufficient to represent errors

I currently have code that feels awkward:

type ErrorMsg = String

validate :: Record -> [ErrorMsg]
validate record =
  recordValidations ++ itemAValidations ++ itemBValidations
  where
    recordValidations :: [ErrorMsg]
    recordValidations = ensure (...) $
      "Invalid combination: " ++ (show $ recordItemsA record) ++ " and " ++ (show $ recordItemB record)
    itemAValidations :: [ErrorMsg]
    itemAValidations = concat $ map validateItemA $ recordItemsA record
    validateItemA :: ItemA -> [ErrorMsg]
    validateItemA itemA = ensure (...) $
      "Invalid itemA: " ++ (show itemA)
    itemBValidations :: [ErrorMsg]
    itemBValidations = validateItemB $ recordItemB record
    validateItemB :: ItemB -> [ErrorMsg]
    validateItemB itemB = ensure (...) $
      "Invalid itemB: " ++ (show itemB)

ensure :: Bool -> ErrorMsg -> [ErrorMsg]
ensure b msg = if b then [] else [msg]

4 Answers

What you have already is basically fine; it just needs some clean-up:

  • The sub-validations should be top-level definitions, as they're fairly involved. (By the way, type signatures on where-clause definitions are usually omitted.)
  • The naming convention is inconsistent; pick one and stick to it.
  • Long chains of (++) get ugly; use concat (or perhaps unwords) instead.
  • There are minor formatting quirks: some parentheses are superfluous, concat . map f is concatMap f, and so on.

The product of all this:

validateRecord :: Record -> [ErrorMsg]
validateRecord record = concat
  [ ensure (...) . concat $
      [ "Invalid combination: ", show (recordItemsA record)
      , " and ", show (recordItemB record)
      ]
  , concatMap validateItemA $ recordItemsA record
  , validateItemB $ recordItemB record
  ]

validateItemA :: ItemA -> [ErrorMsg]
validateItemA itemA = ensure (...) $ "Invalid itemA: " ++ show itemA

validateItemB :: ItemB -> [ErrorMsg]
validateItemB itemB = ensure (...) $ "Invalid itemB: " ++ show itemB

I think that's pretty good. If you don't like the list notation, you can use the Writer [ErrorMsg] monad:

import Control.Monad (unless)
import Control.Monad.Writer (Writer, tell)

validateRecord :: Record -> Writer [ErrorMsg] ()
validateRecord record = do
  ensure (...) . concat $
    [ "Invalid combination: ", show (recordItemsA record)
    , " and ", show (recordItemB record)
    ]
  mapM_ validateItemA $ recordItemsA record
  validateItemB $ recordItemB record

validateItemA :: ItemA -> Writer [ErrorMsg] ()
validateItemA itemA = ensure (...) $ "Invalid itemA: " ++ show itemA

validateItemB :: ItemB -> Writer [ErrorMsg] ()
validateItemB itemB = ensure (...) $ "Invalid itemB: " ++ show itemB

ensure :: Bool -> ErrorMsg -> Writer [ErrorMsg] ()
ensure b msg = unless b $ tell [msg]
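
To actually get the messages out of the Writer version you run it with execWriter. Here is a minimal runnable sketch of that. The field types and the three predicates are made up purely for illustration, since the question elides them as (...); only the overall shape follows the code above.

```haskell
import Control.Monad (unless)
import Control.Monad.Writer (Writer, execWriter, tell)

type ErrorMsg = String

-- Hypothetical stand-ins for the question's elided item types.
newtype ItemA = ItemA { itemAValue :: Int } deriving (Show)
newtype ItemB = ItemB { itemBLimit :: Int } deriving (Show)

data Record = Record
  { recordItemsA :: [ItemA]
  , recordItemB  :: ItemB
  } deriving (Show)

-- Log the message when the condition does not hold.
ensure :: Bool -> ErrorMsg -> Writer [ErrorMsg] ()
ensure b msg = unless b $ tell [msg]

validateRecord :: Record -> Writer [ErrorMsg] ()
validateRecord record = do
  -- Assumed cross-item rule: no more ItemA values than ItemB's limit.
  ensure (length (recordItemsA record) <= itemBLimit (recordItemB record)) . concat $
    [ "Invalid combination: ", show (recordItemsA record)
    , " and ", show (recordItemB record)
    ]
  mapM_ validateItemA $ recordItemsA record
  validateItemB $ recordItemB record

-- Assumed rule: an ItemA value must be non-negative.
validateItemA :: ItemA -> Writer [ErrorMsg] ()
validateItemA itemA = ensure (itemAValue itemA >= 0) $ "Invalid itemA: " ++ show itemA

-- Assumed rule: an ItemB limit must be positive.
validateItemB :: ItemB -> Writer [ErrorMsg] ()
validateItemB itemB = ensure (itemBLimit itemB > 0) $ "Invalid itemB: " ++ show itemB

-- Run the Writer and keep only the accumulated messages.
runValidation :: Record -> [ErrorMsg]
runValidation = execWriter . validateRecord
```

With these made-up rules, runValidation (Record [ItemA 1] (ItemB 2)) gives an empty list, while a record that breaks several rules gives one message per failed check, in the order the checks ran.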
ehird
  • Is this true? See http://stackoverflow.com/questions/8731858/does-writer-monad-guarantee-right-associative-concatenation – pat Jan 04 '12 at 18:18
  • @pat: Huh, right you are. I've removed the statement from my answer. – ehird Jan 04 '12 at 18:20
  • You should use `Data.Sequence` and replace `[ErrorMsg]` with `(Seq ErrorMsg)` as the `Monoid`. Then, when the `Writer` has finished, you can turn the `Seq ErrorMsg` into a `[ErrorMsg]` with `Data.Foldable.toList`. – pat Jan 04 '12 at 18:59
  • A `Seq` would probably not be ideal due to constant factors, but a difference list would be ideal here. Still, premature optimisation and all that :) – ehird Jan 04 '12 at 19:06
  • Yes, you're right; see [here](http://book.realworldhaskell.org/read/programming-with-monads.html#monadcase.writer.dlist) for more info. The difference list package is [here](http://hackage.haskell.org/packages/archive/dlist/0.5/doc/html/Data-DList.html) – pat Jan 04 '12 at 20:23
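
The difference-list idea from these comments can be sketched without an extra package. The version below uses Endo [ErrorMsg] from Data.Monoid as a stand-in difference list; that choice is an assumption for illustration and is not the dlist package's API. The Int-based check is likewise hypothetical.

```haskell
import Control.Monad (unless)
import Control.Monad.Writer (Writer, execWriter, tell)
import Data.Monoid (Endo (..))

type ErrorMsg = String

-- A difference list: a function that prepends its messages to a tail.
-- Appending two of them is function composition, which is constant time.
type Errors = Endo [ErrorMsg]

ensure :: Bool -> ErrorMsg -> Writer Errors ()
ensure b msg = unless b $ tell (Endo (msg :))

-- Hypothetical validation: every value must be non-negative.
validateAll :: [Int] -> Writer Errors ()
validateAll = mapM_ (\n -> ensure (n >= 0) ("Negative value: " ++ show n))

-- Materialise the ordinary list once, at the very end.
collectErrors :: Writer Errors () -> [ErrorMsg]
collectErrors w = appEndo (execWriter w) []
```

The messages come out in the order they were logged, because Endo composes left to right and the final application to [] builds the list in a single pass.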

Read the article "8 ways to report errors in Haskell". For your particular case, as you need to collect all errors and not only the first one, the Writer monad approach suggested by @ehird seems to fit best, but it's good to know the other common approaches.

nponeccop

Building on @ehird's answer, you could introduce a Validate typeclass:

class Validate a where
  validate :: a -> [ErrorMsg]

instance Validate a => Validate [a] where
  validate = concatMap validate

instance Validate Record where
  validate record = concat
    [ ensure (...) . concat $
      [ "Invalid combination: ", show (recordItemsA record)
      , " and ", show (recordItemB record)
      ]
    , validate $ recordItemsA record
    , validate $ recordItemB record
    ]

instance Validate ItemA where
  validate itemA = ensure (...) $ "Invalid itemA: " ++ show itemA

instance Validate ItemB where
  validate itemB = ensure (...) $ "Invalid itemB: " ++ show itemB
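
For reference, here is a self-contained sketch of the typeclass version that compiles as written. The item types and the two predicates are invented for illustration (the question elides them as (...)), and the cross-item check is left out to keep it short.

```haskell
type ErrorMsg = String

ensure :: Bool -> ErrorMsg -> [ErrorMsg]
ensure b msg = if b then [] else [msg]

class Validate a where
  validate :: a -> [ErrorMsg]

-- Lifting to lists: validate each element and concatenate the results.
instance Validate a => Validate [a] where
  validate = concatMap validate

-- Hypothetical stand-ins for the question's elided item types.
newtype ItemA = ItemA Int deriving (Show)
newtype ItemB = ItemB Int deriving (Show)

data Record = Record
  { recordItemsA :: [ItemA]
  , recordItemB  :: ItemB
  } deriving (Show)

-- Assumed rule: an ItemA must hold a non-negative number.
instance Validate ItemA where
  validate itemA@(ItemA n) = ensure (n >= 0) $ "Invalid itemA: " ++ show itemA

-- Assumed rule: an ItemB must hold a positive number.
instance Validate ItemB where
  validate itemB@(ItemB n) = ensure (n > 0) $ "Invalid itemB: " ++ show itemB

instance Validate Record where
  validate record = concat
    [ validate $ recordItemsA record  -- uses the list instance
    , validate $ recordItemB record
    ]
```

Because the list instance is recursive, nested lists such as [[ItemA]] are validated too without any further instances.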
pat
  • I don't think this is necessarily a good idea; plain functions keep things simpler, and if there's ever two different kinds of validation that can be applied to a single type, this falls down. The lifting to lists is clever, though. – ehird Jan 04 '12 at 03:52
  • True, but couldn't one make the same argument about any typeclass...? i.e. what if there are ever two different kinds of show that can be applied to a single type? – pat Jan 04 '12 at 03:57
  • Indeed, that's why I'm conservative about using typeclasses :) `Show` has the constraint that it's basically just for debugging and quick hacks, its output should be syntactically-valid Haskell, and it should preferably be semantically-valid Haskell that evaluates to a value equal to the argument passed to `show`. Most wishes for "alternate `Show` instances" are trying to go against these informal constraints. It's about trade-offs; e.g. there isn't much desire to use two sets of numeric functions on the same type, and if there is, it's heavily outweighed by the convenience of `Num`. – ehird Jan 04 '12 at 04:01
  • Yes, I see. The real power of type classes is in being able to write generic functions that use them as contexts in their type signatures. With the exception of container instances that simply broadcast the function to sub-items (as in the list instance of `Validate` above), it would probably be pretty rare to find a function that would want to `validate` some piece of data, knowing only that it was an instance of `Validate`. Usually, such a function would know the specific type of the data, which renders the typeclass moot. – pat Jan 04 '12 at 05:26

One thing you might consider, rather than validating your data afterwards, is using lenses from the excellent fclabels package as the interface to your data (rather than pattern-matching/type constructors) to ensure that your data is always correct.

Check out the variant that supports failure here and build your lens by passing a setter and getter that do some validation on the datatype to the lens function.

If you need some more complicated error reporting or whatnot, take a look at the implementation of the Maybe variant of lens and define your lens in terms of the abstract interface.

jberryman