13

I've often read that exceptions are somewhat slow and should be avoided if performance is an issue (for instance, in Java, F#, etc). Does that apply to common OCaml functions such as Hashtbl.find, which return exceptions for elements not found?

In particular, if I want my application to be efficient, should I always test element membership using, for instance, Hashtable.mem before calling Hashtbl.find? Or would the extra comparison of the mem function negatively impact performance?

Jack P.
  • 11,487
  • 1
  • 29
  • 34
anol
  • 8,264
  • 3
  • 34
  • 78

3 Answers3

22

OCaml exception handling make raising and catching exceptions extremely fast -- see this SO thread for internal details on how it's implemented. I haven't taken care to benchmark it precisely, but my random guess would be that it is in the ballpark of an indirect function call.

It is known that OCaml exceptions are significantly faster, in proportion to the rest of the language, than exceptions of F# for example -- this gave rise to performance problem for people porting their code from OCaml to F#. In OCaml, exceptions don't cause performance problems.

Calling Hashtbl.mem before Hashtbl.find is likely to be slower than catching the exception. The idiomatic style tends to be try Hashtbl.find .. with Not_found -> ....

That said, there is a sensible movement in the OCaml community to use more explicit error handling style, using option types rather than exceptions. The rationale is not based on performances, but on the fact that the type-checker then stop you from forgetting to handle an error situation. I would prefer that when designing a fresh new API. When using a third-party function that raise exceptions, make sure to catch all possible exceptions immediately; doing otherwise is a design smell in general and should be very heavily justified.

Due to their convenience, OCaml exceptions are also often used as a pure control-flow mechanism (and not to signal a rare failure condition). You will encounter code such as:

try
  for i = 0 to .... do
    if .. then raise Exit
  done; false
with Exit -> true

Finally, I feel that you might be taking a bad approach to implementation choices. Asking general micro-questions about performances is generally not the way to go. Think of correctness and readability first. Performance questions should generally come later, and only in situation where it is measurable/profilable.

Community
  • 1
  • 1
gasche
  • 31,259
  • 3
  • 78
  • 100
  • 1
    Actually, one of the reasons the "standard OCaml way" in this case seemed a bit weird to me is because I've read Effective Java, where Joshua Bloch mentions explicitly such an idiom applied in Java and that it should be avoided. I know OCaml is not Java (thankfully!), but it seemed quite a generic tip applicable to most programming languages. So the fact that OCaml would almost "force" you to do it seemed unnatural to me. Anyway, I agree that optimization comes after correction. I just tend to overuse lists in OCaml due to their simplicity and elegance, which led me to some quadratic issues. – anol Aug 28 '12 at 16:22
  • 1
    Exceptions are an optimization of flow control: it's a nonlocal goto operator in disguise. First, it is important to get the flow control right. Exceptions do not help here, and sometimes harm (if you don't know that some function is throwing an exception). When the performance is not sufficient, we can try to optimize it by putting in some exceptions. Also I would think that Java performance advice does not help in OCaml because of vastly different implementation details. – winitzki Aug 29 '12 at 08:29
  • 2
    I found exceptions to occasionally be helpful and enhance code readability (compared to the plumbing one would need otherwise, especially when library support for monadic style is not completely there yet). I'm not talking about a general case of cross-functions or cross-module exception raise and catch, but specific idioms where the catch site is statically determined and easy to reason about, such as the example that I have demonstrated above. (Just like `goto` is generally a bad idea but can still be useful when used locally in C code, eg. for some styles of error handling.) – gasche Aug 29 '12 at 11:24
5

To first answer the concrete question for Hashtbl.mem before Hashtbl.find: Don't do it as the check for the existence of the element in the table will have to be dont twice; while the cost will be O(1) for both strategies, calling Hashtbl.mem first will compute the hash-value and the lookup twice -- which can take quite longer than fetching an exception.

As a general advice, only create functions that raise exceptions if the probability of an exception is low. The exceptions are not taken into consideration for the type system, so more robust implementations will have an 'a option return value instead of 'a plus a possible exception. The ocaml core library uses a suffix of _exn to make clear that a function may throw an exception but generally prefers the non-exception throwing ones.

So for the hashtable you should (in a perfect world) have two functions:

Hashtbl.find_exn : 'a t -> key -> 'a (* throws Not_found *)

Hashtbl.find : 'a t -> key -> 'a option

If in doubt, use the second one. If it is not for pure speed, you could write a simple wrapper around the original hash-table:

find_safe h k = try Some (Hashtbl.find h k) with Not_found -> None

lambdapower
  • 1,022
  • 6
  • 12
  • 1
    Your answer is quite sensible, but I strongly disagree with the "only ... raise exceptions if the probability of an exception is low" idea. The probability-thing is warranted neither by performance consideration (raising exceptions often is fine performance-wise) nor by correctness considerations (low probability bugs are still bugs). I don't understand where it comes from in the context of OCaml. – gasche Aug 28 '12 at 15:21
  • 1
    You have to think about locality and the error-catching code: If it is a serious error you might want to abandon your current computation and e.g. inform the user. This is very tedious if you want to escape from a series of functions as you have to integrate error-handling in every one of them - not just a try/with block around the whole thing. In case you normally expect something to go "wrong", you will deal with that error more locally (e.g. filter the results). You probably have seen it only in OCaml as most languages lack an ad hoc option-type and loose less type-safety from exceptions. – lambdapower Aug 28 '12 at 15:53
  • 1
    I'm not sure how your (reasonable) comment is related to the previous "low-probability" argument. Are you making an implicit link between "low probability" and "error you can't handle locally and should rather report to the user?". I agree that there are a few situations where giving up and reporting is useful, but I don't see this as related to a "probability of error" measurement, rather a question of problem domain (who has the knowledge/responsibility to handle that?). The point on monadic style having modularity issue is (frustrating but) good. – gasche Aug 28 '12 at 16:13
0

I've often read that exceptions are somewhat slow and should be avoided if performance is an issue (for instance, in Java, F#, etc). Does that apply to common OCaml functions such as Hashtbl.find, which return exceptions for elements not found?

No. Folklore around exceptions being slow and for exceptional circumstances in mainstream languages is nonsense. Exceptions are just another control flow construct. OCaml's exceptions are fast and commonly used for non-exceptional circumstances. Last I looked (a few years ago) exceptions were around 6x faster in OCaml than C++, around 100x faster than Java and around 600x faster than .NET.

People sometimes avoid exceptions in OCaml not for performance-related reasons but because they want explicit local control flow, typically forcing the caller to match over a success/failure union type instead of potentially non-local exception propagation. They do this to improve correctness, which is more important than performance.

J D
  • 48,105
  • 13
  • 171
  • 274