29

I am an OCaml newbie. I like OCaml's speed but I don't fully understand its design. For example, I would like the + operator to be polymorphic to support integer, float and so on.

Why do we need +.?

Guy Coder
z_axis
  • 4
    The title of your question claims it's going to be about syntax, and then the only example of flaw you give is about the type system. Have a look at the StackOverflow questions for "[C] strange behavior". Many of them are caused by C's type system. `float f = 3 / 7;` sets `f` to zero. `sizeof(int) - 5` is not `-1`. Well, OCaml does not use this cursed system. The question should be, how come so many languages still use it when it puzzles so many people? – Pascal Cuoq Nov 05 '11 at 08:53
  • 11
    This is a great question and I am very disappointed to see it closed. Many objective statements can be made about this. The separation of `+` and `+.` makes type inference simpler and more predictable. The alternatives are less predictable (defaults) or potentially *much* less efficient (dispatch). Then there is the question of whether or not it would be an abuse of overloading given that the functions have different characteristics (e.g. associativity) or even entirely different purposes (division vs Euclidean quotient). – J D Mar 06 '12 at 23:37
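The point about differing characteristics can be checked directly. The following is a minimal sketch (the names `left`, `right`, `int_quot` and `float_quot` are illustrative, not from the thread) showing that float addition is not associative for these particular values, and that `/` and `/.` compute different things:

```ocaml
(* Float addition: regrouping the same three values changes the result. *)
let left = (0.1 +. 0.2) +. 0.3
let right = 0.1 +. (0.2 +. 0.3)

(* Integer division truncates, float division does not. *)
let int_quot = 7 / 2
let float_quot = 7.0 /. 2.0

let () =
  Printf.printf "same sum: %b\n" (left = right);
  Printf.printf "7 / 2 = %d, 7.0 /. 2.0 = %g\n" int_quot float_quot
```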

3 Answers

28

I would like the '+' operator to be polymorphic to support integer, float and so on. Why do we need '+.'?

Excellent question. There are many subtle trade-offs involved here.

The advantages of not overloading operators (as in OCaml) are:

  • Type inference is simpler and more predictable.
  • Code is more composable: moving code from one place to another cannot affect its meaning.
  • Predictable performance: you always know exactly which function is being invoked.
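As a small illustration of the inference point, assuming nothing beyond the standard operators (the function names are made up for this sketch), the operator alone fixes the type of each function, with no annotations:

```ocaml
(* Inferred as int -> int -> int, because of ( + ). *)
let add_ints x y = x + y

(* Inferred as float -> float -> float, because of ( +. ). *)
let add_floats x y = x +. y

(* Inferred as float -> float, because of ( *. ). *)
let double x = x *. 2.0

let () =
  Printf.printf "%d %g %g\n" (add_ints 1 2) (add_floats 1.5 2.25) (double 4.0)
```

Moving any of these definitions to another module or context does not change which addition is called.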

The disadvantages are:

  • The number of different operators quickly gets out of control: + for int, +. for float, +/ for arbitrary-precision rationals, +| for vectors, +|| for matrices, and then still more for complex numbers, low-dimensional vectors and matrices, homogeneous coordinates, etc.
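To make the operator explosion concrete, here is a sketch of how such operators might be defined by hand. The names `+|` and `+||` come from the list above; representing vectors as `float array` and matrices as `float array array` is an assumption made for this example:

```ocaml
(* Hypothetical vector addition on float arrays.
   Array.map2 raises Invalid_argument if the lengths differ. *)
let ( +| ) a b = Array.map2 ( +. ) a b

(* Hypothetical matrix addition, built row by row from ( +| ). *)
let ( +|| ) m n = Array.map2 ( +| ) m n

let () =
  let v = [| 1.0; 2.0 |] +| [| 0.5; 0.25 |] in
  Array.iter (Printf.printf "%g ") v;
  print_newline ()
```

Every new numeric type needs its own family of such names, which is the cost being described.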

Some alternatives are:

J D
  • 3
    Type classes need not be resolved at run-time. There are also pragmas to guarantee compile-time resolution. Type-class methods can carry meaning, so code should not change meaning in a different context. Type inference is predictable and comprehensive in their presence, too. The problem is not just explosion of operators, by the way, but also difficulty of writing generic code. A function that uses polymorphic operators will be more useful and more reusable than one hard-coded to a certain type. – Peaker Feb 06 '13 at 22:25
  • @Peaker "Type classes need not be resolved at run-time". How do you resolve all type classes at compile time when code can be loaded dynamically? – J D Feb 07 '13 at 13:23
  • 1
    @Peaker "A function that uses polymorphic operators will be more useful and more reusable than one hard-coded to a certain type". Arithmetic is the most common source of overloaded operators. Ints and floats are the most common numerical types. Numerical methods over ints and floats have almost nothing in common because the semantics of the overloaded operators are different between those numerical types, primarily due to rounding in floating point arithmetic. So type classes let you factor out commonality but there is little commonality there to be factored out. – J D Feb 07 '13 at 13:26
  • You can use SPECIALIZE pragmas and give concrete types. You can leave it to runtime, and if there's no way to know the types until runtime, it will be left for runtime. That's a feature, not a bug. – Peaker Feb 07 '13 at 16:23
  • 1
    I'm not sure arithmetic is the most common source of overloaded operators. There's also (>>=), (>>), (<>) and many other non-arithmetic overloads. I agree Ints and Floats are not that similar, and indeed they use different type-classes. But Int, {Int,Word}{8,16,32,64}, Integer are all very similar types. Float and Double have a lot of commonality. – Peaker Feb 07 '13 at 16:25
  • "You can use SPECIALIZE pragmas and give concrete types". And can you resolve all type classes at compile time if code can be loaded dynamically? – J D Feb 07 '13 at 18:22
  • "There's also (>>=), (>>)...". Maybe in Haskell but I'm pretty sure addition is more common than monadic bind in most other languages. :-) – J D Feb 07 '13 at 18:46
  • One of the reasons monadic bind is less common, is that they don't have type-classes to express monadic bind :) As for dynamic code loading, I haven't used it myself, so I don't know how it interacts with type specialization. – Peaker Feb 07 '13 at 20:29
  • Does F# have various integral types like Int8, Int16, etc? And Float vs. Double? Do you not see a value in abstracting over those? – Peaker Feb 07 '13 at 20:30
  • "As for dynamic code loading, I haven't used it myself, so I don't know how it interacts with type specialization". I *think* statically resolving all type classes in general conflicts with separate compilation. However, the same is true of reified generics which .NET solves by deferring reification to link-time with a JIT compiler. – J D Feb 08 '13 at 01:32
  • "Does F# have various integral types like Int8, Int16, etc? And Float vs. Double? Do you not see a value in abstracting over those?". I've done a *lot* of work with those over the past 30 years and I don't recall a single instance where that would have been valuable. For example, higher-order entropy encoders construct a trie with a dictionary in each node with an integer type as the key. If it's a 1-bit int then you use a binary tree. For 8-bit ints you use a 256-element array. For 32-bit ints a hash table. There is no commonality to factor out. – J D Feb 08 '13 at 01:34
  • The 32- and 64-bit floating point types have more in common but in practice you use double precision for everything unless you're memory constrained so you use 32-bit floats as a storage format or you want more ILP from SSE, both of which are obscure specialist applications. – J D Feb 08 '13 at 01:36
  • A binary tree can be based on array of two. Also, there are arithmetic functions you might want to use with Int of various sizes. Double/float may also be traded off for speed, and you might want different trade offs in different parts of program, while reusing common code, even for arithmetic. – Peaker Feb 09 '13 at 12:59
  • @Peaker Those are certainly valid theoretical examples but my impression is that they occur so rarely in practice that it is not worth complicating the language with a feature to help in these cases. – J D Feb 10 '13 at 01:24
  • I think the rarity of the need for a feature also relates to its availability. You'll always find ways to work around the unavailability of a feature until it becomes the natural way to do things. Then you might get the false sense that the feature is never necessary. Though if it existed, suddenly you'd need it because you'll see various roles it fills more naturally and better than other solutions you've found. Do you interface with FFI code that happens to use various Int sizes, btw? – Peaker Feb 23 '13 at 23:45
  • @Peaker: "I think the rarity of the need for a feature also relates to its availability. You'll always find ways to work around the unavailability of a feature until it becomes the natural way to do things. Then you might get the false sense that the feature is never necessary. Though if it existed, suddenly you'd need it because you'll see various roles it fills more naturally and better than other solutions you've found. Do you interface with FFI code that happens to use various Int sizes, btw?". That can be said of any feature. I've barely used FFI in the past decade: everything is on .NET. – J D May 26 '17 at 22:38
12

OCaml does not support polymorphic operators (numeric or otherwise) other than comparison operators. The + versus +. thing removes a lot of subtle bugs which can crop up in converting different sizes of integers, floats, and other numeric types back and forth. It also means that the compiler always knows exactly which numeric type is in use, thus making it easier to recognize when the programmer has made incorrect assumptions about a number always having an integer value. Requiring explicit casting between numeric types may seem awkward, but in the long run, it probably saves you more time tracking down weird bugs than you have to spend to write that extra period to be explicit.
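A short sketch of what the explicit conversions look like in practice. The function `mean` is a hypothetical example, and `float_of_int` is the standard conversion function:

```ocaml
(* Mean of a list of ints, returned as a float.
   Both conversions are written out; nothing is cast implicitly.
   Note: for the empty list this returns nan (0.0 /. 0.0). *)
let mean xs =
  let sum = List.fold_left ( + ) 0 xs in
  float_of_int sum /. float_of_int (List.length xs)

let () =
  (* Integer division truncates; float division does not. *)
  Printf.printf "%d %g\n" (1 / 12) (1.0 /. 12.0);
  Printf.printf "%g\n" (mean [1; 2; 3; 4])
```

Writing `sum /. List.length xs` instead would be rejected by the type checker rather than silently converted.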

Aside from the . versions of the numeric operators, I do not think that the OCaml syntax is particularly strange. It is very much in line with previous ML languages, with appropriate and reasonable syntax extensions for its added features. If it initially seems odd to you, that probably just indicates that you have, thus far, only programmed in languages with closely related syntax. As you learn new languages, you will see that there are many different ways to design language syntax, each with different benefits and detriments, but a lot of it is just arbitrary convention that someone decided on.

Keith Irwin
  • 19
    "The + versus +. thing removes a lot of subtle bugs". If that were true, F# code would suffer from those subtle bugs but it does not. In reality, the bugs are caused almost entirely by implicit casts (e.g. 2.3/0 or 1/12*123.456) and not by overloading. F# does not do implicit casts, i.e. `+` is `int -> int -> int` or `float -> float -> float` but not `int -> float -> float`. So that is not a valid motive for OCaml's choice. – J D Mar 07 '12 at 09:12
  • 1
    "Aside from the . versions of the numeric operators". And don't forget the `.` versions are specifically for the `float` numeric type. OCaml also provides `+/` for arbitrary-precision addition and I used to have to write my own `+|` version for vectors and `+||` version for matrices, which gets ridiculous really quickly. And OCaml doesn't even offer most numeric types like 32-bit floats... – J D Mar 07 '12 at 09:17
  • 3
    This argument looks reasonable only until you start to account for all the other compromises that F# had to make in order to support this overloading. – ygrek Mar 07 '12 at 10:10
  • 1
    You're right that it is really implicit casting which causes most of the subtle bugs, and that you can do polymorphic operators without implicit casting which are not as worrisome. But once you allow polymorphic operators, in general, then people defining their own can create int -> int -> int, float -> float -> float, and int -> float -> float operations. The only way to prevent this is to only allow polymorphic operators for the standard libraries or some other such special casing. It makes sense to have uniform rules. – Keith Irwin Mar 07 '12 at 19:41
  • 1
    Some [other discussion](http://lambda-the-ultimate.org/node/1655) of reasons. [_Modular Implicits_](https://arxiv.org/pdf/1512.01895.pdf) is a [proposal to](https://discuss.ocaml.org/t/modular-implicits/144/9) add overloading to OCaml. [_Modular Type Classes_](https://people.mpi-sws.org/~dreyer/papers/mtc/main-long.pdf) is (only) a theory for potentially adding typeclasses to a ML-like language. – Shelby Moore III Feb 01 '18 at 12:24
4

Basically, the type systems of SML and OCaml (ignoring the object system and modules) do not support ad hoc polymorphism. This is a design decision. OCaml also, unlike SML, decided against including syntactic sugar for arithmetic.

Some languages in the ML family have extremely limited forms of ad hoc polymorphism in numeric operators. For instance, + in Standard ML has a default type of int * int -> int, but has type real * real -> real if its argument or return type is known to be real (SML's floating-point type). These operators are special in SML, and could not be defined if they were not already built in. It also isn't possible to endow other values with this property: val add = op + would have type int * int -> int. The specialness here is confined to the syntax of the language; there is no support for ad hoc polymorphism in the type system.

OCaml has a handful of operators with special semantics (numeric operators are not among them). For instance, || and && are short-circuiting, but become "long-circuit" if you bind them to an intermediate value, because they are then applied like ordinary functions and both arguments are evaluated first:

let long_circuit_or = (||);;
let print_true x = print_string x; true;;
(* just prints "4" *)
print_true "4" || print_true "5";;
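(* prints both "4" and "5"; argument evaluation order is unspecified, and current compilers typically evaluate right to left, giving "54" *)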
long_circuit_or (print_true "4") (print_true "5");;
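If short-circuiting is wanted from an ordinary function, one option is to pass the right operand as a thunk. This is a minimal sketch; `lazy_or`, `record` and `log` are hypothetical names introduced for the example, and a buffer is used instead of printing so the effect is easy to inspect:

```ocaml
(* Hypothetical helper: the right operand is a function, so it only
   runs when the left operand is false. *)
let lazy_or a b = a || b ()

(* Records which operands were actually evaluated. *)
let log = Buffer.create 8

let record s =
  Buffer.add_string log s;
  true

let () =
  ignore (lazy_or (record "4") (fun () -> record "5"));
  (* Only the left operand ran, so the buffer holds just "4". *)
  print_endline (Buffer.contents log)
```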
Kevin Ji
Greg Nisbet