What is a "multisorted algebra", and how do I use it to solve "real problems"?

Question

Apparently, Alexander Stepanov has stated the following in an interview:

“I find OOP [object-oriented programming] technically unsound. It attempts to decompose the world in terms of interfaces that vary on a single type. To deal with the real problems you need multisorted algebras - families of interfaces that span multiple types.” [Emphasis added.]

Ignoring his statement regarding OOP for a moment, what are "multisorted algebras", beyond his terse definition, and can you give a practical example of how they are used (in the language of your choice)?

@PeterO. It is on-topic because it is about how to apply a concept of computer science. It is a very good question. — Kuba hasn't forgotten Monica, Feb 28 '14 at 15:03

score 23 · Answer 1 · edited May 23 '17 at 12:25

I believe he was talking about generic programming (he coined the term), whether meant in the context of this talk about the STL, or 'at large', in the sense of:

programming against a sort of interface that describes something that could fit all (and hopefully several) types (hence multi-sorted), ...
... provided they have some properties, often something about the nature of some operations on elements of that type (hence algebras).

To do (1), you need to have a way to specify a program that takes a type as a parameter, i.e. polymorphism, and to do (2), you need a way to say that you also want that type to carry specific operations (and, provided you can express them, properties). In effect, you're parametrizing your program by the structure of the data it manipulates. The paradigm is called in some places bounded polymorphism, datatype-generic programming, ... which reflects that languages have different notions of how to implement that idea — hence the italicized 'sort of' above.

For C++, it seems that —to Stepanov at least— this corresponds to templates (though ideas on how to do this best are still evolving).
For OO languages (Generic Java, C#), constraints on type parameters are typically expressed using subtype bounds ('bounded wildcards' ...).
For Haskell or Scala, you have (respectively, and similarly) type classes or implicits.
The ML family of languages prefers to do this using modules.
note that a number of proof assistants (which can express 'honest-to-god' properties as types) have developed a flavor of type classes : Isabelle, Coq, Matita are such examples

Note that Stepanov just co-wrote an entire book giving an exhaustive development of a library that embodies exactly what (I think) he means. So if you want examples in C++, this is definitely where you should look. Note also that this is much more evolved than the now-common advice of coding against an interface, rather than an object.

By 'practical example', I don't know if you mean 'how' or 'why' does one uses it. To give a caricaturally quick answer to the 'why', genericity is nice because, a bit like run-of-the-mill polymorphism, it lets you reuse code. But, more importantly:

polymorphic code that has to work with every single type often can't do anything interesting, whereas having a constrained interface to play with allows you to write richer programs
by specifying how that interface fits some your data, you have a type-safe way to select just those elements that suit your needs. For example, you probably know that the reduction operator (the reduce of Python & Hadoop, fold of a bunch of functional languages) is parallelizable only if the order in which you apply your reduction function doesn't matter (+, x, min, and work, but set difference doesn't). If you have a notion of 'type equipped with an associative operation', you know that you will be able to call a parallel reduction on it.
any overhead incurred by genericity occurs at compile time. For example, templates are legendarily fast

If you have seen some generic Java, look at say, the Comparable generic interface. It defines just one operation, but the contract it makes, though basic, is very much of algebraic flavor. I quote:

For the mathematically inclined, the relation that defines the natural ordering on a given class C is:
  {(x, y) such that x.compareTo((Object)y) <= 0}.
The quotient for this total order is:
  {(x, y) such that x.compareTo((Object)y) == 0}.
It follows immediately from the contract for compareTo that the quotient is an equivalence relation on C, > and that the natural ordering is a total order on C.

Now, I can write a method that selects the minimum, once, and use it for any type that fits this interface:

 public static <T extends Comparable<T>> T min (T x, T y) {
   if (x.compare(y) < 0) x; else y;
 }

Naturally, since the way programmative constructs implement that notion varies wildly, what you will get in terms of usability & expressivity will also vary. Perhaps you should not judge data-generic programming just by OO languages like C++ or Java — but I've written too much already to start with module ascription or the automatic instance generation of type classes.

Slight nitpicking: String concatenation *is* associative and parallelizable. On the other hand, it is not commutative, so when you need to give each parallel processing unit contiguous slices of the original list of strings you want to concatenate. — isekaijin, Nov 16 '14 at 01:59
As Stepanov observes, parallelizing a reduction operation only requires associativity. But reordering the list to be reduced (equivalently, treating it as a set rather than as a sequence) requires commutativity. — isekaijin, Nov 16 '14 at 02:01

Oleksandr Karaberov · Answer 2 · 2015-09-26T21:48:28.103

I'm too late, but maybe it will be helpful for you. User huitseeker wrote an excellent answer from the viewpoint of software design. I want to answer your question from the viewpoint of mathematics. Before diving into software world Alex Stepanov was a mathematician and studied abstract and universal algebra. And he often tried to bring rigorous mathematical foundations into the world of software and algorithm design. In his books From Mathematics to Generic Programming and Elements of Programming he advocates this design practice. His ideas about mixing concepts of algebraic structures and software design were realised in the notion of generic programming. And now let's talk about his quote:

To deal with the real problems you need multisorted algebras - families of interfaces that span multiple types

In my opinion there are two main concepts he wanted to mention here: the idea of abstract data type (ADT) and algebraic structure. First concept: ADT. ADT - is a mathematical model for a data types where a data type is defined only by it's semantic. Stepanov contrasted the idea of ADT to the idea of object in the OOP sense. Objects contains data and state whilst ADTs - not. ADT - is a behavioural abstraction, an operation cluster which describes interaction with data. Behavioural abstraction is entirely described by means of algebraic specification of abstract data type. You can read about this more in the original Liskov and Zilles paper, also I recommend you a paper Object-Oriented Programming Versus Abstract Data Types by William R. Cook.

(Discalimer: you can skip this paragraph, because it is more "mathematical and not so important") At first I want to clarify some terminology. When I talk about the algebraic structure it is the same as algebra. The word algebra is often also used for an algebraic structure. To be more precise when we talk about algebraic structures (algebras) we usually mean algebra over an algebraic theory. There is a concept of the variety of algebras, because there are several notions of an algebraic structure on an object of some category. By definition, an algebraic theory (algebra over it) consists of a specification of operations and laws that these operations must satisfy: this is a working definition of the algebraic structure we will use, and this definition ,I think, Stepanov implicitly mentioned in the quote.

Second concept which Stepanov wanted to mention is the most interesting property of ADTs: they can be formally modelled directly as many-sorted algebraic structures. Let's talk about it more formally. An algebraic structure - is a carrier set with one or more finitary operations defined on it. These operations are usually defined not over one set but over the multiple ones. E.g. let's define and algebra which models string concatenation. This algebra will be defined not over one set of strings but over two sets: strings set S and natural numbers set N, because we can define an operation which can concatenate a string with itself some finite number of times. So, this operation will take two operands, which belongs to different underlying (carrier) sets: S and N. Set which define these different operands (their types) in algebra called a set of sorts. Sort is an algebraical analog of the type. Algebra with multiple sorts called a multi-sorted algebra. In universal algebra, a signature lists the operations that characterize an algebraic structure. A many-sorted algebraic structure can have an arbitrary number of domains. The sorts are part of the signature, and they play the role of names for the different domains. Many-sorted signatures also prescribe on which sorts the functions and relations of a many-sorted algebraic structure are defined. For a one-sorted variety of algebras a signature is a set, whose elements are called operations, to each of which is assigned a cardinal number (0,1,2,…) called its arity. A signature of multi-sorted algebra can be defined as Σ = (S,OP,A), where S – set of sort names (types), OP - set of operation names and A - arities as before, except that now an arity is a list (sequence or more generally free monoid) of input sorts instead of merely a natural number (the length of the list) together with one output sort. Now we can create an algebraic specification of an abstract data type ADT as a triple:

ADT = (N, Σ, E)

, where N - name of abstract data type, Σ = (S,OP,A) - signature of multi-sorted algebraic structure, E = {e1, e2, …,en} - is a finite collection of equalities in the signature. As can you see now we have a rigorous mathematical description of ADT. In mathematics many-sorted algebraic structures are often used as a convenient tool even when they could be avoided with a little effort. Many-sorted algebraic structures are rarely defined in a rigorous way, because it is straightforward to carry out the generalization explicitly. That's why theory of many-sorted algebras can be successfully applied to software design.

So, Alex Stepanov wanted to say that he prefer ADTs and generic programming to OOP, because thus we can create programs with rigorous mathematical/algebraical foundations. I appreciate his efforts a lot. We all know that algebraical design is always correct, rigorous, beautiful, simple and gives us better abstractions.

score 0 · Answer 3 · answered May 01 '18 at 08:45

Not that I am an expert with the theory of any of those but let's take a look at the quote so that I can try to give my practical understanding to add to the discussion.

To deal with the real problems you need multisorted algebras - families of interfaces that span multiple types.

From my readings, I think families of interfaces that span multiple types sounds a lot like type classes from Haskell, which is similar to concepts from C++. Take a type class like Foldable it actually is a type parametrized interface, ie. a familiy of interfaces that span multiple types. So about your question of how to solve problems with multisorted algebras, generic programming is all about that if you take it to mean type classes or concepts.

What is a "multisorted algebra", and how do I use it to solve "real problems"?

3 Answers3