60

Consider the Object-Oriented Languages:

Most people coming from an object-oriented programming background, are familiar with the common and intuitive interfaces in various languages that capture the essence of Java's Collection & List interfaces. Collection refers to a collection of objects which doesn't necessarily have an natural ordering/indexing. A List is a collection which has a natural ordering/indexing. These interfaces abstract many library data-structures in Java, as do their equivalent interfaces in other languages, and an intimate understanding of these interfaces are required to work effectively with most library data-structures.

Transition to Haskell:

Haskell has a type-class system which acts on types analogously to interfaces on objects. Haskell seems to have a well designed type-class hierarchy with regard to Functors, Applicative, Monads, etc. when the type regard functionality. They obviously want correct and well-abstracted type-classes. Yet when you look at many Haskell's containers (List,Map,Sequence,Set,Vector) they almost all have very similar (or identical) functions, yet aren't abstracted through type-classes.

Some Examples:

  • null for testing "emptyness"
  • length/size for element count
  • elem/member for set inclusion
  • empty and/or singleton for default construction
  • union for set union
  • (\\)/diff for set difference
  • (!)/(!!) for unsafe indexing (partial function)
  • (!?)/lookup for safe indexing (total function)

If I want to use any of the functions above, but I have imported two or more containers I have to start hiding functions from the imported modules, or explicitly import only the necessary functions from the modules, or qualifying the imported modules. But since all the functions provide the same logical functionality, it just seems like a hassle. If the functions were defined from type-classes, and not separately in each module, the compiler's type inference mechanics could resolve this. It would also make switching underlying containers simple as long as they shared the type-classes (ie: lets just use a Sequence instead of List for better random access efficiency).

Why doesn't Haskell have a Collection and/or Indexable type-class(es) to unify & generalize some of these functions?

recursion.ninja
  • 5,377
  • 7
  • 46
  • 78
  • 10
    [Collection](http://hackage.haskell.org/package/EdisonAPI), [Indexable](http://hackage.haskell.org/package/keys) – Daniel Wagner Aug 07 '14 at 20:40
  • 6
    see http://stackoverflow.com/a/8484117/257418 – Lynn Aug 07 '14 at 20:46
  • @DanielWagner Sure, but none of the library containers are instances of these type-classes. Isn’t terribly usable out-of-the-box... – recursion.ninja Aug 07 '14 at 20:47
  • 3
    To comment briefly on my close vote: since there evidently *are* libraries providing the typeclass in question, it seems to me the most charitable way to interpret this question might be "Why aren't people using these typeclasses?". I believe it's difficult to answer this question in an objective, useful way. – Daniel Wagner Aug 07 '14 at 20:48
  • 3
    Some food for the mind: How would a library handle additional constraints? Compare `isMember :: Ord k => k -> Set k -> Bool` vs `isMember :: a -> [a] -> Bool`. Or indexing: `at :: Int -> [a] -> Maybe a` vs `at :: Unbox a => Int -> Vector a -> Maybe a` (for unboxed vectors). Other than that, I concur with Daniel, it's hard to answer in an objective way. If you can create your specific version of `Collection`, go for it, and add it to hackage. – Zeta Aug 07 '14 at 20:49
  • 2
    @awashburn: I wouldn't recommend `Collection`, but `Indexable` works out-of-the-box allright and has all the instances you asked for. And of course it's much more sophisticated than anything you could possibly write in Java... – leftaroundabout Aug 07 '14 at 20:50
  • @Zeta `elem :: (Eq a) => a -> Collection a -> Bool` doesn't seem that impossible to abstract by making `Map` & `List` instances of the fictional `Collection` type-class. – recursion.ninja Aug 07 '14 at 20:54
  • @awashburn: You probalby mean `elem :: (Eq a, Collection c) => a -> c a -> Bool`. However, in this case you can't use the logarithmic `member` search of `Map` and `Set`, since both need `Ord a` to find an element efficiently. That way you're loosing a great part of the performance (O(log n) -> O(n)). – Zeta Aug 07 '14 at 20:59
  • 3
    @awashburn: { ... duplicate of Zeta's comment, removed ... } That said, an efficient generic `elem` would actually be possible with `ConstraintKinds`. – leftaroundabout Aug 07 '14 at 21:01
  • 1
    Minor note: you can test for emptiness using a `Monoid` and `Eq` constraint: `null :: (Monoid m, Eq m) => m -> Bool ; null x = x == mempty` – Gabriella Gonzalez Aug 07 '14 at 21:45
  • Sequence has not better random access; Sequence from Data.Sequence is a DeQue (a double ended queue) with constant access from both heads but accessing in between has O(log(min(i,n-i))) as [documented](http://www.haskell.org/ghc/docs/latest/html/libraries/containers/Data-Sequence.html#v:index) – Gabriel Riba Aug 07 '14 at 22:37
  • @GabrielRiba **`O(log(min(i,n-i))`** is better then **`O(n)`**, right? – recursion.ninja Aug 07 '14 at 22:50
  • Indeed, finger trees are nice like that; of course, you pay for that flexibility with larger constant factors. So if you *only* need random access, or *only* need access at one/both ends, you should use a different structure. – dfeuer Aug 07 '14 at 23:53
  • @leftaroundabout I like the [**`Indexable`**](http://hackage.haskell.org/package/keys) type-class you guys pointed out. It's just sad that it is built on top of the containers in another module, rather then with the containers in their respective modules. The type-class consequently has poor viability for Haskell programmers, making it harder to adopt it as a standardization... – recursion.ninja Aug 08 '14 at 17:13
  • 1
    Take a look at [IsSequence](http://hackage.haskell.org/package/mono-traversable-0.6.1/docs/Data-Sequences.html) and other [containers](http://hackage.haskell.org/package/mono-traversable-0.6.1/docs/Data-Containers.html) in [mono-traversable](http://hackage.haskell.org/package/mono-traversable) – Greg Weber Sep 27 '14 at 02:41

7 Answers7

42

As other answers have pointed out, Haskell tends to use different vocabulary. However, I don't think they've explained the reason for the difference very well.

In a language like Java, functions are not "first class citizens"; it's true that anonymous functions are available in the latest versions, but this style of interface (Collection, Indexable, Interable, etc.) were designed before that.

This makes it tedious to pass our code around, so we prefer other people's data to be passed to our code. For example:

  • Data implementing Java's Iterable lets us write for (Foo x : anIterable) { ... }
  • Data implementing PHP's ArrayAccess lets us write anArrayAccess[anIndex]

This style can also be seen in OO languages which implement generators, since that's another way for us to write for yieldedElement in aGenerator: ....

Haskell takes a different approach with its typeclasses: we prefer our code to be passed to other people's data. Some (simplified) examples:

  • Functors accept our code and apply it to any elements they 'contain'
  • Monads accept our code and apply it in some kind of 'sequence'
  • Foldables accept our code and use it to 'reduce' their contents

Java only needs Iterable since we have to call our code in our for loop, so we can make sure it's called correctly. Haskell requires more specific typeclasses since someone else's code will be calling ours, so we need to specify how it should be called; is it a map, a fold, an unfold, etc.?

Thankfully, the type system helps us choose the right method ;)

Warbo
  • 2,611
  • 1
  • 29
  • 23
  • Cool answer - I believe it gives the most fundamental reason for the lack of "data interface" typeclass. But it also invites us to look data access problems the "other way around". – Titou Aug 21 '15 at 15:03
37

The lens package provides some of this.

  • Testing for emptiness, creating empty containers These are both provided by the AsEmpty typeclass from Control.Lens.Empty.

  • Accessing elements by key/index. The At and Ixed typeclasses from Control.Lens.At.

  • Checking for membership in set-like containers. The Contains typeclass from Control.Lens.At.

  • Appending and deleting elements to sequence-like containers. The Cons and Snoc typeclasses from Control.Lens.Cons.

Also, the pure method of the Applicative typeclass can often be used to create "singleton" containers. For things that are not functors/applicatives in Haskell, like Set, perhaps point from Data.Pointed could be used.

danidiaz
  • 26,936
  • 4
  • 45
  • 95
  • 4
    I really think that many of lens's typeclasses are solutions to Haskell's module problems, including "programming against interfaces" like in this example. – Tikhon Jelvis Aug 07 '14 at 23:29
28

Haskell has some type classes for working with collections in the base package: Functor, Foldable and Traversable can be useful for working with collections, and the Monoid, Applicative and/or Alternative typeclasses can be useful for constructing collections.

Together, these classes cover most of the operations mentioned in the question, but maybe less efficient than a more container-specific function (though many of these are class methods, whose default definitions can be overriden if necessary).

null for testing "emptyness"

Foldable supports null since base 4.8 (any (const True) is an alternative for earlier versions).

length/size for element count:

Foldable supports length since base 4.8 (getSum . foldMap (const 1) is an alternative for earlier versions).

elem/member for set inclusion

Foldable supports elem, notElem and member.

empty and/or singleton for default construction

For empty, there is mempty from Monoid and empty from Alternative. For singleton, there is pure from Applicative.

union for set union

There is mappend from Monoid and <|> from Alternative. They don't necessarily implement set union, but they implement some form of union that works well together with empty and usually also with singleton and find.

(\)/diff for set difference

This one is not supported, unfortunately.

(!)/(!!) for unsafe indexing (partial function)

You could use fromJust together with a function for safe indexing.

(!?)/lookup for safe indexing (total function)

There is find from Foldable.

duplode
  • 33,731
  • 7
  • 79
  • 150
Toxaris
  • 7,156
  • 1
  • 21
  • 37
  • 4
    Does `elem` from `Foldable` actually use the internal structure of the container to speed up the check? If not, using it instead of container-specific functions could hurt performance. – danidiaz Aug 07 '14 at 21:44
  • Good point, it looks like `elem` just scans through the container. – Toxaris Aug 07 '14 at 22:11
  • 2
    I feel that, while functional, these suggestions don't correctly capture the intended abstractions. The *intention* in many of your examples are not readily deductible... The info from [**`Foldable`**](http://hackage.haskell.org/package/base-4.7.0.1/docs/Data-Foldable.html) is interesting though! – recursion.ninja Aug 07 '14 at 22:13
  • 1
    You can think of `Foldable` as a fancy way of saying there's a function `Foldable f => f a -> [a]`---so then all instantiators inherit a lot of functionality from `List`. – J. Abrahamson Aug 08 '14 at 02:16
  • 2
    a minor improvement on your `length`: `getSum . foldMap (const 1)` – cdk Aug 08 '14 at 12:46
  • 2
    With more recent versions of *base* (>= 4.8), `null` and `length` are methods of `Foldable`. – duplode Dec 17 '16 at 05:51
25

Partly, the reason is that monads and arrows are new, innovative features of Haskell, while collections are relatively more mundane. Haskell has a long history as a research language; interesting research questions (designing monad instances & defining generic operations for monads) get more development effort than "industrial-strength" polishing (defining container APIs).

Partly, the reason is that those types come from three different packages (base, containers, and vector), with three separate histories and designers. That makes it harder for their designers to coordinate on providing instances of any single type class.

Partly, the reason is that defining a single type class to cover all five of the containers you mentioned is really hard. List, Sequence, and Vector are relatively similar, but Map and Set have completely different constraints. For List, Sequence, and Vector, you want a simple constructor class, but for Set that won't work, since Set requires an Ord instance on the element type. Even worse, Map can support most of your methods, but its singleton function needs two parameters where the rest need only one.

Jonathan Cast
  • 4,569
  • 19
  • 34
13

Such typeclasses exist in standard Haskell, but they don't have the same name as their equivalent OO counterparts. The Collection typeclass, for example, is called Foldable in Haskell. You can use it to test if a structure is empty (foldr (const False) True x) or to count the number of elements (foldMap (const 1) x), or to test for set membership (foldr (\e' present -> (e==e') || present) False x for some e).

For operations like element lookup, you have the Array typeclass which might work for sequential data. For more flexibility, you can write your own Indexable class, like this for example (beware of lenses) :

class Indexable m k a where
  at :: k -> Lens' m (Maybe a)

The null element and set union belong to the Monoid typeclass (where mappend == union). In this light, set difference could also be implemented in its own typeclass Differentiable (which I'm sure already exists in dozens of Haskell libraries) and we would have total compatibility with imperative languages.

Haskell, by virtue of being designed by mathematicians and the like, doesn't employ the same vocabulary as most other languages, but rest assured, it doesn't mean that it's not a practical language in addition to being an awesome one :-)

svick
  • 236,525
  • 50
  • 385
  • 514
user3340662
  • 191
  • 4
12

Laws. A good typeclass has laws. A great typeclass has enough parametricity so that it's laws are "theorems for free". A typeclass without laws is just ad-hoc name overloading.

Also, check out classy-prelude and Edison-API.

9

You have typeclasses for different collection aspects:

  1. composition: Monoid (module Data.Monoid)

  2. sequential control: Applicative, Monad (modules Control.Applicative, Control.Monad)

  3. sequential composition: Alternative, MonadPlus (modules Control.Applicative, Control.Monad)

  4. non-sequential mapping and reduction: Functor (mod. Data.Functor), Foldable (mod. Data.Foldable)

  5. sequential mapping and reduction: Traversable (module Data.Traversable)

  6. serialisation: Binary (mod. Data.Binary)

  7. comparison: Eq, Ord (mod. Data.Eq, Data.Ord)

  8. textualisation: Show, Read

  9. deep evaluation (to Normal Form): NFData (mod. Control.DeepSeq)

  10. generic datatype traversability: Data (mod. Data.Data)

Except that monomorphic collections (ByteString, IntSet, Text) cannot implement Functor and Foldable (they require type arity == 1 (Kind: * -> *))

Also neither (Set a) implements Functor.

The package mono-traversable redefines some classes without the monomorphic types exclusion.

Update. There is an attempt to put most functions in typeclasses with the packages mono-traversable and classy-prelude.

library ref, platform

Gabriel Riba
  • 6,698
  • 2
  • 17
  • 19