5

Do the map functions (Seq.map, List.map etc) have an implicit post-condition that the output has the same number of items as the input? Going further, if we had some kind of Tree.map function, is there an assumption that the "shape" of the input and output trees are the same?

The reason that I ask is that I had always made such an assumption (and I suspect that a lot of code that maps over sequences does too), but then I discovered that Set.map can return a smaller set if the mapping function produces duplicates. So either my assumption is invalid, or Set should not be treated as a sequence for mapping purposes. Which is it?

Guy Coder
  • 24,501
  • 8
  • 71
  • 136
Akash
  • 2,311
  • 1
  • 20
  • 37

5 Answers5

4

Many points of view, here is mine:

We can think of all map functions/methods as specific cases of Haskell's fmap of Functors. From that definition we can assume that the structure will be preserved (plus some other interesting properties).

But in .NET there are no Typeclasses so we can define map over 'restricted Functors', the consequence is some Functor properties will not be preserved, but since there is no generic code that will be affected the impact is limited.

So nothing stops us to define map over:

  • Sets (restriction: the function must be 'a->'b when 'a and 'b: comparison and should be injective)
  • Strings (restriction: the function should be char->char)
  • Nullables (restriction: the function should be 'a->'b where 'b is not a reference type)

Notice that in some cases there are restrictions at both type and value level, for example for sets the restriction at type level is both types 'a and 'b should have comparison while the restriction over the function value is that the function should be injective.

If the language is able to express the type level constraints the compiler will throw an error when these requirements are not met.

For function values there are no compile-time restrictions though we can create unit tests if we want to ensure they are correct. But what would happen if we don't care about constraining those functions?

Well, as long as we understand that some Functor properties will not be obeyed there's nothing wrong in using a map over a restricted Functor.

So we can define a map over structures like sorted lists, of course we can't assume that map a >> map b will be always equivalent to map (a >> b) in these cases. The restriction here is that the function should be monotonically increasing.

NOTE: For Haskell there is a package with a restricted functor and an instance for Sets

Gus
  • 25,839
  • 2
  • 51
  • 76
  • All of the answers have really helped, but I think this one is the clearest. I was definitely getting confused by equating the F# map functions with the Haskell functor map. – Akash Nov 27 '14 at 12:51
3

Yes, I'd expect a map function to respect the structure of the input (though many implementations probably wouldn't have an explicit test).

In the case of Set.map, one could argue that given implementation of (parametric) map itself is correct, but the argument function has to be injective for the overall mapping function to be structure-preserving. So really, for sets, it's a combined property of the 2 functions.

It would be easy to wrap Set.map with some validation that tests for injectivity of the argument function as it gets applied.

Mau
  • 14,234
  • 2
  • 31
  • 52
3

Sets are a bit tricky, because you can only create set of things for which you can test whether they are equal. In fact, F# sets are actually represented as trees, so you need to be able to compare them.

This also means that the map function for sets is not the same as the map function for lists:

List.map : ('a -> 'b) -> 'a list -> 'b list
Set.map  : ('a -> 'b) -> Set<'a> -> Set<'b>) when 'a : comparison and 'b : comparison

The fact that you have to be able to compare the 'b values explains why the map function on sets can do more than a regular map function on lists and sequences. So, it is not a normal map operation!

(Of course, there are other possible ways to break this in F# - the map function could return an empty list - but then the inferred type of the result would be 'c list so that would also be different kind of map).

Tomas Petricek
  • 240,744
  • 19
  • 378
  • 553
  • Ah, the function signatures are helpful! All of which made me think of Functors...and it turns out that Sets are not Functors (http://stackoverflow.com/a/19192745/32413) – Akash Nov 25 '14 at 16:44
2

Yes, generally you'd expect map to be shape-preserving (and therefore size-preserving). But for sets, this obviously cannot hold in general since sets must obey some additional laws (like there are no duplicate elements - so Set.map (f : X -> bool) will clearly not preserve the size of the set if it's applied to a set with more than two elements in it).

kvb
  • 54,864
  • 2
  • 91
  • 133
2

The term map stems from mathematics and simply refers to a function. In itself, it doesn't make a statement how the results are represented. I would assume that the answer depends on the type of which the contents are being mapped.

So either my assumption is invalid, or Set should not be treated as a sequence for mapping purposes. Which is it?

I'd say that the assumption is not valid in general, but should hold true where it can be reasonably applied. For example, if a tree requires a structure that depends on the contained values, and one such tree is mapped to another, it can be impossible to create a valid result that maintains the tree's structure.

However, you can treat a set as a sequence for mapping purposes, just like anything convertible to IEnumerable<>. Simply use Seq.map instead of Set.map.

Vandroiy
  • 6,163
  • 1
  • 19
  • 28
  • You *can* treat a set as a sequence for mapping purposes; I'm questioning whether that is actually a sensible thing to do. It seems that "any `IEnumerable<>` is mappable" together with "Set is an `IEnumerable<>`" is an unfortunate combination. – Akash Nov 25 '14 at 16:49
  • 1
    @Akash I don't see this as a problem. The respective map functions have different meaning, and the results have different types. `map` isn't meaningful in itself; it just means "apply function". However, "map the elements of this set to form another set" is meaningful, as is "map this sequence to a new sequence". They just follow different rules and have a different type as result. It is, and should be, a conscious choice which to use. – Vandroiy Nov 25 '14 at 17:55