15

Consider a function choose in Curry programming language with the specification that "(choose xs) non-deterministically chooses one element from the list xs".

I'd implement it straighforwardly through two alternative non-deterministic rules:

choose :: [a] -> a
choose x:_ = x
choose _:xs = choose xs

But in /usr/lib/curry-0.9.11/Success.curry from Muenster Curry Compiler, it is defined with a helper function:

choose (x:xs) = choosep x xs
  where choosep x [] = x
        choosep x (_:_) = x
        choosep _ (x:xs) = choosep x xs

What could be the advantages (if any) of the definition from the compiler supplied module? Are the 2 definitions completely equivalent (even in some tricky cases with non-determinism and undefined values)?.. Is one of them more efficient in some cases?

ADDED: Deeper consideration

cthom06 (Thanks!) has correctly pointed out that my definition would cause hitting the undefined value in much more cases (because we would try to call this function with an empty-list argument once per each our "top-level" call with an initially non-empty list argument). (Hm, why haven't I noticed this consideration right away?..) That is less efficient.

But I wonder: are there any semantic differences? May the difference be important in some tricky contexts?

We see that the difference between the two definitions--in the case of non-empty lists--basically boils down to the difference between two potential definitions for id:

my definition is like defining id as:

id x = x
id _ = undefined

and their definition is like definig id the normal way:

id x = x

(So, here the straightforwardness is reverted.)

In which contexts can it be important?

imz -- Ivan Zakharyaschev
  • 4,921
  • 6
  • 53
  • 104
  • 2
    I may not be understanding correctly, but in your implementation it looks like you can end up getting stuck trying to choose a value from an empty list even when you originally pass it a non-empty list. – cthom06 Mar 01 '11 at 13:30
  • 2
    @cthom06: "getting stuck" in this manner is not changing the semantics AFAIU: that variant of the computation is simply discarded then. But perhaps it affects the efficiency a bit. So, the question boils down to: what are the big differences (w.r.t. semantics--any tricky cases, or w.r.t. efficiency) of defining `id` with one rule `id x = x` or with an extra bogus rule: `id x = x` and `id _ = undefined`? I wonder: are there tricky cases when the semantics can change? – imz -- Ivan Zakharyaschev Mar 01 '11 at 13:55
  • @cthom06, why not make that an answer? – Ben Mar 11 '11 at 13:11
  • @Ben, I was really just hazarding a guess based on what I saw, I've never used Curry and only done a little Haskell. – cthom06 Mar 11 '11 at 13:24
  • BTW, [is the "p" suffix in `choosep` some naming convention for helper functions?](http://stackoverflow.com/q/5279286/94687) – imz -- Ivan Zakharyaschev Mar 11 '11 at 23:01
  • I think `id x = x` and `id _ = undefined` will lead to a not-deterministic id. Half the times it will give x and the other halve undefined. I am not sure, I don't have the compiler here. But it feels really awkward to write. – Edgar Klerks Aug 25 '12 at 21:11
  • try this in the interactive editor: `p x = 1` and `p _ = error "boom"` then solve for a constraint: `p :=: 2` and it will go boom. [curryOnline](http://www-ps.informatik.uni-kiel.de/currywiki/documentation/try_curry) – Edgar Klerks Aug 25 '12 at 21:31
  • @Edgar: in `cyi` from Muenster Curry Compiler, if I enter `p :=: 2`, I get ":=: is undefined". What is `:=:`? – imz -- Ivan Zakharyaschev Aug 26 '12 at 03:57
  • @imz Sorry that was a typo. With the =:= you make an equation. Curry will the try to solve it. For example `((x || y) && x) =:= True where x, y free` will give all x,y which conforms to the equation. (x || y) && x == True – Edgar Klerks Aug 26 '12 at 06:54
  • @Edgar: then there's another typo as well: the left-hand side is of type _ -> Int, but the right-hand side is of type Int in your equation. – imz -- Ivan Zakharyaschev Aug 26 '12 at 10:22
  • @Edgar: I see your point. But that's not an essentail difference: printing errors is just a side effect, not changing the sematics of the expressions. The constraint is not solved. – imz -- Ivan Zakharyaschev Aug 26 '12 at 10:24
  • @imz I could only use the online version, so I had to force the expression to 1 solution somehow. No, my point is that `id x = x` is different from: `id x = x` and `id _ = undefined`. This gives me for `id 1` 1 and then an error. The semantics are changed, because if I rearrange the two equations to: `id _ = undefined` and `id x = x`. I never get a result from the second equation. `id 1` gives just an error. `undefined = error "undefined"`. On kics2 there is no undefined. – Edgar Klerks Aug 27 '12 at 08:20
  • @Edgar: Ok, let me summarize. The difference is in the (number of) intermediate solutions, in how the interpreter works, in side-effects, but probably not in the fully computed result unless there was some encapsulated search. Also, laziness plays a role, and perhaps there can be differences in (the number of) output solutions, if we take only the "head" of an expression without hitting "undefined"/error. But IIUC your example doesn't show that different results are printed out for different definitions of `id`. They only differ in the error messages,but this can't count as a semantic diff – imz -- Ivan Zakharyaschev Aug 27 '12 at 12:38
  • @imz I am not sure about it is only printing an error message. If it is modelled after haskell, error gives an actual value (bottom). If that is true than`id x = x` differs from `id x = x and id _ = undefined`. – Edgar Klerks Aug 27 '12 at 14:14
  • @Edgar: aha,I see.Yes,thinking in terms of the bottom value can be helpful here. But what does the bottom value become under Curry's generelized view?And how does it "behave" in Curry when it arises near another value? 1st approach would be to view the bottom as no solutions in Curry (empty set of non-bottom values),whereas other sets of non-bottom values are the other possible results in Curry. Then in case of nondeterminism the result must be a union, and bottom "disappears" if we have a union of bottom and another set. Is the sematics like that? 2nd: results are sets of values incl. bottom. – imz -- Ivan Zakharyaschev Aug 28 '12 at 04:07
  • @imz I think the first would be most reasonable. There exists a no-solutions symbol (failed). For error the documentation explicitly states that execution is halted ([ref](http://www-ps.informatik.uni-kiel.de/kics2/lib/CDOC/Prelude.html#error)). While failed is said to expres failure in the search tree. So error and failed are treated differently. If we define undefined, then we should define it as `undefined = failed` then there is no difference, between the two definitions of `id`. So you are right, because I had the definition of undefined wrong. `error` is something special. Thanks fyi. – Edgar Klerks Aug 28 '12 at 12:19
  • Here it is stated, that bottom represents a failure: [kics2_paper](http://www.informatik.uni-kiel.de/~mh/papers/WFLP11_KiCS2.pdf). So `head (a : xs) = a` behaves as `head (x:xs) = x` and `head [] = failed`. But how should we interpret `error`? – Edgar Klerks Aug 28 '12 at 12:37

1 Answers1

2

I believe there are no semantic differences, but the one with the helper function is more efficient, particularly in the common case (in certain styles of programming) of a list with one element. In this case a choice point is avoided which with your version would need to be set up to call recursively with [] which is then bound to fail.

A smarter optimizer might figure this out for itself, but handling all kinds of similar situations is likely to be challenging - the compiler would need to try to specialize applications for each of the possible constructors in a datatype, remove those where failure always occurs, and remove choice points when only one possibility remains.

RD1
  • 3,305
  • 19
  • 28
  • My other comment applies here, too: The difference is in the (number of) intermediate solutions, in how the interpreter works, maybe in side-effects (think of "error" as a side-effect), but probably not in the fully computed result unless there was some encapsulated search. Also, laziness plays a role, and perhaps there can be differences in (the number of) output solutions, if we take only the "head" of an expression without hitting "undefined"/error. – imz -- Ivan Zakharyaschev Aug 27 '12 at 12:41