18

What is the precise promise/guarantee the Haskell language provides with respect to referential transparency? At least the Haskell report does not mention this notion.

Consider the expression

(7^7^7`mod`5`mod`2)

And I want to know whether or not this expression is 1. For my safety, I will do perform this twice:

( (7^7^7`mod`5`mod`2)==1, [False,True]!!(7^7^7`mod`5`mod`2) )

which now gives (True,False) with GHCi 7.4.1.

Evidently, this expression is now referentially opaque. How can I tell whether or not a program is subject to such behavior? I can inundate the program with :: all over but that does not make it very readable. Is there any other class of Haskell programs in between that I miss? That is between a fully annotated and an unannotated one?

(Apart from the only somewhat related question I found on SO there must be something else on this)

Community
  • 1
  • 1
false
  • 10,264
  • 13
  • 101
  • 209
  • When I run it I get `(True, False)`, with ghci 7.4.2. – ely Nov 19 '14 at 15:00
  • @bheklilr: updated to reflect version number, but I doubt this is version specifc. – false Nov 19 '14 at 15:01
  • @false It seems to be, I'm using 7.8.3, and copy/pasting your exact code into GHCi gives me `(True, True)`. My guess to what is happening is that for some reason in GHC 7.4.2, the second one is getting parsed as ```((7^7^7) `mod` (5 `mod` 2))```, although I would think that it affect both sides. It's possible that there was some bug in the parser, but that's hard for me to determine without digging through the bug tracker or installing 7.4 – bheklilr Nov 19 '14 at 15:06
  • @bheklilr: I suspect type inference, but maybe I a wrong. – false Nov 19 '14 at 15:07
  • I don't know if it's helpful, but if you first ``let f x = ((7^7^7 `mod` 5) `mod` x)`` and then do `( (f 2)==1, [False,True]!!(f 2) )` it still gives `(True, False)` on ghci 7.4.2. I was thinking maybe somehow strictness of `(!!)` caused something weird, but this suggests it is not parsing or strictness. – ely Nov 19 '14 at 15:22
  • 3
    Couldn't this have to to with defaulting? `(!!)` infers `Int`, and trying to experiment with that gives me some rather weird results: ```(-3568518334133427593 `mod` 5 `mod` 2) :: Int == -1```, ```(-3568518334133427593 `mod` 5 :: Int) `mod` 2 == 1```, and ```(-3568518334133427593 :: Int) `mod` 5 `mod` 2 == 0``` (in ghci 7.6.3). – phipsgabler Nov 19 '14 at 15:29
  • 10
    Use `-fwarn-type-defaults` so you can see when the compiler has used defaulting to pick a type for you. You can also add the line `default ()` to your module to stop all defaulting. Then you will be forced to give the type of the first expression, and you'll see that `Int` vs `Integer` makes a difference. – augustss Nov 19 '14 at 16:03
  • 1
    -1 since I think this sounds like a tendentious polemic argument and not a real question. – Jonathan Cast Nov 19 '14 at 16:10
  • 3
    @bheklilr I expect the breakdown is 32-bit vs 64-bit architecture, not old compiler vs new compiler. Remember: the size of an `Int` isn't defined (except to demand at least 29 bits or some weird number like that). Try `7^7^7 \`mod\` 5 \`mod\` 2 :: Int32` in ghci, and similarly for `Int64`. – Daniel Wagner Nov 19 '14 at 18:22
  • @DanielWagner I definitely see that now, I was just trying to figure out why I got a different answer in 7.8 than in 7.4! – bheklilr Nov 19 '14 at 19:03
  • @bheklilr Perhaps you have a 32-bit build of 7.4 installed or something like that. – Daniel Wagner Nov 19 '14 at 19:19

7 Answers7

20

I do not think there's any guarantee that evaluating a polymorphically typed expression such as 5 at different types will produce "compatible" results, for any reasonable definition of "compatible".

GHCi session:

> class C a where num :: a
> instance C Int    where num = 0
> instance C Double where num = 1
> num + length []  -- length returns an Int
0
> num + 0          -- GHCi defaults to Double for some reason
1.0

This looks as it's breaking referential transparency since length [] and 0 should be equal, but under the hood it's num that's being used at different types.

Also,

> "" == []
True
> [] == [1]
False
> "" == [1]
*** Type error

where one could have expected False in the last line.

So, I think referential transparency only holds when the exact types are specified to resolve polymorphism. An explicit type parameter application à la System F would make it possible to always substitute a variable with its definition without altering the semantics: as far as I understand, GHC internally does exactly this during optimization to ensure that semantics is unaffected. Indeed, GHC Core has explicit type arguments which are passed around.

chi
  • 111,837
  • 3
  • 133
  • 218
18

The problem is overloading, which does indeed sort of violate referential transparency. You have no idea what something like (+) does in Haskell; it depends on the type.

When a numeric type is unconstrained in a Haskell program the compiler uses type defaulting to pick some suitable type. This is for convenience, and usually doesn't lead to any surprises. But in this case it did lead to a surprise. In ghc you can use -fwarn-type-defaults to see when the compiler has used defaulting to pick a type for you. You can also add the line default () to your module to stop all defaulting.

augustss
  • 22,884
  • 5
  • 56
  • 93
15

I thought of something which might help clarify things...

The expression mod (7^7^7) 5 has type Integral a so there are two common ways to convert it to an Int:

  1. Perform all of the arithmetic using Integer operations and types and then convert the result to an Int.
  2. Perform all of the arithmetic using Int operations.

If the expression is used in an Int context Haskell will perform method #2. If you want to force Haskell to use #1 you have to write:

fromInteger (mod (7^7^7) 5)

This will ensure that all of the arithmetic operations will be performed using Integer operations and types.

When you enter he expression at the ghci REPL, defaulting rules typed the expression as an Integer, so method #1 was used. When you use the expression with the !! operator it was typed as an Int, so it was computed via method #2.

My original answer:

In Haskell the evaluation of an expression like

(7^7^7`mod`5`mod`2)

depends entirely on which Integral instance is being used, and this is something that every Haskell programmer learns to accept.

The second thing that every programmer (in any language) has to be aware of is that numeric operations are subject to overflow, underflow, loss of precision, etc. and thereby the laws for arithmetic may not always hold. For instance, x+1 > x is not always true; addition and multiple of real numbers is not always associative; the distributive law does not always hold; etc. When you create an overflowing expression you enter the realm of undefined behavior.

Also, in this particular case there are better ways to go about evaluating this expression which preserves more of our expectation of what the result should be. In particular, if you want to efficiently and accurately compute a^b mod c you should be using the "power mod" algorithm.

Update: Run the following program to see how the choice of Integral instance affects the what an expression evaluates to:

import Data.Int
import Data.Word
import Data.LargeWord -- cabal install largeword

expr :: Integral a => a
expr = (7^e `mod` 5)
  where e = 823543 :: Int

main :: IO ()
main = do
  putStrLn $ "as an Integer: " ++ show (expr :: Integer)
  putStrLn $ "as an Int64:   " ++ show (expr :: Int64)
  putStrLn $ "as an Int:     " ++ show (expr :: Int)
  putStrLn $ "as an Int32:   " ++ show (expr :: Int32)
  putStrLn $ "as an Int16:   " ++ show (expr :: Int16)
  putStrLn $ "as a Word8:    " ++ show (expr :: Word8)
  putStrLn $ "as a Word16:   " ++ show (expr :: Word16)
  putStrLn $ "as a Word32:   " ++ show (expr :: Word32)
  putStrLn $ "as a Word128:  " ++ show (expr :: Word128)
  putStrLn $ "as a Word192:  " ++ show (expr :: Word192)
  putStrLn $ "as a Word224:  " ++ show (expr :: Word224)
  putStrLn $ "as a Word256:  " ++ show (expr :: Word256)

and the output (compiled with GHC 7.8.3 (64-bit):

as an Integer: 3
as an Int64:   2
as an Int:     2
as an Int32:   3
as an Int16:   3
as a Word8:    4
as a Word16:   3
as a Word32:   3
as a Word128:  4
as a Word192:  0
as a Word224:  2
as a Word256:  1
ErikR
  • 51,541
  • 9
  • 73
  • 124
  • 1
    `x+1 > (x::Integer)` is always true, never false. At worst there is a resource error. Your considerations only apply to `Int` (module arithmetics) and floats. – false Nov 19 '14 at 15:54
  • 1
    but the point is that `Int` is an `Integral` type, so `x+1 > x` does _not_ hold for all `Integral` types. The expression in question has type `Integral a`. – ErikR Nov 19 '14 at 16:20
  • 1
    Haskell will never under any circumstances choose method #1 for you. It will either do all the operations as `Integer` and end up with an `Integer` or do all the operations as `Int` and end up with an `Int`. There will be no conversion unless you call conversion functions explicitly, which is writing a different program. – Ben Nov 19 '14 at 20:41
7

What is the precise promise/guarantee the Haskell language provides with respect to referential transparency? At least the Haskell report does not mention this notion.

Haskell does not provide a precise promise or guarantee. There exist many functions like unsafePerformIO or traceShow which are not referentially transparent. The extension called Safe Haskell however provides the following promise:

Referential transparency — Functions in the safe language are deterministic, evaluating them will not cause any side effects. Functions in the IO monad are still allowed and behave as usual. Any pure function though, as according to its type, is guaranteed to indeed be pure. This property allows a user of the safe language to trust the types. This means, for example, that the unsafePerformIO :: IO a -> a function is disallowed in the safe language.

Haskell provides an informal promise outside of this: the Prelude and base libraries tend to be free of side effects and Haskell programmers tend to label things with side effects as such.

Evidently, this expression is now referentially opaque. How can I tell whether or not a program is subject to such behavior? I can inundate the program with :: all over but that does not make it very readable. Is there any other class of Haskell programs in between that I miss? That is between a fully annotated and an unannotated one?

As others have said, the problem emerges from this behavior:

Prelude> ( (7^7^7`mod`5`mod`2)==1, [False,True]!!(7^7^7`mod`5`mod`2) )
(True,False)
Prelude> 7^7^7`mod`5`mod`2 :: Integer
1
Prelude> 7^7^7`mod`5`mod`2 :: Int
0

This happens because 7^7^7 is a huge number (about 700,000 decimal digits) which easily overflows a 64-bit Int type, but the problem will not be reproducible on 32-bit systems:

Prelude> :m + Data.Int
Prelude Data.Int> 7^7^7 :: Int64
-3568518334133427593
Prelude Data.Int> 7^7^7 :: Int32
1602364023
Prelude Data.Int> 7^7^7 :: Int16
8823

If using rem (7^7^7) 5 the remainder for Int64 will be reported as -3 but since -3 is equivalent to +2 modulo 5, mod reports +2.

The Integer answer is used on the left due to the defaulting rules for Integral classes; the platform-specific Int type is used on the right due to the type of (!!) :: [a] -> Int -> a. If you use the appropriate indexing operator for Integral a you instead get something consistent:

Prelude> :m + Data.List
Prelude Data.List> ((7^7^7`mod`5`mod`2) == 1, genericIndex [False,True] (7^7^7`mod`5`mod`2))
(True,True)

The problem here is not referential transparency because the functions that we're calling ^ are actually two different functions (as they have different types). What has tripped you up is typeclasses, which are an implementation of constrained ambiguity in Haskell; you have discovered that this ambiguity (unlike unconstrained ambiguity -- i.e. parametric types) can deliver counterintuitive results. This shouldn't be too surprising but it's definitely a little strange at times.

CR Drost
  • 9,637
  • 1
  • 25
  • 36
5

A another type has been chosen, because !! requires an Int. The full computation now uses Int instead of Integer.

λ> ( (7^7^7`mod`5`mod`2 :: Int)==1, [False,True]!!(7^7^7`mod`5`mod`2) )
(False,False)
vek
  • 1,523
  • 10
  • 18
  • My question was how this relates to referential transparency. – false Nov 19 '14 at 15:35
  • 1
    Sorry. I only saw people wonder what happened here. I would say that it has nothing to do with referential transparency, because rt guarantees that a value is the same in same context, but here it isn't the same context. – vek Nov 19 '14 at 15:44
  • 6
    @false It doesn't. Referential transparency just means that you can always substitute an expression with the value of that expression without changing the result (given the expression does evaluate to something). This is still true even if you find the result you're getting confusing. – Cubic Nov 19 '14 at 15:44
4

What you think this has to do with referential transparency? Your uses of 7, ^, mod, 5, 2, and == are applications of those variables to dictionaries, yes, but I don't see why you think that fact makes Haskell referentially opaque. Often applying the same function to different arguments produces different results, after all!

Referential transparency has to do with this expression:

let x :: Int = 7^7^7`mod`5`mod`2 in (x == 1, [False, True] !! x)

x is here a single value, and should always have that same single value.

By contrast, if you say:

let x :: forall a. Num a => a; x = 7^7^7`mod`5`mod`2 in (x == 1, [False, True] !! x)

(or use the expression inline, which is equivalent), x is now a function, and can return different values depending on the Num argument you supply to it. You might as well complain that let f = (+1) in map f [1, 2, 3] is [2, 3, 4], but let f = (+3) in map f [1, 2, 3] is [4, 5, 6] and then say "Haskell gives different values for map f [1, 2, 3] depending on the context so it's referentially opaque"!

Jonathan Cast
  • 4,569
  • 19
  • 34
  • 8
    It is reasonable for one (especially if new to Haskell) to expect that an expression which commonly reduces to a value of a particular type (say `Int`) will behave the same way in a range of common contexts (and `x==1` and `!!x` would *surely* be contexts where it is [astonishing](http://en.wikipedia.org/wiki/Principle_of_least_astonishment) to learn that it doesn't work that way). While your answer is not wrong, it does seem to make normative criticism of the OP for *even thinking* this is related to referential transparency, whereas it's *highly counter-intuitive* to learn that it's not. – ely Nov 19 '14 at 16:48
  • @prpl.mnky.dshwshr No, it's _not_ astonishing nor highly counterintuitive that the value of an overloaded function depends on the type of its call context. If you're astonished by that you have to be astonished that `1` can refer to `1.0::Float` or `1::Int` or `Mat [[1,0],[0,1]] :: Num a => Matrix a`. The surprise in this question is all about `Int` overflowing and `Integer` not. jcast might be critical but your response is hyperbole. – AndrewC Nov 20 '14 at 00:05
2

Probably another type-inference and referential-transparency related thing is the „dreaded“ Monomorphism restriction (its absence, to be exact). A direct quote:

An example, from „A History of Haskell“:
Consider the genericLength function, from Data.List

genericLength :: Num a => [b] -> a

And consider the function:

f xs = (len, len) where len = genericLength xs

len has type Num a => a and, without the monomorphism restriction, it could be computed twice.

Notice that in this case types of both expressions are the same. Results are too, but the substitution isn't always possible.

Yuuri
  • 1,858
  • 1
  • 16
  • 26