
I am starting Haskell and was looking at some libraries where data types are defined with "!". Example from the bytestring library:

data ByteString = PS {-# UNPACK #-} !(ForeignPtr Word8) -- payload
                     {-# UNPACK #-} !Int                -- offset
                     {-# UNPACK #-} !Int                -- length

Now I saw this question as an explanation of what this means, and I guess it is fairly easy to understand. But my question now is: what is the point of using this? Since the expression will be evaluated whenever it is needed, why would you force the early evaluation?

In the second answer to this question C. V. Hansen says: "[...] sometimes the overhead of laziness can be too much or wasteful". Is that supposed to mean that it is used to save memory (saving the value is cheaper than saving the expression)?

An explanation and an example would be great!

Thanks!

[EDIT] I think I should have chosen an example without {-# UNPACK #-}. So let me make one myself. Would this ever make sense? If yes, why, and in what situation?

data MyType = Const1 !Int
            | Const2 !Double
            | Const3 !SomeOtherDataTypeMaybeMoreComplex
o1iver
  • Laziness can have lots of impact on the code, the main one being that lazy programs often run in linear space when strict ones need constant space. But in your example the real reason probably has to do with the data in question being manipulated by foreign functions which don't know how to force lazy evaluations. – n. m. could be an AI Jun 03 '11 at 19:45
  • Nope. The ForeignPtr can be passed to foreign functions, sure, but not the structure itself - even with strict/unpack annotations there's no defined ABI for native haskell structures – bdonlan Jun 03 '11 at 19:51

1 Answer


The goal here is not strictness so much as packing these elements into the data structure. Without strictness, any of those three constructor arguments could point either to a heap-allocated value or to a heap-allocated thunk representing a delayed evaluation. With strictness, each can only point to a heap-allocated value. With strictness plus the UNPACK pragma, the values can be stored inline in the constructor itself.

Since each of those three values is a pointer-sized entity and is accessed strictly anyway, forcing a strict and packed structure saves pointer indirections when using this structure.
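To see the strictness part in isolation (a toy sketch of my own, not from the bytestring code): a strict field is forced when the constructor itself is evaluated to weak head normal form, which you can observe by putting `undefined` in the field.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Two illustrative types: one lazy field, one strict field.
data Lazy   = Lazy   Int
data Strict = Strict !Int

main :: IO ()
main = do
  -- The lazy field happily stores an unevaluated (even bottom) thunk:
  _ <- evaluate (Lazy undefined)
  putStrLn "Lazy undefined: fine, thunk never forced"
  -- The strict field forces its argument when the constructor is built:
  r <- try (evaluate (Strict undefined)) :: IO (Either SomeException Strict)
  putStrLn $ case r of
    Left _  -> "Strict undefined: exception at construction"
    Right _ -> "no exception"
```

Evaluating `Lazy undefined` to WHNF only builds the constructor, so the bottom thunk is never touched; evaluating `Strict undefined` forces the field and throws.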

In the more general case, a strictness annotation can help reduce space leaks. Consider a case like this:

data Foo = Foo Int

makeFoo :: ReallyBigDataStructure -> Foo
makeFoo x = Foo (computeSomething x)

Without the strictness annotation, if you just call makeFoo, it will build a Foo pointing to a thunk pointing to the ReallyBigDataStructure, keeping it around in memory until something forces the thunk to evaluate. If we instead have

data Foo = Foo !Int

This forces the computeSomething evaluation to proceed immediately (well, as soon as something forces makeFoo itself), which avoids leaving a reference to the ReallyBigDataStructure.

Note that this is a different use case than the bytestring code; the bytestring code forces its parameters quite frequently so it's unlikely to lead to a space leak. It's probably best to interpret the bytestring code as a pure optimization to avoid pointer dereferences.
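To make the space-leak point concrete (my own sketch, with a large list standing in for the ReallyBigDataStructure):

```haskell
import Control.Exception (evaluate)

-- Strict counterpart of the Foo example above.
data FooStrict = FooStrict !Int

main :: IO ()
main = do
  -- Forcing the constructor to WHNF also forces the sum, so no thunk
  -- retaining the million-element list survives past this point.
  FooStrict n <- evaluate (FooStrict (sum [1 .. 1000000]))
  print n
```

With the lazy `data Foo = Foo Int` version, `evaluate` would only build the constructor, leaving a `sum [1 .. 1000000]` thunk (and the list it references) live until something inspects the field.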

bdonlan
  • Maybe I should have chosen an example without the packing/unpacking (I don't understand/haven't looked at that yet). In any case, what do you mean by a "pointer-sized" entity? I guess I am a bit confused about the pointers since, from what I understood, Haskell doesn't really have pointers. Is this some kind of lower-level, compiler-dependent thing (that is not part of Haskell itself)? – o1iver Jun 03 '11 at 20:09
  • Yes, this is a low-level optimization of the actual memory representation of the structure. – bdonlan Jun 03 '11 at 20:10
  • OK, that makes sense. However, I have edited my question and added another example (^^). Is the same answer valid in that case? – o1iver Jun 03 '11 at 20:20
  • Perfect! Exactly what I needed for an explanation. Although I must say that I would have thought the compiler would optimize this away. I mean, from what I understand so far, the point of laziness is performance... But thanks for a great answer anyway! – o1iver Jun 03 '11 at 21:04
  • The compiler is not allowed to optimize this away (ie, force it) if it can't prove that it won't affect the program's behavior. For example, if `computeSomething` gets into an infinite loop or throws an exception on some inputs, then it can't be strictified, as that would change the behavior of a program which never actually forces the thunk in question. And there are lots of cases where the compiler can't prove that it's safe, even if it is. – bdonlan Jun 03 '11 at 21:09
  • There are also quite a few cases where extra strictness is safe, the compiler *can* prove it, and will indeed optimize your code. GHC is very clever, even if not omniscient (as of current versions, at least). – C. A. McCann Jun 03 '11 at 21:15