231

What is the difference when I write this?

data Book = Book Int Int

versus

newtype Book = Book (Int, Int) -- "Book Int Int" is syntactically invalid
SunnyIsaLearner
ewggwegw
  • Related to http://stackoverflow.com/questions/2649305/why-is-there-data-and-newtype-in-haskell – Don Stewart May 04 '11 at 21:06
  • Also related: uses for newtype: http://stackoverflow.com/questions/991467/haskell-type-vs-newtype-with-respect-to-type-safety – Don Stewart May 04 '11 at 21:23
  • 28
    Note that `newtype Book = Book Int Int` isn't valid. You can, however, have `newtype Book = Book (Int, Int)` as noted by dons below. – Edward Kmett May 05 '11 at 02:30
  • In addition to @EdwardKMETT's comment I think `Book Int Int` is rather *semantically invalid* because `newtype` can only have *one* value constructor with *only one* field. `Book Int Int` has two fields. – LRDPRDX Aug 06 '21 at 03:19

2 Answers

290

Great question!

There are several key differences.

Representation

  • A newtype guarantees that your data will have exactly the same representation at runtime as the type that you wrap.
  • A data declaration, by contrast, introduces a brand new data structure at runtime.

So the key point here is that the constructor of a newtype is guaranteed to be erased at compile time.

Examples:

  • data Book = Book Int Int

[Figure: runtime representation of data Book = Book Int Int]

  • newtype Book = Book (Int, Int)

[Figure: runtime representation of newtype Book = Book (Int, Int)]

Note how it has exactly the same representation as a (Int,Int), since the Book constructor is erased.

  • data Book = Book (Int, Int)

[Figure: runtime representation of data Book = Book (Int, Int)]

This version has an additional Book constructor that is not present in the newtype version.

  • data Book = Book {-# UNPACK #-}!Int {-# UNPACK #-}!Int

[Figure: runtime representation of Book with unpacked Int fields]

No pointers! The two Int fields are unboxed word-sized fields in the Book constructor.
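One way to see the shared representation in practice is `coerce` from `Data.Coerce`, which GHC only accepts between types that have the same runtime representation. The following is a minimal sketch assuming GHC; the names `BookN`, `BookD`, `unwrapAll`, `wrapAll`, and `toPair` are made up for illustration and do not come from the question.

```haskell
import Data.Coerce (coerce)

-- Hypothetical names for illustration; not from the original post.
newtype BookN = BookN (Int, Int)
  deriving (Show, Eq)

data BookD = BookD Int Int
  deriving (Show, Eq)

-- BookN shares its representation with (Int, Int), so 'coerce'
-- converts a whole list in either direction without traversing it.
unwrapAll :: [BookN] -> [(Int, Int)]
unwrapAll = coerce

wrapAll :: [(Int, Int)] -> [BookN]
wrapAll = coerce

-- BookD has its own constructor at runtime, so converting it means
-- pattern matching and building a new tuple.
toPair :: BookD -> (Int, Int)
toPair (BookD a b) = (a, b)
```

Trying to define `toPair = coerce` for `BookD` is rejected by the type checker, since a `data` type is not representationally equal to a tuple.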

Algebraic data types

Because of this need to erase the constructor, a newtype must itself have exactly one constructor with exactly one field. There's no notion of "algebraic" newtypes. That is, you can't write a newtype equivalent of, say,

data Maybe a = Nothing
             | Just a

since it has more than one constructor. Nor can you write the following, because the constructor has more than one field:

newtype Book = Book Int Int
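For contrast, here is a rough sketch of declarations that the compiler does accept. All type and field names below (`BookPair`, `Isbn`, `Shelf`, `MaybeBook`, and so on) are invented for illustration.

```haskell
-- Hypothetical declarations for illustration.

-- One constructor, one field: a tuple counts as a single field.
newtype BookPair = BookPair (Int, Int)
  deriving (Show, Eq)

-- Record syntax is fine too, as long as there is exactly one field.
newtype Isbn = Isbn { isbnDigits :: String }
  deriving (Show, Eq)

-- Two fields, or two constructors, require data.
data Book = Book Int Int
  deriving (Show, Eq)

data Shelf = Empty | Holding Book
  deriving (Show, Eq)

-- The single wrapped field may itself be a type with several
-- constructors; the restriction applies to the newtype only.
newtype MaybeBook = MaybeBook (Maybe Book)
  deriving (Show, Eq)
```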

Strictness

The fact that the constructor is erased leads to some very subtle differences in strictness between data and newtype. In particular, data introduces a type that is "lifted", meaning, essentially, that it has an additional way to evaluate to a bottom value. Since there's no additional constructor at runtime with newtype, this property doesn't hold.

In data Book = Book (Int, Int), the extra pointer from the Book constructor to the (,) constructor allows us to put a bottom value in.

As a result, newtype and data have slightly different strictness properties, as explained in the Haskell wiki article.
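The usual way to observe the difference is to pattern match on `undefined`. The sketch below uses made-up types `D` and `N` and a small hypothetical helper, `throwsWhenForced`, built from `evaluate` and `try` in `Control.Exception`.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical types for illustration.
data D = D Int
newtype N = N Int

-- Matching on a data constructor forces the argument to weak head
-- normal form, so 'matchD undefined' diverges.
matchD :: D -> Bool
matchD (D _) = True

-- Matching on a newtype constructor does no work at runtime, so
-- 'matchN undefined' returns True without touching its argument.
matchN :: N -> Bool
matchN (N _) = True

-- Helper: True if forcing the value to WHNF throws an exception.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = isLeftException <$> try (evaluate x)
  where
    isLeftException :: Either SomeException b -> Bool
    isLeftException = either (const True) (const False)
```

In GHCi, `throwsWhenForced (matchD undefined)` yields `True`, while `throwsWhenForced (matchN undefined)` yields `False`.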

Unboxing

It doesn't make sense to unbox the components of a newtype, since there's no constructor. With data, on the other hand, it is perfectly reasonable to write:

data T = T {-# UNPACK #-}!Int

This yields a runtime object with a T constructor and an Int# component. With newtype, you just get a bare Int.


Don Stewart
  • 3
    I still don't think I'd miss something if there was no "newtype" in Haskell. The subtle differences add complexity to the language that don't seem worthwhile to me... – martingw May 06 '11 at 11:47
  • 20
    The difference is very useful for performance reasons. Since newtype constructors are erased at compile time, they don't impose the runtime performance penalty that a data constructor does. But they still give you all the benefits of a completely distinct type and whatever abstractions you want to associate with it. For instance, there are two different ways the list data type can form a monad. One is built into the language, but if you wanted to use the other one, a newtype would be the way to go. – mightybyte May 06 '11 at 12:59
  • 1
    Great explanation! What I don't understand is if `newtype` is erased after compilation and runtime uses the same representation for old and new types, how can we still be able to define instances for both old and new type? How can runtime understand which instance to use? – Konstantin Milyutin Feb 09 '16 at 11:07
  • 5
    @damluar All types are erased at runtime, they are all fully resolved at compile time, and during compilation `newtype` is obviously not yet erased. – semicolon Apr 21 '16 at 18:32
  • 7
    @damluar I once had the same question as you. When people say the types are erased, they omit to mention that one thing ISN'T erased, which is a memory word that is used for dictionary lookups to decide what instance method to use for a given piece of data. People argue that this word isn't a "type", which I think depends on your perspective, but there you go. – Gabriel L. Aug 28 '18 at 15:48
  • As an extension to the original question: in instances where `newtype` can be used, is there a benefit to using `data` instead? Should `newtype` be used in every case that qualifies for it? – Sledge Jan 16 '20 at 19:13
  • @mightybyte, why would one ever want to use `data SingleConstructorType = S T` instead of `newtype SingleConstructorType' = S' T`? When is the extra layer of distinct bottom state useful for a single-constructor, single-member data type? – A Sz Apr 20 '20 at 13:50
  • @ASz I can't think of any reason (although there may still be one). If my data type has a single field, I'd start out using a `newtype`. Then if I needed to add more fields later, I'd switch it to `data`. – mightybyte Apr 21 '20 at 17:08
  • 1
    @ASz Additionally, if your data type is recursive, you can't use `newtype`. Like: `data List a = Either () (a, List a)` – Xwtek Nov 06 '20 at 02:59
0

They are different in semantics.

  • data defines an algebraic data type (product type, sum type, etc.)
  • newtype defines an isomorphism.

When you don't care whether the type is isomorphic to the one it wraps, you should use data, even if it has only one field.

For example,

data Student = Student {
    age :: Int
}

If, in this problem domain, age is the only information you have to process about a student, you should still use data rather than newtype, because you do not mean that a student is isomorphic to an age.

Zim