I wonder why Arbitrary is needed, because automated property testing requires a property definition, like

val prop = forAll(v: T => check that property holds for v)

and a generator for the value v. The user guide says that you can create custom generators for custom types (a generator for trees is given as an example). Yet it does not explain why you need arbitraries on top of that.

Here is an excerpt from the manual:

implicit lazy val arbBool: Arbitrary[Boolean] = Arbitrary(oneOf(true, false))

To get support for your own type T you need to define an implicit def or val of type Arbitrary[T]. Use the factory method Arbitrary(...) to create the Arbitrary instance. This method takes one parameter of type Gen[T] and returns an instance of Arbitrary[T].

It clearly says that we need Arbitrary on top of Gen. The justification for Arbitrary is not satisfactory, though:

The arbitrary generator is the generator used by ScalaCheck when it generates values for property parameters.

IMO, to use the generators, you need to import them rather than wrap them in arbitraries! Otherwise, one could argue that arbitraries also need to be wrapped in something else to make them usable (and so on ad infinitum, wrapping the wrappers endlessly).

Could you also explain how arbitrary[Int] converts its type argument into a generator? I am curious about it, and I feel that these questions are related.

Val

4 Answers

forAll { v: T => ... } is implemented with the help of Scala implicits. That means that the generator for the type T is found implicitly instead of being explicitly specified by the caller.

Scala implicits are convenient, but they can also be troublesome if you're not sure which implicit values or conversions are currently in scope. By using a specific type (Arbitrary) for implicit lookups, ScalaCheck tries to limit the negative impact of using implicits (this also makes it similar to Haskell type classes, which are familiar to some users).

So, you are entirely correct that Arbitrary is not really needed. The same effect could have been achieved through implicit Gen[T] values, arguably with a bit more implicit scoping confusion.

As an end-user, you should think of Arbitrary[T] as the default generator for the type T. You can (through scoping) define and use multiple Arbitrary[T] instances, but I wouldn't recommend it. Instead, just skip Arbitrary and specify your generators explicitly:

val myGen1: Gen[T] = ...
val myGen2: Gen[T] = ...

val prop1 = forAll(myGen1) { t => ... }
val prop2 = forAll(myGen2) { t => ... }

arbitrary[Int] works just like forAll { n: Int => ... }: it simply looks up the implicit Arbitrary[Int] instance and uses its generator. The implementation is simple:

def arbitrary[T](implicit a: Arbitrary[T]): Gen[T] = a.arbitrary

The implementation of Arbitrary might also be helpful here:

sealed abstract class Arbitrary[T] {
  val arbitrary: Gen[T]
}
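
To make the lookup concrete, here is a minimal sketch that models the mechanism with the standard library only. MiniGen, MiniArbitrary, forAllDefault and the object name are hypothetical stand-ins invented for illustration; they are not ScalaCheck's actual classes, and the real forAll does considerably more (shrinking, reporting, sizing).

```scala
import scala.util.Random

object ImplicitLookupSketch {
  // Hypothetical stand-in for Gen: a function from a random source to a value.
  final case class MiniGen[+T](sample: Random => T)

  // Hypothetical stand-in for Arbitrary: a wrapper that marks the default generator for T.
  final case class MiniArbitrary[T](arbitrary: MiniGen[T])

  // The default Int generator, discoverable through implicit lookup.
  implicit val arbInt: MiniArbitrary[Int] =
    MiniArbitrary(MiniGen(rnd => rnd.nextInt(201) - 100)) // values in [-100, 100]

  // Same shape as the `arbitrary` method quoted above: unwrap the implicit instance.
  def arbitrary[T](implicit a: MiniArbitrary[T]): MiniGen[T] = a.arbitrary

  // forAll with an explicitly supplied generator.
  def forAll[T](gen: MiniGen[T], runs: Int = 100)(prop: T => Boolean): Boolean = {
    val rnd = new Random(42L) // fixed seed keeps the sketch deterministic
    (1 to runs).forall(_ => prop(gen.sample(rnd)))
  }

  // forAll without a generator: the compiler supplies the implicit default.
  def forAllDefault[T](prop: T => Boolean)(implicit a: MiniArbitrary[T]): Boolean =
    forAll(a.arbitrary)(prop)

  // Both calls end up using the same generator, arbInt.arbitrary.
  val viaImplicit: Boolean = forAllDefault { (n: Int) => n + 0 == n }
  val viaExplicit: Boolean = forAll(arbitrary[Int]) { n => math.abs(n) <= 100 }
}
```

In this sketch, replacing MiniArbitrary[T] with an implicit MiniGen[T] parameter would work just as well, which is the point made above: the wrapper only names the intent "default generator".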
Rickard Nilsson
  • Wait. I missed why arbitraries are easier to disambiguate than generators. – Val Jul 01 '15 at 15:44
  • They are not easier to disambiguate, but they limit the scope through semantics. When a method requires an implicit Arbitrary, it means it wants the default generator for a type. You could imagine another type class called EdgeCase, implemented just like Arbitrary, but with the semantic intention of representing only edge-case generators for a type. Implicit EdgeCase values would then not compete with implicit Arbitrary values during implicit lookup. – Rickard Nilsson Jul 02 '15 at 17:02
  • It seems one could just as well request an implicit Gen[T] to get the default; you can then pass the default explicitly if you want. – Blaisorblade Oct 31 '17 at 21:00
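
The EdgeCase idea from the comments can be sketched roughly as follows. Everything here is hypothetical: EdgeCase does not exist in ScalaCheck, and MiniGen and MiniArbitrary are simplified stand-ins that hold a fixed list of values instead of producing random ones.

```scala
object EdgeCaseSketch {
  // Hypothetical, simplified generator: just a fixed list of candidate values.
  final case class MiniGen[+T](values: List[T])

  // Two type classes with the same shape but different intent.
  final case class MiniArbitrary[T](gen: MiniGen[T]) // "the default generator for T"
  final case class EdgeCase[T](gen: MiniGen[T])      // "edge cases for T" (imagined, not in ScalaCheck)

  implicit val arbInt: MiniArbitrary[Int] =
    MiniArbitrary(MiniGen(List(3, 17, -8)))
  implicit val edgeInt: EdgeCase[Int] =
    EdgeCase(MiniGen(List(0, Int.MinValue, Int.MaxValue)))

  // Each method asks for a different implicit type, so the two instances never compete.
  def defaults[T](implicit a: MiniArbitrary[T]): List[T] = a.gen.values
  def edgeCases[T](implicit e: EdgeCase[T]): List[T] = e.gen.values

  val defaultInts: List[Int] = defaults[Int]
  val edgeInts: List[Int] = edgeCases[Int]
}
```

Had both instances been declared as plain implicit MiniGen[Int] values in the same scope, a lookup for MiniGen[Int] would be ambiguous; the wrapper types are what keep the two roles apart.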

ScalaCheck was ported from Haskell's QuickCheck library. In Haskell, type classes allow only one instance per type, which forces this sort of separation. Scala has no such constraint, so it would be possible to simplify the library. My guess is that ScalaCheck, being (initially written as) a 1-1 mapping of QuickCheck, makes it easier for Haskellers to jump into Scala :)

Here is the Haskell definition of Arbitrary

class Arbitrary a where
  -- | A generator for values of the given type.
  arbitrary :: Gen a

And Gen

newtype Gen a

As you can see, they have very different semantics: Arbitrary is a type class, and Gen is a wrapper that comes with a set of combinators for building generators.

I agree that the argument of "limiting the scope through semantics" is a bit vague, and it does not seem to be taken seriously when it comes to organizing the code: an Arbitrary instance sometimes simply delegates to a Gen instance, as in

/** Arbitrary instance of Calendar */
implicit lazy val arbCalendar: Arbitrary[java.util.Calendar] =
  Arbitrary(Gen.calendar)

and sometimes defines its own generator

/** Arbitrary BigInt */
implicit lazy val arbBigInt: Arbitrary[BigInt] = {
  val long: Gen[Long] =
    Gen.choose(Long.MinValue, Long.MaxValue).map(x => if (x == 0) 1L else x)

  val gen1: Gen[BigInt] = for { x <- long } yield BigInt(x)
  /* ... */

  Arbitrary(frequency((5, gen0), (5, gen1), (4, gen2), (3, gen3), (2, gen4)))
}

So in effect this leads to code duplication (each default Gen being mirrored by an Arbitrary) and some confusion (why isn't Arbitrary[BigInt] wrapping a default Gen[BigInt]?).

Bruno Bieth

My reading of that is that you might need to have multiple instances of Gen, so Arbitrary is used to "flag" the one that you want ScalaCheck to use?

lmm

TL;DR - Type variance, Gen[+T] vs Arbitrary[T]

The only semantic difference is that Gen is covariant (defined as Gen[+T], so Gen[Cat] is a subtype of Gen[Pet]), while Arbitrary is invariant (Arbitrary[T], so Arbitrary[Cat] has no subtyping relationship with Arbitrary[Pet]).

This seemingly small difference significantly changes the way implicit resolution works: Scala picks the most specific implicit when the candidates have the same priority.

Explanatory example

If our tests rely on Arbitrary, it's fine to have Arbitrary[Pet], Arbitrary[Dog] and Arbitrary[BullDog] all in one scope. They will emit different dogs and cats for general Pet properties, different dog breeds for Dog-related tests, and BullDogs for specific BullDog checks.

But if our properties used the covariant Gen instead, we might be surprised to find that all our generic Pet and Dog properties are tested with generated BullDog instances only. That's because Gen[BullDog] is the most specific Gen[Pet].
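
The difference can be reproduced with a small standard-library sketch. CoGen and InvArb are hypothetical phantom-typed stand-ins for a covariant Gen and an invariant Arbitrary; they only carry a label so that we can see which implicit was selected. The covariant outcome described in the comments assumes Scala 2's implicit resolution rules; Scala 3 has revised how such candidates are prioritized, so the selected instance may differ there.

```scala
object VarianceSketch {
  sealed trait Pet
  class Cat extends Pet
  class Dog extends Pet
  class BullDog extends Dog

  // Hypothetical stand-ins: same shape, differing only in variance.
  final case class CoGen[+T](label: String) // covariant, like Gen[+T]
  final case class InvArb[T](label: String) // invariant, like Arbitrary[T]

  implicit val genPet: CoGen[Pet] = CoGen("Gen[Pet]")
  implicit val genDog: CoGen[Dog] = CoGen("Gen[Dog]")
  implicit val genBullDog: CoGen[BullDog] = CoGen("Gen[BullDog]")

  implicit val arbPet: InvArb[Pet] = InvArb("Arbitrary[Pet]")
  implicit val arbDog: InvArb[Dog] = InvArb("Arbitrary[Dog]")
  implicit val arbBullDog: InvArb[BullDog] = InvArb("Arbitrary[BullDog]")

  // Invariant: only arbPet has type InvArb[Pet], so it is the sole candidate.
  val pickedArb: String = implicitly[InvArb[Pet]].label

  // Covariant: all three values conform to CoGen[Pet]. Under Scala 2 rules the
  // most specific one (genBullDog) is expected to win, as the answer describes.
  val pickedGen: String = implicitly[CoGen[Pet]].label
}
```

Printing VarianceSketch.pickedArb and VarianceSketch.pickedGen side by side shows which instance each lookup resolved to on your compiler version.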

Ivan Klass