
I'm trying to benchmark an algorithm that pulls data from a database and performs operations on it. The function takes a little longer than I'd like, and I want to benchmark it so I can monitor any performance improvements (and demonstrate them to clients). My issue is that the only 'documented' benchmarking library is ScalaMeter, and its documentation doesn't go into much depth on how to use it. I'm quite lost on how to make a generator for a custom class called 'User' that produces random users as inputs. I'm also not sure how benchmarking works in ScalaMeter: what exactly is the Parameters type it uses, and how do you use it?

Am I even looking in the right direction?

Justin Juntang
  • I feel your pain, but your question isn't very clear about what you are actually asking. Can you be a bit more concrete? What, exactly, do you need to know? – The Archetypal Paul Sep 26 '14 at 07:12
  • Should I be implementing the trait for ScalaMeter, or using composed generators? And how would I do so? I don't really get how the parameters in ScalaMeter work (mainly how to implement them) or how the composed generators work (mainly the task support). – Justin Juntang Sep 28 '14 at 23:12
  • I still don't know what you're asking, specifically. What have you tried? Where are you stuck? SO works better if you ask specific questions, not broad ones like yours so far – The Archetypal Paul Sep 29 '14 at 06:28

1 Answer

  1. Avoid random data in benchmarking, as it can cause performance differences between runs that have nothing to do with your code. If you must use random data, fix the seed value so that every run sees the same inputs.

  2. You very rarely need to create a custom generator class. In almost all usages, you will use for-comprehensions on generators to create custom data values for the benchmark. See documentation link below.

  3. Generators in ScalaMeter are different from those in ScalaCheck. They do not just generate some random values. They generate a well-defined set of values that are fed to the benchmark. Usually, these values follow a certain pattern, such as the size of the data. For example, if you are benchmarking List operations, a generator would typically generate lists of different sizes.

  4. You don't say how to create a User, but only say that you do it randomly.

So, let's assume that there is a function newUser that takes an integer, and uses it to create a User:

def newUser(seed: Int): User

If there is any concept of a size of the User object, you can use the seed to determine that size. For example, if your User object has a name field, you can generate a name of length seed. This is particularly useful if the size affects the running time of the operations, as that dependency will later show up on a plot.
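As a minimal sketch of what such a newUser could look like: the User case class and its fields below are assumptions made for illustration, since the question does not show the real class. The seed both fixes the random number generator, so the same seed always produces the same User, and sets the length of the name. Top-level definitions like these work in the REPL, a script, or Scala 3; in a Scala 2 source file they would need to be wrapped in an object.

```scala
import scala.util.Random

// Hypothetical stand-in for the asker's User class; the real fields are unknown.
final case class User(id: Int, name: String)

// Builds a User deterministically from a seed.
// The same seed always yields the same User, and the name length equals the
// seed, so the seed doubles as a "size" parameter for the benchmark.
def newUser(seed: Int): User = {
  val rng = new Random(seed.toLong)                // fixed seed => reproducible values
  val name = rng.alphanumeric.take(seed).mkString  // name with exactly `seed` characters
  User(id = seed, name = name)
}
```

Calling newUser(5) twice returns equal users, which keeps benchmark inputs identical from one run to the next.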

The generator of the User objects has the type Gen[User]. We create it by starting from a seed generator:

val seeds = Gen.range("seed")(0, 10, 1)

This generator contains the seed integers from 0 to 10 inclusive, in steps of 1. We use them to create a user generator:

val users: Gen[User] = for (seed <- seeds) yield newUser(seed)

This is discussed in the documentation:

http://scalameter.github.io/home/gettingstarted/0.7/

Section on generators:

http://scalameter.github.io/home/gettingstarted/0.7/generators/index.html

Parameters are summarized in the ScalaDoc, but you don't really need them. Instead, use generator comprehensions, as the tutorial shows.
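Putting the pieces together, a complete benchmark might look roughly like the sketch below. It assumes ScalaMeter 0.7 on the classpath and follows the Bench.LocalTime style used in the getting-started guide linked above. The User class, the newUser body, and the process function are hypothetical stand-ins for the asker's real code, not part of ScalaMeter.

```scala
import org.scalameter.api._
import scala.util.Random

object UserBenchmark extends Bench.LocalTime {

  // Hypothetical stand-in for the real User class.
  final case class User(id: Int, name: String)

  // Hypothetical: deterministic User whose name length equals the seed.
  def newUser(seed: Int): User =
    User(seed, new Random(seed.toLong).alphanumeric.take(seed).mkString)

  // Hypothetical stand-in for the operation being measured.
  def process(user: User): Int = user.name.count(_.isDigit)

  // Seed axis, mapped to users with a for-comprehension on the generator.
  val seeds: Gen[Int] = Gen.range("seed")(0, 10, 1)
  val users: Gen[User] = for (seed <- seeds) yield newUser(seed)

  performance of "User" in {
    measure method "process" in {
      // The snippet runs once per generated user, and ScalaMeter reports a
      // running time for each value on the "seed" axis.
      using(users) in { user => process(user) }
    }
  }
}
```

Running the object's main method (for example via sbt) should print one timing per seed value. For a database-backed operation, the lookup would replace the body of process.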

axel22