Why are so few things @specialized in Scala's standard library?

Question

I've searched for the use of @specialized in the source code of the standard library of Scala 2.8.1. It looks like only a handful of traits and classes use this annotation: Function0, Function1, Function2, Tuple1, Tuple2, Product1, Product2, AbstractFunction0, AbstractFunction1, AbstractFunction2.

None of the collection classes are @specialized. Why not? Would this generate too many classes?

This means that using collection classes with primitive types is very inefficient, because there will be a lot of unnecessary boxing and unboxing going on.

What's the most efficient way to have an immutable list or sequence (with IndexedSeq characteristics) of Ints, avoiding boxing and unboxing?

Wow, cheez, i assumed that `List` and `IndexedSeq` are specialized. Doing some `:javap -c` with the new scala 2.9 REPL shows they still do boxing all the time. The constructors are optimized though it seems (`List(1,2,3)` uses a `wrapIntArray`). — 0__, Mar 29 '11 at 23:48
That's really unwanted overhead when dealing with huge datasets of numbers. — Raphael, Mar 31 '11 at 07:20

score 18 · Accepted Answer · answered Mar 29 '11 at 21:39

18

Specialization has a high cost on the size of classes, so it must be added with careful consideration. In the particular case of collections, I imagine the impact will be huge.

Still, it is an on-going effort -- Scala library has barely started to be specialized.

answered Mar 29 '11 at 21:39

Daniel C. Sobral

295,120
86
501
681

8

This is correct. Video from Scala Days 2010 on @Specialized, where Iulian Dragos explains why it's not everywhere: http://days2010.scala-lang.org/node/138/151 – Matthew Farwell Mar 30 '11 at 07:53
2

I've also grepped through the source of the Scala 2.9 RC1 library and unfortunately it looks like nothing has changed yet! – Jesper Mar 30 '11 at 17:29
1

@Daniel `Specialization has a high cost on the size of classes, so it must be added with careful consideration` As a naive, non-compiler engineer, why does the size of classes matter? I am **not** making an argument, but looking to learn why. – Kevin Meredith Apr 21 '15 at 15:46
4

@KevinMeredith Well, for one thing, up to Java 7, it was [limited to 64 KB](https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html#jvms-4.9.1). It can also present problems when coding for Android. – Daniel C. Sobral Apr 21 '15 at 19:46

smartnut007 · Answer 2 · 2014-08-19T21:49:39.760

Specialized can be expensive ( exponential ) in both size of classes and compile time. Its not just the size like the accepted answer says.

Open your scala REPL and type this.

import scala.{specialized => sp}
trait S1[@sp A, @sp B, @sp C, @sp D] { def f(p1:A): Unit }

Sorry :-). Its like a compiler bomb.

Now, lets take a simple trait

trait Foo[Int]{ }

The above will result in two compiled classes. Foo, the pure interface and Foo$1, the class implementation.

Now,

trait Foo[@specialized A] { }

A specialized template parameter here gets expanded/rewritten for 9 different primitive types ( void, boolean, byte, char, int, long, short, double, float ). So, basically you end up with 20 classes instead of 2.

Going back to the trait with 5 specialized template parameters, the classes get generated for every combination of possible primitive types. i.e its exponential in complexity.

2 * 10 ^ ^{(no of specialized parameters)}

If you are defining a class for a specific primitive type, you should be more explicit about it such as

trait Foo[@specialized(Int) A, @specialized(Int,Double) B] { }

Understandably one has to be frugal using specialized when building general purpose libraries.

Here is Paul Phillips ranting about it.

But you don't always need to specialze for all eight primitive types plus `void`; you can also just specialize for `Int`, for example, using `@specialized(Int)`. — Jesper, Mar 08 '13 at 19:21
yes, thats right. Added more notes to my answer. But, thats the case when you are doing something specific to a data type. Building general purpose libraries like the the scala language api needs to be more general purpose. — smartnut007, Mar 08 '13 at 20:54

score 6 · Answer 3 · answered Mar 29 '11 at 19:19

6

Partial answer to my own question: I can wrap an array in an IndexedSeq like this:

import scala.collection.immutable.IndexedSeq

def arrayToIndexedSeq[@specialized(Int) T](array: Array[T]): IndexedSeq[T] = new IndexedSeq[T] {
  def apply(idx: Int): T = array(idx)
  def length: Int = array.length
}

(Ofcourse you could still modify the contents if you have access to the underlying array, but I would make sure that the array isn't passed to other parts of my program).

answered Mar 29 '11 at 19:19

Jesper

202,709
46
318
350

3

Since you are hiding behind IndexedSeq that still doesn't convince the compiler not to use autoboxing. For instance, if i run the `:javap -c` on the following: `object Test { val a = arrayToIndexedSeq( Array(1,2,3)); val i: Int = a(0)}` -- you will see that val i = a(0) _does_ boxing (in `SeqLike`).... – 0__ Mar 29 '11 at 23:53
@Sciss Ok... grrr! Is there another way to get rid of the unnecessary autoboxing? Is directly using mutable arrays the only way to avoid it? – Jesper Mar 30 '11 at 06:38
2

I don't know. I guess we need to make a petition to add more specializations to the collections. Martin Odersky kind of promised that here: http://www.scala-lang.org/node/7285 ; the problem i guess is due to the heavy factoring out of traits in the new collections that if you want Vector to be specialized, you will end up having to specialize a dozen involved traits, which in turn creates high bloat on the size of the library. Still, I think we should vote for that, it definitely sucks to have weak performance for Vector which is really the number crunching swiss army knife in scala. – 0__ Mar 30 '11 at 13:29
@Sciss I've carefully looked at the output of `scalac -print ...` and following the sequence of methods being called, it looks like it actually does avoid boxing when I call `apply()` on it. (Using scala 2.8.1). There are some calls to `scala.Int.box()` but those are not called for that code path. – Jesper Mar 31 '11 at 21:34
Your collection however does have one advantage: it does not box stored elements, and that's a huge memory saving if the collection is (moderately) large. An integer is 4 bytes, boxing will probably add at least other 16 bytes to that (for object headers and the pointers to objects). When you extract one int, only that is boxed. – Blaisorblade Jul 19 '11 at 10:21

Why are so few things @specialized in Scala's standard library?

3 Answers3

Linked