F# automatic generalization and performance

Question

I ran into an unexpected code optimization recently, and wanted to check whether my interpretation of what I was observing was correct. The following is a much simplified example of the situation:

let demo =
   let swap fst snd i =
       if i = fst then snd else
       if i = snd then fst else
       i
   [ for i in 1 .. 10000 -> swap 1 i i ]

let demo2 =
   let swap (fst: int) snd i =
       if i = fst then snd else
       if i = snd then fst else
       i
   [ for i in 1 .. 10000 -> swap 1 i i ]

The only difference between the 2 blocks of code is that in the second case, I explicitly declare the arguments of swap as integers. Yet, when I run the 2 snippets in fsi with #time, I get:

Case 1 Real: 00:00:00.011, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0
Case 2 Real: 00:00:00.004, CPU: 00:00:00.015, GC gen0: 0, gen1: 0, gen2: 0

i.e. the 2nd snippet runs 3 times faster than the first. The absolute performance difference here is obviously not an issue, but if I were to use the swap function a lot, it would pile up.

My assumption is that the reason for the performance hit is that in the first case, swap is generic and "requires equality", and checks whether int supports it, whereas the second case doesn't have to check anything. Is this the reason this is happening, or am I missing something else? And more generally, should I consider automatic generalization a double-edged sword, that is, an awesome feature which may have unexpected effects on performance?

score 11 · Accepted Answer · edited May 23 '17 at 12:11

11

I think this is generally the same case as in the question Why is this F# code so slow. In that question, the performance issue is caused by a constraint requiring comparison and in your case, it is caused by the equality constraint.

In both cases, the compiled generic code has to use interfaces (and boxing), while the specialized compiled code can directly use IL instructions for comparison or equality of integers or floating-point numbers.

The two ways to avoid the performance issue are:

Specialize the code to use int or float as you did
Mark the function as inline so that the compiler specializes it automatically

For smaller functions, the second approach is better, because it does not generate too much code and you can still write functions in a generic way. If you use function just for single type (by design), then using the first approach is probably appropriate.

edited May 23 '17 at 12:11

Community

1
1

answered May 06 '12 at 23:24

Tomas Petricek

240,744
19
378
553

Thanks for the pointer to the other question, indeed very similar. So if I understand correctly, marking a function as inline says "while this function is generic, create a specific version based on the types used where it's called"? Is there a reason then for not marking every math-style function as inline? – Mathias May 06 '12 at 23:47
Or, stated differently, can you elaborate a bit on "it does not generate too much code", which I don't quite understand. – Mathias May 06 '12 at 23:48
1

@Mathias The `inline` keyword means that the compiler will replace call to the function with its implementation. This means that the code of the function will be repeated once for every call. For really long functions, this could make the assembly bigger (or generate longer methods that take long time to JIT). That's why I recommended using this only for shorther functions - the functions in your example look pretty short to me. – Tomas Petricek May 07 '12 at 00:14

score 4 · Answer 2 · answered May 06 '12 at 23:25

The reason for the difference is that the compiler is generating calls to

    IL_0002:  call bool class [FSharp.Core]Microsoft.FSharp.Core.LanguagePrimitives/HashCompare::GenericEqualityIntrinsic<!!0> (!!0, !!0)

in the generic version, whilst the int version can just compare directly.

If you used inline I suspect that this problem would disappear as the compiler now has extra type information

F# automatic generalization and performance

2 Answers2