I'm trying to build what I call a measurement units system by wrapping double in a struct. I have C# structs like Meter, Second, Degree, etc. My original idea was that once the compiler had inlined everything, performance would be the same as if plain double were used.

My explicit and implicit operators are simple and straightforward, and the compiler does actually inline them, yet the code with Meter and Second is 10 times slower than the same code using double.

My question is: why can't the C# compiler make the code using Second as optimal as the code using double, if it inlines everything anyway?

Second is defined as follows:

struct Second
{
    double _value; // no more fields.

    public static Second operator + (Second left, Second right) 
    { 
        return left._value + right._value; 
    }
    public static implicit operator Second(double value) 
    { 
        // This seems to be faster than having a constructor :)
        return new Second { _value = value };
    }

    // plenty of similar operators
}

Update:

I didn't ask whether a struct fits here. It does.

I didn't ask whether the code is going to be inlined. The JIT does inline it.

I checked the assembly instructions emitted at runtime. They were different for code like this:

var x = new double();
for (var i = 0; i < 1000000; i++)
{ 
    x = x + 2;
    // Many other simple operator calls here
}

and like this:

var x = new Second();
for (var i = 0; i < 1000000; i++)
{ 
    x = x + 2;
    // Many other simple operator calls here
}

There were no call instructions in the disassembly, so the operations were in fact inlined. Yet the difference is significant. Performance tests show that using Second is about 10 times slower than using double.

So my questions are (attention!): why is the JIT-generated IA64 code different for the two cases above? What can be done to make the struct run as fast as double? There seems to be no theoretical difference between double and Second, so what is the underlying reason for the difference I saw?

Ilya
  • is that an `implicit` or `+` operator? – Marc Gravell Nov 10 '10 at 14:04
  • You might be interested in this related question: http://stackoverflow.com/questions/348853/units-of-measure-in-c-almost – Benjol Nov 10 '10 at 14:09
  • I know you are using C#, but have you considered F#? It has built-in static units checking, which is more or less what you seem to be looking for. See http://stackoverflow.com/questions/40845/how-do-f-units-of-measure-work – Peter M Nov 10 '10 at 14:10
  • Very similar to this: http://stackoverflow.com/questions/3995920/using-real-world-units-instead-of-types/3995971#3995971 And in my experience the compiler is relatively good at optimizing structs. But you need to compile as `release` and have **no debugger attached**. – CodesInChaos Nov 10 '10 at 14:11
  • I did something like this a long time ago to separate "width" from "height", as we had lots of methods that took them in different orders. There were implicit casts between int and the types, but not between the types themselves. It worked well for the given project. – Ian Ringrose Nov 10 '10 at 14:55
  • Operands of reference types or known primitive types can be passed in registers when a routine is called. I don't think that's possible for an operand of a structure type, even if the only thing in the structure is a reference or a known primitive type. – supercat Apr 16 '13 at 16:00

2 Answers

This is my opinion; please write a comment if you disagree, instead of silently downvoting.

The C# compiler doesn't inline it. The JIT compiler might, but from our point of view that is non-deterministic, because the JIT's behavior is not straightforward.

In the case of double no operator methods are actually invoked. The operands are added directly on the evaluation stack using the IL opcode add. In your case the method op_Addition is invoked, plus three struct copies to and from the stack.

To optimize it, start by replacing the struct with a class. That will at least minimize the number of copies.
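One way to confirm that `+` on the struct is an ordinary static method is to look it up by reflection. This is only a minimal sketch, assuming a cut-down `Second` modeled on the one in the question; the `Value` property and the `OperatorProbe` class are hypothetical names added for illustration.

```csharp
using System;
using System.Reflection;

// Cut-down version of the struct from the question (illustrative only).
public struct Second
{
    private double _value;

    // Hypothetical accessor, added so the result can be printed.
    public double Value { get { return _value; } }

    public static Second operator +(Second left, Second right)
    {
        return new Second { _value = left._value + right._value };
    }

    public static implicit operator Second(double value)
    {
        return new Second { _value = value };
    }
}

// Hypothetical helper class, not part of the original code.
public static class OperatorProbe
{
    // Returns the name of the static method the compiler emits for Second's '+',
    // or null if no such public static method exists.
    public static string AdditionMethodName()
    {
        MethodInfo method = typeof(Second).GetMethod(
            "op_Addition", BindingFlags.Public | BindingFlags.Static);
        return method == null ? null : method.Name;
    }

    // Adds two doubles by way of Second: each operand goes through the
    // implicit conversion, then the user-defined '+' is called.
    public static double AddViaSecond(double a, double b)
    {
        Second left = a;
        Second right = b;
        return (left + right).Value;
    }

    public static void Main()
    {
        Console.WriteLine(AdditionMethodName());
        Console.WriteLine(AddViaSecond(1.5, 2.0));
    }
}
```

Whether the JIT later inlines these calls is a separate matter; the sketch only shows what the C# compiler emits.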

Andrey
  • ...and structs will get you in trouble along the way. – Paul Sasik Nov 10 '10 at 14:16
  • Why do you want to use classes here instead of structs? The problems with his approach (the giant number of operators and types) won't change if he uses `class`, but the performance will probably drop a lot. – CodesInChaos Nov 10 '10 at 14:42
  • @CodeInChaos i never use `struct`. I really think that it should be used only with interop. "performance will probably drop a lot" why? explain please. – Andrey Nov 10 '10 at 16:57
  • @Andrey: Creation of new heap object instances is comparatively expensive. If each computation requires the construction of a new heap object instance, that cost will get added to every one of them. Not only are value-type instances cheaper to create, but code which is written to make efficient use of such types can reuse instances, avoiding the need for such allocations altogether. – supercat Apr 16 '13 at 15:54
  • @supercat why will every computation involve creating an instance? Also, the best minds of programming recommend making `struct`s immutable, so no reuse. – Andrey Apr 16 '13 at 20:01
  • @Andrey: If one has a class object instance with the value 3, and another with the value 4, and one wants to have a reference to a class object instance whose value is the sum of those, how can one quickly produce a reference without creating a new instance for it to point to? As for structs, a statement `struct1 = struct2` mutates instance struct1 by copying all fields from the corresponding ones in struct2; the statement `struct1 = new StructType(params)` will often pass a byref to struct1 to the constructor, which will then mutate it to hold the new fields. – supercat Apr 16 '13 at 20:52
  • @Andrey: There are some cases in which `struct1 = new StructType(params)` will generate a temporary instance, have the constructor mutate that as desired, and then mutate `struct1` by copying all fields from the temporary instance. In every case, though, assigning something to a struct variable mutates the instance represented thereby as opposed to making the variable represent a different instance. The only thing "immutable" structures do is force code that wants to mutate part of a structure to copy all the fields. – supercat Apr 16 '13 at 20:55

The C# compiler doesn't inline anything - the JIT might do that, but isn't obliged to. It should still be plenty fast though. I would probably remove the implicit conversion in the + though (see the constructor usage below) - one more operator to look through:

private readonly double _value;
public double Value { get { return _value; } }
public Second(double value) { this._value = value; }
public static Second operator +(Second left, Second right) {
    return new Second(left._value + right._value);
}
public static implicit operator Second(double value)  {
    return new Second(value);
}

JIT inlining is limited to specific scenarios. Will this code satisfy them? Hard to tell - but it should work and work fast enough for most scenarios. The problem with + is that there is an IL opcode for adding doubles; it does almost no work - whereas your code is calling a few static methods and a constructor; there is always going to be some overhead, even when inlined.
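To check whether inlining happens in a particular setup, the two loops from the question can be timed side by side. The following is only a rough sketch under some assumptions: `InlineBench`, `SumDouble` and `SumSecond` are hypothetical names, and the `AggressiveInlining` attribute (available from .NET 4.5) is a hint that the JIT is free to ignore. As noted in the comments on the question, the numbers only mean something for a Release build run without a debugger attached.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public struct Second
{
    private readonly double _value;

    public Second(double value) { _value = value; }

    public double Value { get { return _value; } }

    // A hint only; the JIT may still decide not to inline.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static Second operator +(Second left, Second right)
    {
        return new Second(left._value + right._value);
    }
}

// Hypothetical benchmark harness, not part of the original answer.
public static class InlineBench
{
    // Baseline: plain double arithmetic.
    public static double SumDouble(int iterations)
    {
        double x = 0;
        for (int i = 0; i < iterations; i++)
        {
            x = x + 2;
        }
        return x;
    }

    // Same loop, but every addition goes through Second's operator +.
    public static double SumSecond(int iterations)
    {
        Second x = new Second(0);
        Second two = new Second(2);
        for (int i = 0; i < iterations; i++)
        {
            x = x + two;
        }
        return x.Value;
    }

    public static void Main()
    {
        const int iterations = 100000000;

        // Warm up so that JIT compilation time is not measured.
        SumDouble(1000);
        SumSecond(1000);

        Stopwatch watch = Stopwatch.StartNew();
        double plain = SumDouble(iterations);
        watch.Stop();
        Console.WriteLine("double: " + watch.ElapsedMilliseconds + " ms (" + plain + ")");

        watch.Restart();
        double wrapped = SumSecond(iterations);
        watch.Stop();
        Console.WriteLine("Second: " + watch.ElapsedMilliseconds + " ms (" + wrapped + ")");
    }
}
```

A single `Stopwatch` run like this is noisy, so it is worth repeating the measurement a few times before drawing conclusions about the two loops.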

Marc Gravell
  • I did wrap an `int` in a struct (implementing fixed-point), and when the jitter inlined the code (IMO it should inline code more aggressively) it produced perfect assembly code. So if inlined there is probably no overhead at all. – CodesInChaos Nov 10 '10 at 14:46