5

I have a float4x4 struct which simply contains 16 floats:

struct float4x4
{
        public float M11; public float M12; public float M13; public float M14;
        public float M21; public float M22; public float M23; public float M24;
        public float M31; public float M32; public float M33; public float M34;
        public float M41; public float M42; public float M43; public float M44;
}

I want to copy an array of these structs into a big array of floats. As far as I know, this is a 1:1 copy of a chunk of memory.

What I do now is rather ugly, and not that fast:

        int n = 0;
        for (int i = 0; i < Length; i++)
        {
            array[n++] = value[i].M11;
            array[n++] = value[i].M12;
            array[n++] = value[i].M13;
            array[n++] = value[i].M14;

            array[n++] = value[i].M21;
            array[n++] = value[i].M22;
            array[n++] = value[i].M23;
            array[n++] = value[i].M24;

            array[n++] = value[i].M31;
            array[n++] = value[i].M32;
            array[n++] = value[i].M33;
            array[n++] = value[i].M34;

            array[n++] = value[i].M41;
            array[n++] = value[i].M42;
            array[n++] = value[i].M43;
            array[n++] = value[i].M44;
        }

If I were using a lower-level language, I would simply use memcpy. What can I use as an equivalent in C#?

Hannesh
  • Well, you could use unsafe code. If you get a pointer to `value[i]` you can then use `Marshal.Copy` to copy it to `array`. Whether that's cleaner, I'm not so sure.. – harold Sep 17 '11 at 10:31
  • As a struct, that is quite over-weight... structs do have guideline sizes, FWIW – Marc Gravell Sep 17 '11 at 10:58
  • if you use StructLayout with Explicit and define the size and layout then you can use a simple C++/CLI class that reinterpret casts a pinned pointer to the first entry in the array and memcpy. This has all sorts of caveats but is very fast. I would however argue that, if you are not capable of writing such a thing yourself, you perhaps shouldn't be doing it until you've learnt how. – ShuggyCoUk Sep 17 '11 at 17:18

5 Answers

4

You can't use a memory copy, as you can't blindly assume anything about how the members are stored inside the structure. The JIT compiler could decide to store them with a few bytes of padding between them, if that would make it faster.

Your structure is well beyond the recommended size for a structure anyway, so you should make it a class. Also, structures should not be mutable, which is another argument for a class.

If you store the properties in an array internally, you can use that for copying the values:

class float4x4 {

  public float[] Values { get; private set; } 

  public float4x4() {
    Values = new float[16];
  }

  public float M11 { get { return Values[0]; } set { Values[0] = value; } }
  public float M12 { get { return Values[1]; } set { Values[1] = value; } }
  ...
  public float M43 { get { return Values[14]; } set { Values[14] = value; } }
  public float M44 { get { return Values[15]; } set { Values[15] = value; } }

}

Now you can get the Values array from the object and copy to the array using the Array.CopyTo method:

int n = 0;
foreach (float4x4 v in values) {
  v.Values.CopyTo(array, n);
  n += 16;
}
Guffa
  • Choosing to make a matrix a value type can be reasonable IMO. For example, XNA chose to do that. If you choose a class you need to make it immutable, or else you won't get value semantics, which IMO are essential for a matrix. And if you make it immutable you'll need to create new instances too often. So I believe that this is one of the instances where violating the guidelines and using a larger struct can be beneficial. – CodesInChaos Sep 17 '11 at 10:37
  • You **can** assume the layout if you specify a layout yourself. E.g. `[StructLayout(LayoutKind.Sequential)]`. +1 for turning it into a class, because structures like this are inefficient (reason that DirectX/XNA/etc. passes them by ref). – Jonathan Dickinson Sep 17 '11 at 10:42
  • @CodeInChaos: As the structure is mutable, you don't have value semantics anyway... – Guffa Sep 17 '11 at 10:42
  • Despite the problems associated with them, even mutable structs have value semantics because they get copied in the appropriate places. And with careful design most problems associated with them can be avoided (in particular, avoid mutating methods unless they get the value passed in byref explicitly). – CodesInChaos Sep 17 '11 at 10:54
  • @Guffa I agree with CodeInChaos here - it still does behave *as a mutable value*, with copy semantics. Not the easiest setup to reason about, though, and a common cause of head-scratching. – Marc Gravell Sep 17 '11 at 10:57
  • @CodeInChaos: People sometimes choose to make things structs in XNA because garbage collection is much more "penalizing" in XNA. The collector is less sophisticated than the desktop CLR GC and will pause your game frequently to collect. So people try to avoid the GC penalty by avoiding collection pressure. They pay for it via the runtime cost of an inefficient struct copy and the development cost of dealing with bugs caused by mutable value types. Unfortunately it is often a good tradeoff. I wish instead we'd just improve the GC. – Eric Lippert Sep 17 '11 at 15:20
2

This is perhaps equally ugly, but is very fast.

using System.Runtime.InteropServices;

namespace ConsoleApplication23 {
  public class Program {
    public static void Main() {
      var values=new[] {
        new float4x4(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16),
        new float4x4(-1, -2, -3, -4, -5, -6, -7, -8, -9, -10, -11, -12, -13, -14, -15, -16)
      };
      var result=Transform(values);
    }

    public static unsafe float[] Transform(float4x4[] values) {
      var array=new float[values.Length*16];
      fixed(float* arrayStart=array) {
        var destp=arrayStart;
        fixed(float4x4* valuesStart=values) {
          int count=values.Length;
          for(var valuesp=valuesStart; count>0; ++valuesp, --count) {
            var sourcep=valuesp->data;
            for(var i=0; i<16/4; ++i) {
              *destp++=*sourcep++;
              *destp++=*sourcep++;
              *destp++=*sourcep++;
              *destp++=*sourcep++;
            }
          }
        }
        return array;
      }
    }

    [StructLayout(LayoutKind.Explicit)]
    public unsafe struct float4x4 {
      [FieldOffset(0)] public float M11;
      [FieldOffset(4)] public float M12;
      [FieldOffset(8)] public float M13;
      [FieldOffset(12)] public float M14;
      [FieldOffset(16)] public float M21;
      [FieldOffset(20)] public float M22;
      [FieldOffset(24)] public float M23;
      [FieldOffset(28)] public float M24;
      [FieldOffset(32)] public float M31;
      [FieldOffset(36)] public float M32;
      [FieldOffset(40)] public float M33;
      [FieldOffset(44)] public float M34;
      [FieldOffset(48)] public float M41;
      [FieldOffset(52)] public float M42;
      [FieldOffset(56)] public float M43;
      [FieldOffset(60)] public float M44;

      //notice the use of "fixed" keyword to make the array inline
      //and the use of the FieldOffset attribute to overlay that inline array on top of the other fields
      [FieldOffset(0)] public fixed float data[16];

      public float4x4(float m11, float m12, float m13, float m14,
        float m21, float m22, float m23, float m24,
        float m31, float m32, float m33, float m34,
        float m41, float m42, float m43, float m44) {
        M11=m11; M12=m12; M13=m13; M14=m14;
        M21=m21; M22=m22; M23=m23; M24=m24;
        M31=m31; M32=m32; M33=m33; M34=m34;
        M41=m41; M42=m42; M43=m43; M44=m44;
      }
    }
  }
}
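
If the loop above is only moving bytes, a single block copy may do the same job and is the closest thing to memcpy. The following is a minimal sketch, assuming .NET Framework 4.6 or later (where `Buffer.MemoryCopy` is available), a project with "Allow unsafe code" checked, and a struct with sequential layout. `BlockCopyDemo` and `Flatten` are hypothetical names used only for illustration, and this has not been benchmarked against the code above.

```csharp
using System;
using System.Runtime.InteropServices;

// With only float fields, sequential layout keeps them in declaration
// order, 4 bytes apart, so one struct is 64 contiguous bytes.
[StructLayout(LayoutKind.Sequential)]
public struct float4x4 {
  public float M11, M12, M13, M14;
  public float M21, M22, M23, M24;
  public float M31, M32, M33, M34;
  public float M41, M42, M43, M44;
}

// Hypothetical helper, not part of the answer above.
public static class BlockCopyDemo {
  public static unsafe float[] Flatten(float4x4[] values) {
    var array = new float[values.Length * 16];
    long byteCount = (long)values.Length * sizeof(float4x4);

    // Pin both arrays so the GC cannot move them during the copy.
    fixed (float4x4* source = values)
    fixed (float* dest = array) {
      // Arguments: source, destination, destination capacity in bytes,
      // number of bytes to copy.
      Buffer.MemoryCopy(source, dest, byteCount, byteCount);
    }
    return array;
  }
}
```

Calling `BlockCopyDemo.Flatten(values)` should produce the same float array as the element-by-element loop, as long as the struct really contains nothing but the 16 floats.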
Corey Kosak
  • Very fast measured relative to what? – Ritch Melton Sep 17 '11 at 13:10
  • Ok. O_o. I'm skeptical that it would have any noticeable effect. – Ritch Melton Sep 17 '11 at 13:30
  • OK, I'm doing some measurements and I see 8.5% improvement. So on one hand that's "noticeable" but on the other hand I shouldn't have called that "very fast". I can post the code if anyone is interested. – Corey Kosak Sep 17 '11 at 13:55
  • What's your test case? A tight loop with a million iterations? – Ritch Melton Sep 17 '11 at 13:56
  • I don't mean to sound so attacky, just skeptical. – Ritch Melton Sep 17 '11 at 13:59
  • It's getting more and more interesting. I'm getting very fast performance on X64. Would it be ok if I posted my test harness as a separate answer to this question (so as to get formatting)? – Corey Kosak Sep 17 '11 at 14:08
  • x64 against a x86 build? If you didn't know, formatting comes from indenting four spaces. You can add as a hint to the highlighter too. – Ritch Melton Sep 17 '11 at 14:34
  • I didn't even know this was possible: **[FieldOffset(0)] public fixed float data[16];** !! In that case, I can just use Array.Copy which is very clean and fast. Thanks a ton for this! – Hannesh Sep 17 '11 at 20:09
1

OK, this is my test harness. My project properties are Release build, with "Optimize code" and "Allow unsafe code" checked.

Surprisingly (to me anyway) the performance is very different inside and outside the IDE. When run from the IDE there are noticeable differences (and the x64 difference is huge). When run outside the IDE, it's a wash.

So this is kind of weird, and I can't explain the results for IDE+x64. Maybe this is interesting to some people, but because it no longer purports to provide an answer to the poster's original question, maybe this should be moved to some other topic?

Inside IDE, platform set to x86

pass 1: old 00:00:09.7505625 new 00:00:08.6897013 percent 0.1088

Inside IDE, platform set to x64

pass 1: old 00:00:14.7584514 new 00:00:08.8835715 percent 0.398068858362741

Running from command line, platform set to x86

pass 1: old 00:00:07.6576469 new 00:00:07.2818252 percent 0.0490779615341104

Running from command line, platform set to x64

pass 1: old 00:00:07.2501032 new 00:00:07.3077479 percent -0.00795087992678504

And this is the code:

using System;
using System.Runtime.InteropServices;

namespace ConsoleApplication23 {
  public class Program {
    public static void Main() {
      const int repeatCount=20;
      const int arraySize=5000000;

      var values=MakeValues(arraySize);

      for(var pass=0; pass<2; ++pass) {
        Console.WriteLine("Starting old");
        var startOld=DateTime.Now;
        for(var i=0; i<repeatCount; ++i) {
          var result=TransformOld(values);
        }
        var elapsedOld=DateTime.Now-startOld;

        Console.WriteLine("Starting new");
        var startNew=DateTime.Now;
        for(var i=0; i<repeatCount; ++i) {
          var result=TransformNew(values);
        }
        var elapsedNew=DateTime.Now-startNew;

        var difference=elapsedOld-elapsedNew;
        var percentage=(double)difference.TotalMilliseconds/elapsedOld.TotalMilliseconds;

        Console.WriteLine("pass {0}: old {1} new {2} percent {3}", pass, elapsedOld, elapsedNew, percentage);
      }
      Console.Write("Press enter: ");
      Console.ReadLine();
    }

    private static float4x4[] MakeValues(int count) {
      var result=new float4x4[count];
      for(var i=0; i<count; ++i) {
        result[i]=new float4x4(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15);
      }
      return result;
    }

    public static float[] TransformOld(float4x4[] value) {
      var array=new float[value.Length*16];
      int n = 0;
      for(int i = 0; i < value.Length; i++) {
        array[n++] = value[i].M11;
        array[n++] = value[i].M12;
        array[n++] = value[i].M13;
        array[n++] = value[i].M14;

        array[n++] = value[i].M21;
        array[n++] = value[i].M22;
        array[n++] = value[i].M23;
        array[n++] = value[i].M24;

        array[n++] = value[i].M31;
        array[n++] = value[i].M32;
        array[n++] = value[i].M33;
        array[n++] = value[i].M34;

        array[n++] = value[i].M41;
        array[n++] = value[i].M42;
        array[n++] = value[i].M43;
        array[n++] = value[i].M44;
      }
      return array;
    }

    public static unsafe float[] TransformNew(float4x4[] values) {
      var array=new float[values.Length*16];
      fixed(float* arrayStart=array) {
        var destp=arrayStart;
        fixed(float4x4* valuesStart=values) {
          int count=values.Length;
          for(var valuesp=valuesStart; count>0; ++valuesp, --count) {
            var sourcep=valuesp->data;
            for(var i=0; i<16/4; ++i) {
              *destp++=*sourcep++;
              *destp++=*sourcep++;
              *destp++=*sourcep++;
              *destp++=*sourcep++;
            }
          }
        }
        return array;
      }
    }

    [StructLayout(LayoutKind.Explicit)]
    public unsafe struct float4x4 {
      [FieldOffset(0)] public float M11;
      [FieldOffset(4)] public float M12;
      [FieldOffset(8)] public float M13;
      [FieldOffset(12)] public float M14;
      [FieldOffset(16)] public float M21;
      [FieldOffset(20)] public float M22;
      [FieldOffset(24)] public float M23;
      [FieldOffset(28)] public float M24;
      [FieldOffset(32)] public float M31;
      [FieldOffset(36)] public float M32;
      [FieldOffset(40)] public float M33;
      [FieldOffset(44)] public float M34;
      [FieldOffset(48)] public float M41;
      [FieldOffset(52)] public float M42;
      [FieldOffset(56)] public float M43;
      [FieldOffset(60)] public float M44;

      //notice the use of "fixed" keyword to make the array inline
      //and the use of the FieldOffset attribute to overlay that inline array on top of the other fields
      [FieldOffset(0)] public fixed float data[16];

      public float4x4(float m11, float m12, float m13, float m14,
        float m21, float m22, float m23, float m24,
        float m31, float m32, float m33, float m34,
        float m41, float m42, float m43, float m44) {
        M11=m11; M12=m12; M13=m13; M14=m14;
        M21=m21; M22=m22; M23=m23; M24=m24;
        M31=m31; M32=m32; M33=m33; M34=m34;
        M41=m41; M42=m42; M43=m43; M44=m44;
      }
    }
  }
}
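
Since the numbers depend so much on how the program is launched, it may help to make the timing itself a little more robust. The following is a sketch only, not the harness that produced the results above: it assumes `System.Diagnostics.Stopwatch` in place of `DateTime.Now`, adds one untimed warm-up call so JIT compilation is not measured, and reports whether a debugger is attached. `Timing` and `Measure` are hypothetical names.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of the harness above.
public static class Timing {
  // Runs the action once untimed, then times repeatCount further runs.
  public static TimeSpan Measure(Action action, int repeatCount) {
    action(); // warm-up: triggers JIT compilation before timing starts

    var stopwatch = Stopwatch.StartNew();
    for (var i = 0; i < repeatCount; ++i) {
      action();
    }
    stopwatch.Stop();
    return stopwatch.Elapsed;
  }

  // True when the process is running under a debugger, in which case
  // the measurements are unlikely to reflect optimized code.
  public static bool DebuggerAttached {
    get { return Debugger.IsAttached; }
  }
}
```

With the methods from the harness, a call such as `Timing.Measure(() => TransformOld(values), 20)` would replace the hand-written timing loop, and checking `Timing.DebuggerAttached` before printing results would flag runs started from the IDE with debugging enabled.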
Corey Kosak
  • The performance difference is *enormous* if you are comparing code *being debugged* to code not being debugged; is that the issue you are seeing? The jitter generates less aggressively optimized code, and the CLR does a lot of extra work, if the runtime knows that it is being debugged. – Eric Lippert Sep 17 '11 at 15:22
  • Yeah; until now I had no idea how dramatic that difference can be. – Corey Kosak Sep 17 '11 at 22:12
-1

Maybe you could alias the array of structs with an array of floats and do absolutely no copying. Check this SO answer for a starting point
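
On newer runtimes this kind of aliasing can be sketched without pointers. The example below assumes .NET Core 2.1 or later (or the System.Memory package), where `MemoryMarshal.Cast` and `Span<T>` are available, and a struct that holds only the 16 floats in sequential layout. `AliasDemo` and `AsFloats` are hypothetical names, and this is not necessarily the approach the linked answer describes.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct float4x4 {
  public float M11, M12, M13, M14;
  public float M21, M22, M23, M24;
  public float M31, M32, M33, M34;
  public float M41, M42, M43, M44;
}

// Hypothetical helper for illustration.
public static class AliasDemo {
  // Returns a view over the same memory as the struct array.
  // No floats are copied; the span is 16 times as long as the array.
  public static Span<float> AsFloats(float4x4[] values) {
    return MemoryMarshal.Cast<float4x4, float>(values.AsSpan());
  }
}
```

Because the span is only a view, writing `AliasDemo.AsFloats(matrices)[16] = 42f` changes `matrices[1].M11`. If an API needs a real `float[]`, the span's `ToArray()` method makes a copy at that point.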

renick
  • I just suggested that FieldOffset might save the copy. Since it's evil thinking I'll be sure to mention it during my next confession and ask for forgiveness .. – renick Sep 17 '11 at 10:55
-1

It's not necessarily a 1-to-1 copy. The CLR is free to lay out the fields in a struct in whichever way it likes. It might reorder or realign them.

If you add [StructLayout(LayoutKind.Sequential)], a direct copy might be possible, but I'd still go with something similar to your original code.
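
For reference, such a direct copy could look roughly like the sketch below, which avoids the `unsafe` keyword by pinning the source array with `GCHandle` and copying with `Marshal.Copy`. It assumes the struct contains only the 16 floats, so that it is blittable and can be pinned. `PinnedCopyDemo` and `Flatten` are hypothetical names.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct float4x4 {
  public float M11, M12, M13, M14;
  public float M21, M22, M23, M24;
  public float M31, M32, M33, M34;
  public float M41, M42, M43, M44;
}

// Hypothetical helper for illustration.
public static class PinnedCopyDemo {
  public static float[] Flatten(float4x4[] value) {
    var array = new float[value.Length * 16];

    // Pin the struct array so its address stays valid during the copy.
    GCHandle handle = GCHandle.Alloc(value, GCHandleType.Pinned);
    try {
      // Copies array.Length floats from unmanaged-style memory
      // (the pinned struct array) into the managed float array.
      Marshal.Copy(handle.AddrOfPinnedObject(), array, 0, array.Length);
    } finally {
      handle.Free(); // always release the pin
    }
    return array;
  }
}
```

The result should match the field-by-field loop from the question only while the declared field order and the sequential layout stay in sync, which is the caveat raised above.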

CodesInChaos