C# Array access vs C++ PInvoke pointer access

Question

I have an idea for optimising a large jagged array. Let's say I have this array in C#:

struct BlockData
{
    internal short type;
    internal short health;
    internal short x;
    internal short y;
    internal short z;
    internal byte connection;
}

BlockData[][][] blocks = null;

byte[] GetBlockTypes()
{
    if (blocks == null)
        blocks = InitializeJaggedArray<BlockData[][][]>(256, 64, 256);

    // BlockData is a struct
    MemoryStream stream = new MemoryStream();
    for (int x = 0; x < blocks.Length; x++)
    {
        for (int y = 0; y < blocks[x].Length; y++)
        {
            for (int z = 0; z < blocks[x][y].Length; z++)
            {
                // type is a short but WriteByte takes a byte: the cast keeps only the low 8 bits
                stream.WriteByte((byte)blocks[x][y][z].type);
            }
        }
    }
    return stream.ToArray();
}

Would storing the blocks as a BlockData*** in a C++ DLL and then using P/Invoke to read/write them be more efficient than storing them in C# arrays?

Note: I'm unable to run tests right now because my computer is in for service.

Why do you think a .NET array would be inefficient? The layout and total-space consumed by the arrays in the CLR would be very similar to the layout by whatever C/C++ allocator you're using (note that both the CLR and C/C++ compilers will add padding-bytes in-between struct fields). — Dai, May 04 '20 at 23:47

Dai · Answer 1 · 2020-05-04T23:51:47.390

1

Would storing the blocks as a BlockData*** in a C++ DLL and then using P/Invoke to read/write them be more efficient than storing them in C# arrays?

No, because P/Invoke has a significant overhead, whereas array access in C# .NET is compiled at runtime by the JIT to fairly efficient code with bounds-checks. Jagged-arrays in .NET also have adequate performance (the only weak-area in .NET is true multidimensional arrays, which is disappointing - but I don't believe your proposal would help that either).

Update: Multidimensional array performance in .NET Core actually seems worse than .NET Framework (if I'm reading this thread correctly).
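As a rough illustration of the kind of jagged-array loop the JIT tends to handle well, here is a minimal sketch. The names Block, JaggedTraversal, Create and SumTypes are made up for this example and are not from the question. The assumption is that when the innermost array sits in a local and the loop bound is that array's Length, the JIT can usually prove the index is in range and skip the bounds check.

```csharp
using System;

// Hypothetical stand-in for the question's BlockData, trimmed to one field.
struct Block
{
    public short Type;
}

static class JaggedTraversal
{
    // Hypothetical helper: builds a jagged array with the given dimensions.
    public static Block[][][] Create(int sizeX, int sizeY, int sizeZ)
    {
        var blocks = new Block[sizeX][][];
        for (int x = 0; x < sizeX; x++)
        {
            blocks[x] = new Block[sizeY][];
            for (int y = 0; y < sizeY; y++)
                blocks[x][y] = new Block[sizeZ];
        }
        return blocks;
    }

    // Sums the Type field of every block.
    public static long SumTypes(Block[][][] blocks)
    {
        long sum = 0;
        for (int x = 0; x < blocks.Length; x++)
        {
            Block[][] plane = blocks[x];      // hoist one level of indirection
            for (int y = 0; y < plane.Length; y++)
            {
                Block[] row = plane[y];       // innermost array held in a local
                for (int z = 0; z < row.Length; z++)
                    sum += row[z].Type;       // z is bounded by row.Length
            }
        }
        return sum;
    }
}
```

Whether the bounds checks are actually removed depends on the runtime version, so it is worth confirming with a benchmark on the target machine.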

edited May 04 '20 at 23:51

answered May 04 '20 at 23:46


And what about .NET Core multidimensional array, is it faster than jagged? – May 04 '20 at 23:48
@MattBrewer I've updated my answer. As RyuJIT is used in both .NET Core and .NET Framework 4.6 (but only for x64) and it gets worse benchmark perf in .NET Core it seems it's actually gotten worse... – Dai May 04 '20 at 23:52
Damn, and my application is performance critical. Guess there's nothing I can do – May 05 '20 at 00:05
@MattBrewer If performance is so critical that you feel you need a _microoptimization_ for arrays then you really shouldn't be using C# + .NET in the first place. – Dai May 05 '20 at 00:35

Christopher · Accepted Answer · 2020-05-05T00:15:51.803

This sounds like a question where you should first read the speed rant, starting at part 2: https://ericlippert.com/2012/12/17/performance-rant/

This is such a minuscule difference that, if it matters, you are probably in a realtime scenario. And .NET is the wrong choice for realtime scenarios to begin with. If you are in a realtime scenario, array access is not going to be the only thing you have to fight: GC memory management and security checks will get in your way too.

It is true that accessing an array in native C++ is faster than accessing it in .NET. .NET has the indexers as proper function calls, similar to properties, and .NET does verify that the index is valid. However, it is not as bad as you might think. The optimisations are pretty good. Function calls can be inlined. Repeated array accesses will be replaced with a temporary variable where possible. And even the bounds check is not safe from sensible removal. So it is not as big an advantage as you might think.

As others pointed out, P/Invoke will consume any gains there might be with its overhead. But actually going into a different environment is unnecessary:

The thing is, you can also use naked pointers in .NET. You have to enable them with unsafe code, but they are there. You can then acquire a piece of unmanaged memory and treat it like an array in native C++. Of course, that exposes you to mistakes like messing up the pointer arithmetic or overflowing the buffer, which are the exact reasons those checks exist in the first place!
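A minimal sketch of that approach, assuming the project is compiled with unsafe code enabled (AllowUnsafeBlocks). The class and method names are hypothetical. Marshal.AllocHGlobal and Marshal.FreeHGlobal are the standard .NET calls for unmanaged memory; everything else is illustration only.

```csharp
using System;
using System.Runtime.InteropServices;

static class UnmanagedShorts
{
    // Hypothetical demo: fills an unmanaged buffer through a raw pointer and sums it.
    public static unsafe long FillAndSum(int count)
    {
        IntPtr buffer = Marshal.AllocHGlobal(count * sizeof(short));
        try
        {
            short* p = (short*)buffer.ToPointer();
            for (int i = 0; i < count; i++)
                p[i] = (short)(i % 100);   // no bounds check: i >= count would corrupt memory

            long sum = 0;
            for (int i = 0; i < count; i++)
                sum += p[i];
            return sum;
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);   // the GC does not track or free this memory
        }
    }
}
```

Nothing stops the loop from running past count here, which is exactly the class of mistake the managed bounds check would have caught.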

score 0 · Answer 3 · answered May 05 '20 at 00:32

Another way to look at it is GC and overall maintenance. Your proposal is essentially the same as allocating one big array and using (layer * layerSize + row * rowSize + column) to index it. P/Invoke will give you the following drawbacks:

you likely end up with an unmanaged allocation for the array. This makes the GC unaware of a large amount of allocated memory, so you need to make sure to notify the GC about it.
P/Invoke calls can't be inlined, unlike regular .NET code, which the JIT can inline
you need to maintain code in two languages
P/Invoke is not as portable: it requires platform- and bitness-specific libraries and adds a lot of fun when sharing your program.

and one possible gain:

removing boundary checks performed by .Net on arrays

A back-of-the-napkin calculation suggests that at best the two will balance out in raw performance. I'd go with the .NET-only version, as it is easier to maintain and involves less fun with the GC.
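The single-array layout mentioned at the start of this answer could look roughly like the sketch below. FlatGrid and its members are hypothetical names, and storing only a short per cell is a simplification of the question's BlockData.

```csharp
using System;

// Hypothetical wrapper: one contiguous array addressed by (x, y, z).
sealed class FlatGrid
{
    private readonly short[] _cells;
    private readonly int _sizeY;
    private readonly int _sizeZ;

    public FlatGrid(int sizeX, int sizeY, int sizeZ)
    {
        _sizeY = sizeY;
        _sizeZ = sizeZ;
        _cells = new short[sizeX * sizeY * sizeZ];
    }

    public int Length => _cells.Length;

    // Same as x * (sizeY * sizeZ) + y * sizeZ + z,
    // i.e. layer * layerSize + row * rowSize + column.
    private int IndexOf(int x, int y, int z) => (x * _sizeY + y) * _sizeZ + z;

    public short this[int x, int y, int z]
    {
        get => _cells[IndexOf(x, y, z)];
        set => _cells[IndexOf(x, y, z)] = value;
    }
}
```

Note that only the combined index is bounds-checked: an out-of-range y or z can silently land on a valid cell of a neighbouring row or layer, so a debug-only range check may be worth adding.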

Additionally, when you hide chunk auto-generation or partially generated chunks behind the chunk's index method, it is easier to write the code in a single language... In reality, since fully populated chunks consume a lot of memory, your main issue would likely be memory usage and memory access cost rather than the raw performance of iterating through elements. Try and measure...
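One way to hide partially generated chunks behind an indexer is sketched below. LazyChunk, its Size of 16 and the convention that 0 means "empty" are all assumptions made for the example, not details from the question.

```csharp
using System;

// Hypothetical chunk that only allocates its storage on the first non-empty write.
sealed class LazyChunk
{
    public const int Size = 16;   // assumed edge length of a chunk
    private short[] _types;       // stays null while the chunk is entirely empty

    public bool IsAllocated => _types != null;

    private static int IndexOf(int x, int y, int z) => (x * Size + y) * Size + z;

    public short this[int x, int y, int z]
    {
        // An unallocated chunk reads as all zeros (assumed to mean "empty").
        get => _types == null ? (short)0 : _types[IndexOf(x, y, z)];
        set
        {
            if (_types == null)
            {
                if (value == 0) return;   // writing "empty" into an empty chunk needs no storage
                _types = new short[Size * Size * Size];
            }
            _types[IndexOf(x, y, z)] = value;
        }
    }
}
```

Callers index the chunk the same way whether or not it has been populated, so the memory cost is only paid for chunks that actually contain something.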
