I'm processing a lot of data in a 3D grid, so I wanted to implement a simple iterator instead of three nested loops. However, I ran into a performance problem. First, I implemented the iteration as a single loop using only the int variables x, y and z. Then I put the same three values into a struct (named Vector2I in the code below, although it has three components) and used that, and the calculation time more than doubled. Why is that? What did I do wrong?
Example for reproduction:
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.Runtime.CompilerServices;

public struct Vector2I
{
    public int X;
    public int Y;
    public int Z;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public Vector2I(int x, int y, int z)
    {
        this.X = x;
        this.Y = y;
        this.Z = z;
    }
}

public class IterationTests
{
    private readonly int _countX;
    private readonly int _countY;
    private readonly int _countZ;
    private Vector2I _Vector = new Vector2I(0, 0, 0);

    public IterationTests()
    {
        _countX = 64;
        _countY = 64;
        _countZ = 64;
    }

    [Benchmark]
    public void NestedLoops()
    {
        int countX = _countX;
        int countY = _countY;
        int countZ = _countZ;
        int result = 0;
        for (int x = 0; x < countX; ++x)
        {
            for (int y = 0; y < countY; ++y)
            {
                for (int z = 0; z < countZ; ++z)
                {
                    result += ((x ^ y) ^ (~z));
                }
            }
        }
    }

    [Benchmark]
    public void IteratedVariables()
    {
        int countX = _countX;
        int countY = _countY;
        int countZ = _countZ;
        int result = 0;
        int x = 0, y = 0, z = 0;
        while (true)
        {
            result += ((x ^ y) ^ (~z));
            ++z;
            if (z >= countZ)
            {
                z = 0;
                ++y;
                if (y >= countY)
                {
                    y = 0;
                    ++x;
                    if (x >= countX)
                    {
                        break;
                    }
                }
            }
        }
    }

    [Benchmark]
    public void IteratedVector()
    {
        int countX = _countX;
        int countY = _countY;
        int countZ = _countZ;
        int result = 0;
        Vector2I iter = new Vector2I(0, 0, 0);
        while (true)
        {
            result += ((iter.X ^ iter.Y) ^ (~iter.Z));
            ++iter.Z;
            if (iter.Z >= countZ)
            {
                iter.Z = 0;
                ++iter.Y;
                if (iter.Y >= countY)
                {
                    iter.Y = 0;
                    ++iter.X;
                    if (iter.X >= countX)
                    {
                        break;
                    }
                }
            }
        }
    }

    [Benchmark]
    public void IteratedVectorAvoidNew()
    {
        int countX = _countX;
        int countY = _countY;
        int countZ = _countZ;
        int result = 0;
        Vector2I iter = _Vector;
        iter.X = 0;
        iter.Y = 0;
        iter.Z = 0;
        while (true)
        {
            result += ((iter.X ^ iter.Y) ^ (~iter.Z));
            ++iter.Z;
            if (iter.Z >= countZ)
            {
                iter.Z = 0;
                ++iter.Y;
                if (iter.Y >= countY)
                {
                    iter.Y = 0;
                    ++iter.X;
                    if (iter.X >= countX)
                    {
                        break;
                    }
                }
            }
        }
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        BenchmarkRunner.Run<IterationTests>();
    }
}
What I measured:
Method | Mean | Error | StdDev |
----------------------- |---------:|----------:|----------:|
NestedLoops | 333.9 us | 4.6837 us | 4.3811 us |
IteratedVariables | 291.0 us | 0.8792 us | 0.6864 us |
IteratedVector | 702.1 us | 4.8590 us | 4.3073 us |
IteratedVectorAvoidNew | 725.8 us | 6.4850 us | 6.0661 us |
Note: 'IteratedVectorAvoidNew' is there because of a discussion suggesting that the problem might lie in the new operator applied to the vector struct. Originally, I used a custom iteration loop and measured with a stopwatch.
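For context on what 'IteratedVectorAvoidNew' actually does: a struct is a value type in C#, so `Vector2I iter = _Vector;` copies the three ints into a local rather than reusing the field. The minimal sketch below shows that copy behavior in isolation; `CopyDemo` and `FieldXAfterLocalMutation` are hypothetical names used only for illustration and are not part of the benchmark.

```csharp
using System;

public struct Vector2I
{
    public int X;
    public int Y;
    public int Z;

    public Vector2I(int x, int y, int z)
    {
        X = x;
        Y = y;
        Z = z;
    }
}

// Hypothetical demo class, not part of the benchmark code.
public static class CopyDemo
{
    private static Vector2I _vector = new Vector2I(1, 2, 3);

    // Mutates a local copy of the field, then returns the field's X.
    public static int FieldXAfterLocalMutation()
    {
        Vector2I iter = _vector; // value type: copies all three ints
        iter.X = 42;             // changes only the local copy
        return _vector.X;        // the field keeps its original value
    }

    public static void Main()
    {
        Console.WriteLine(FieldXAfterLocalMutation()); // prints 1
    }
}
```

This only illustrates the copy semantics. It makes no claim about why the struct versions are slower.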
Additionally, the same benchmark when iterating over a 256×256×256 grid:
Method | Mean | Error | StdDev |
----------------------- |---------:|----------:|----------:|
NestedLoops | 18.67 ms | 0.0504 ms | 0.0446 ms |
IteratedVariables | 18.80 ms | 0.2006 ms | 0.1877 ms |
IteratedVector | 43.66 ms | 0.4525 ms | 0.4232 ms |
IteratedVectorAvoidNew | 43.36 ms | 0.5316 ms | 0.4973 ms |
My environment:
- Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
- Windows 10, 64 bit
- Visual Studio 2017
- Language: C#
- Yes, I selected Release configuration
Notes:
My current task is to rewrite existing code so that it a) supports more features and b) is faster. I'm also working on lots of data, and this is the current bottleneck of the whole application, so no, it's not premature optimization.
About rewriting the nested loops into one: I'm not trying to optimize there. I need to write such iterations many times, so I simply wanted to simplify the code, nothing more. But because this is a performance-critical part of the code, I measure such design changes. Now that I see that simply storing three variables in a struct doubles the processing time, I'm quite wary of using structs like that.
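To make the goal concrete, here is a minimal sketch of what a reusable iterator over the grid could look like. `GridIterator`, `MoveNext`, and `GridIteratorDemo` are hypothetical names chosen for illustration; this is not the code that was benchmarked above, and it has not been measured.

```csharp
using System;

// Hypothetical reusable iterator; not part of the benchmark above.
public struct GridIterator
{
    public int X;
    public int Y;
    public int Z;

    private readonly int _countX;
    private readonly int _countY;
    private readonly int _countZ;

    public GridIterator(int countX, int countY, int countZ)
    {
        X = 0;
        Y = 0;
        Z = 0;
        _countX = countX;
        _countY = countY;
        _countZ = countZ;
    }

    // Advances in z-fastest order; returns false once past the last cell.
    public bool MoveNext()
    {
        if (++Z < _countZ) return true;
        Z = 0;
        if (++Y < _countY) return true;
        Y = 0;
        return ++X < _countX;
    }
}

public static class GridIteratorDemo
{
    // Same calculation as the benchmarks, driven by the iterator.
    public static int Sum(int countX, int countY, int countZ)
    {
        int result = 0;
        if (countX <= 0 || countY <= 0 || countZ <= 0) return result;

        var iter = new GridIterator(countX, countY, countZ);
        do
        {
            result += (iter.X ^ iter.Y) ^ (~iter.Z);
        } while (iter.MoveNext());
        return result;
    }

    // Reference implementation with three nested loops.
    public static int SumNested(int countX, int countY, int countZ)
    {
        int result = 0;
        for (int x = 0; x < countX; ++x)
            for (int y = 0; y < countY; ++y)
                for (int z = 0; z < countZ; ++z)
                    result += (x ^ y) ^ (~z);
        return result;
    }

    public static void Main()
    {
        // Both traversals visit the same cells, so the sums agree.
        Console.WriteLine(Sum(4, 4, 4) == SumNested(4, 4, 4)); // prints True
    }
}
```

The sketch only shows the shape of the API that would remove the repeated nested loops. Since it keeps its state in struct fields just like 'IteratedVector', it would need to be benchmarked the same way before being used in a hot path.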