Can someone please help me understand why accessing an array with a linearly incrementing index is approximately 3-4 times faster than accessing it with a random index?

Is there any way to make the random index accesses faster?

Please consider the following test code; it returns ~3 seconds for linear access and ~9-10s for random access:

    public static void test()
    {
        var arr = new byte[64 * 1024 * 1024];
        byte b = 0;

        var sw = new Stopwatch();

        double timeSum = 0;

        for (var i = 0; i < arr.Length; i++)
        {
            sw.Restart();
            b = arr[i];
            sw.Stop();
            timeSum += sw.Elapsed.TotalMilliseconds;
        }

        Console.WriteLine("Linear access : " + timeSum + " ms");

        timeSum = 0;

        var rng = new Random();
        var rnum = 0;
        for (var i = 0; i < arr.Length; i++)
        {
            rnum = rng.Next(0, arr.Length); // upper bound is exclusive, so this covers every index
            sw.Restart();
            b = arr[rnum];
            sw.Stop();
            timeSum += sw.Elapsed.TotalMilliseconds;
        }

        Console.WriteLine("Random access : " + timeSum + " ms");

    }
JKurcik
  • This is the way the CPU caches work. It loads data by contiguous blocks, so if you access the array from beginning to end, you have a high chance that the data is already in cache – Kevin Gosse Nov 25 '18 at 17:31
  • https://stackoverflow.com/questions/3928995/how-do-cache-lines-work – Kevin Gosse Nov 25 '18 at 17:32
  • @KevinGosse I agree, so post it as an answer. – ctrl-alt-delor Nov 25 '18 at 17:33
  • @InBetween That's a fair point. I reopened the question – Kevin Gosse Nov 25 '18 at 18:13
  • @InBetween thank you for your input. What would be a more realistic test? Also, IF these tests really do show a performance increase for sequential access, would this increase then be caused by what KevinGosse pointed out? – JKurcik Nov 25 '18 at 18:17
  • Also, please forgive the "awful" job I did of pointing out the difference. My aim was to come up with some evidence that there IS a difference, because I have seen this trend in our application - when the data is accessed randomly, the response times shoot up. – JKurcik Nov 25 '18 at 18:20
  • @JKurcik Sorry about that, I was too harsh. Benchmarking is hard, and this one is tricky too. – InBetween Nov 25 '18 at 18:44
  • @JKurcik And surprisingly enough, with arrays the size you are benchmarking, the performance loss in random access is staggering. I hope somebody looks into my benchmark and sees if something I'm doing is wrong. My previous comment was based on small arrays, but this blows up amazingly fast with big arrays. – InBetween Nov 25 '18 at 18:56
  • @InBetween Even the difference between L1 cache access and L2 is huge. Then if the CPU has to fetch the value from the main memory, you can expect the access times to go through the roof. Some reference numbers here https://stackoverflow.com/a/4087315/869621 – Kevin Gosse Nov 25 '18 at 19:33
  • @InBetween It should be noted that .NET inserts bounds checks when accessing an array. This means fetching the length of the array, which is stored at the very beginning of the array's memory. I don't know how it impacts the benchmark with large arrays, but you may be able to get more consistent results by replacing the array accesses with pointer accesses (using the `unsafe` keyword) – Kevin Gosse Nov 25 '18 at 19:35
  • @KevinGosse Thanks for all the input. I guess it *was* a dupe after all, my apologies! – InBetween Nov 25 '18 at 19:42
  • @KevinGosse as stated in a comment below InBetween's answer, using pointer access yielded a significant improvement. Does fixing the array eliminate bounds checks? – JKurcik Nov 25 '18 at 19:43
  • @JKurcik Fixing the array, no. Accessing the values through pointers, yes. – Kevin Gosse Nov 25 '18 at 19:44
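
To illustrate the pointer access discussed in the comments above, here is a minimal sketch; it assumes compilation with the `/unsafe` switch, and the method name `ReadAllUnsafe` is illustrative rather than code from the thread:

    // Sketch: pin the array with `fixed` and read through a raw pointer,
    // which avoids the bounds check the JIT emits for arr[i].
    public static unsafe byte ReadAllUnsafe(byte[] arr)
    {
        byte b = 0;
        fixed (byte* p = arr) // pin the array so the GC cannot move it
        {
            for (var i = 0; i < arr.Length; i++)
            {
                b = p[i]; // raw read: no bounds check
            }
        }
        return b;
    }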

1 Answer

The differences you are seeing in your benchmark (4 to 5 times) can't be explained by cache lines and sequential array access alone. It's true that sequential, predictable access will be faster, but unless you are managing huge arrays, I'd be surprised if the performance gains were even close to those figures.

EDIT: With arrays the size of the one in your benchmark (64 * 1024 * 1024 bytes, i.e. 64 MB), the difference is staggering, way more than I expected, to be honest; a 64 MB array is far larger than a typical CPU cache, so random look-ups constantly miss the cache and have to go out to main memory. So my first impression was dead wrong!

The problem is your benchmark. You are micro-measuring: there is no way to time individual look-ups with any sort of confidence using System.Diagnostics.Stopwatch.

Trying to come up with a fair benchmark is surprisingly tricky, because there is no easy way to isolate the random-number generation from the look-up itself. I haven't given it much thought, but the following at least tries to compare apples with apples: the trick is to pregenerate random and sequential index arrays and then benchmark double look-ups:

    using System;
    using System.Diagnostics;
    using System.Linq;

    static void Main(string[] args)
    {
        lookUpArray(1, new[] { 0 }, new[] { 0 }); // warm up the JIT

        var r = new Random();
        const int arraySize = 10000;
        const int repetitions = 10000;

        var lookupArray = new int[arraySize]; // the stored values don't matter
        var sequentialArray = Enumerable.Range(0, arraySize).ToArray();
        var randomArray = sequentialArray.Select(i => r.Next(0, arraySize)).ToArray();

        for (var i = 0; i < 10; i++)
        {
            var sw = Stopwatch.StartNew();
            lookUpArray(repetitions, lookupArray, randomArray);
            sw.Stop();
            Console.WriteLine($"Random: {sw.ElapsedMilliseconds} ms");

            sw.Restart();
            lookUpArray(repetitions, lookupArray, sequentialArray);
            sw.Stop();
            Console.WriteLine($"Sequential: {sw.ElapsedMilliseconds} ms");
        }
    }

    private static void lookUpArray(int repetitions, int[] lookupArray, int[] indexArray)
    {
        for (var r = 0; r < repetitions; r++)
        {
            for (var i = 0; i < indexArray.Length; i++)
            {
                var _ = lookupArray[indexArray[i]];
            }
        }
    }
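
The double look-up (`lookupArray[indexArray[i]]`) is the point of the design: both runs execute exactly the same instructions, and only the order in which `lookupArray` is touched differs, so any timing gap comes from the access pattern rather than from index generation.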

I am no benchmarking expert by any means, so this is probably awful in many ways, but I think it's a fairer comparison.

InBetween
  • @JKurcik Yup, I was the first one not to expect this difference. Your benchmark is still giving wrong results, but it's biased in the opposite direction from what I anticipated. – InBetween Nov 25 '18 at 19:08
  • Sorry for deleting my previous comment, I posted it before your edit. I believe your benchmark is much better, as it really makes the difference shine. I found that fixing the lookup array makes the difference much less pronounced, but I don't want to go unsafe in my application. And yes, truth is the problem appeared only after the amount of data being processed ramped up recently. – JKurcik Nov 25 '18 at 19:17