5

I've got some simple code:

    Int32[] tmpInt = new Int32[32];

    long lStart = DateTime.Now.Ticks;

    // Each thread hammers a different element of the same array with an
    // atomic increment; only the element indices differ between runs.
    Thread t1 = new Thread(new ThreadStart(delegate()
    {
        for (Int32 i = 0; i < 100000000; i++)
            Interlocked.Increment(ref tmpInt[5]);
    }));

    Thread t2 = new Thread(new ThreadStart(delegate()
    {
        for (Int32 i = 0; i < 100000000; i++)
            Interlocked.Increment(ref tmpInt[20]);
    }));

    t1.Start();
    t2.Start();

    t1.Join();
    t2.Join();

    // Elapsed time in milliseconds (10,000 ticks per millisecond).
    Console.WriteLine(((DateTime.Now.Ticks - lStart) / 10000).ToString());

This takes ~3 seconds on my Core 2 Duo. If I change the index in t1 to tmpInt[4], it takes ~5.5 seconds.

Anyway, the first cache line ends at index 4. Since a cache line is 64 bytes and 5 Int32s are only 20 bytes, that means there are 44 bytes of metadata and/or padding before the actual array data.

Another pair of values that I tested were 5 and 21. Indices 5 and 21 take ~3 seconds, while 5 and 20 take ~5.5 seconds; that's because index 20 shares the same cache line as index 5, since they fall within the same 64 bytes.
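
To make the line membership explicit, here is a small sketch of the arithmetic. The only measured input is that index 5 appears to start a new 64-byte line; everything else follows from the 4-byte element size. The constants are assumptions derived from the timings above, not something .NET guarantees:

    using System;

    class CacheLineMath
    {
        const int CacheLineBytes = 64;                            // typical line size on a Core 2 Duo
        const int ElementBytes   = sizeof(Int32);                 // 4 bytes per element
        const int IntsPerLine    = CacheLineBytes / ElementBytes; // 16 Int32s per line

        // Observed above: tmpInt[5] behaves as if it starts a new cache line,
        // so the element data must begin 44 bytes (64 - 5 * 4) into the first line.
        const int FirstIndexOfSecondLine = 5;

        static int LineOf(int index)
        {
            // Line 0 holds indices 0..4, line 1 holds 5..20, line 2 holds 21..36, ...
            return index < FirstIndexOfSecondLine
                ? 0
                : 1 + (index - FirstIndexOfSecondLine) / IntsPerLine;
        }

        static void Main()
        {
            foreach (int i in new[] { 4, 5, 20, 21 })
                Console.WriteLine("tmpInt[{0}] -> cache line {1}", i, LineOf(i));
            // 4 -> 0, 5 -> 1, 20 -> 1, 21 -> 2: matches the ~5.5 s (shared line)
            // versus ~3 s (separate lines) timings reported above.
        }
    }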

So my question is: how much data does .NET reserve before an array, and does this amount change between 32-bit and 64-bit systems?

Thanks :-)

Bengie
  • 1,035
  • 5
  • 10
  • 3
    Seems you draw a lot of firm conclusions from some simple code. Other factors could be involved. – H H Dec 09 '10 at 21:30
  • This is expected behavior of cache coherency trying to update dirty lines. Also, not only is it the behavior I'm looking for, I can't think of another process that would cause such a large timing difference for atomic operations within the same 64 bytes. – Bengie Dec 09 '10 at 21:34
  • 5
    Also, a comment on the code: use `System.Diagnostics.Stopwatch` to time code. `DateTime` is simply not reliable and a bad habit (even if it gets you close in this example). – Ron Warholic Dec 09 '10 at 21:41
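
For reference, here is a minimal sketch of the same workload with the timing rewritten around `System.Diagnostics.Stopwatch`, as that comment suggests; only the measurement changes, the rest mirrors the question's code:

    using System;
    using System.Diagnostics;
    using System.Threading;

    class StopwatchTiming
    {
        static void Main()
        {
            Int32[] tmpInt = new Int32[32];

            // Stopwatch uses a monotonic high-resolution counter, unlike DateTime.Now.
            Stopwatch sw = Stopwatch.StartNew();

            Thread t1 = new Thread(() =>
            {
                for (Int32 i = 0; i < 100000000; i++)
                    Interlocked.Increment(ref tmpInt[5]);
            });
            Thread t2 = new Thread(() =>
            {
                for (Int32 i = 0; i < 100000000; i++)
                    Interlocked.Increment(ref tmpInt[20]);
            });

            t1.Start();
            t2.Start();
            t1.Join();
            t2.Join();

            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);  // milliseconds, like Ticks / 10000 above
        }
    }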

2 Answers

5

When the CPU attempts to load your array and suffers a cache miss, it fetches the block of memory containing your array, but not necessarily *starting* with it. .NET makes no guarantee that your array will be cache-aligned.

To answer your question: the 44 bytes of padding are mostly other data from the associated page that happened to land in the same cache line.

Edit: http://msdn.microsoft.com/en-us/magazine/cc163791.aspx seems to indicate that an array has 16 bytes of additional storage: 4 bytes for the sync block index, 4 bytes for the type handle metadata, and the rest is the object itself.

As a side comment, it's hard to say for certain that false sharing is responsible for your delay here. It's likely, given the timings, but you should use a good profiler to examine the cache-miss rate. If it jumps noticeably in the shared-line case, you can be fairly sure false sharing is in play.
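
One quick way to test this point without a profiler is to compare a run where the two counters are adjacent against one where they are forced onto different lines. A minimal sketch, assuming 4-byte elements and a 64-byte line (so a spread of 16 elements can never land in one line):

    using System;
    using System.Diagnostics;
    using System.Threading;

    class FalseSharingProbe
    {
        const int Iterations = 100000000;

        static long TimeMs(int indexA, int indexB)
        {
            Int32[] counters = new Int32[64];   // 256 bytes: room for well-separated indices

            Thread a = new Thread(() =>
            {
                for (int i = 0; i < Iterations; i++)
                    Interlocked.Increment(ref counters[indexA]);
            });
            Thread b = new Thread(() =>
            {
                for (int i = 0; i < Iterations; i++)
                    Interlocked.Increment(ref counters[indexB]);
            });

            Stopwatch sw = Stopwatch.StartNew();
            a.Start(); b.Start();
            a.Join(); b.Join();
            return sw.ElapsedMilliseconds;
        }

        static void Main()
        {
            // Adjacent elements are almost certainly on one 64-byte line;
            // elements 16 Int32s (64 bytes) apart can never share one.
            Console.WriteLine("adjacent:          {0} ms", TimeMs(5, 6));
            Console.WriteLine("16 elements apart: {0} ms", TimeMs(5, 5 + 16));
        }
    }

If the first number is consistently much larger than the second, false sharing is the likely culprit; if they are roughly equal, something else dominates.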

Ron Warholic
  • 9,994
  • 31
  • 47
  • I did find a machine that I could use with the VS2010 profiler. The sync overhead for the test with more than a 16-byte spread was consistently a few percent lower, but not as much as I expected. What I did find was that the Kernel Page Fault blocking time was over 10x higher for the test where I expected cache-line sharing. – Bengie Dec 10 '10 at 14:05
  • I find it weird that it was categorized under "page fault", but this was a system with 4 GB of RAM and all other apps closed. No swapping was occurring. – Bengie Dec 10 '10 at 14:15
  • 1
    Another interesting note: no matter how many times I run it, tmpInt[4] seems to be on a different cache line than tmpInt[5]. Each allocation must always be aligned to the beginning of a cache line. – Bengie Dec 10 '10 at 14:36
  • @Bengie, Agreed. For me, the threshold is tmpInt[16] (int) or [8] (long), no matter what I do. So here the start of the array *data* is perfectly cache-aligned. Next, let's see if .NET tries to store the array metadata in the last cache line if possible, or always in a separate one. – Timo Feb 05 '16 at 08:26
  • Hm, can't seem to find out. GC.GetTotalMemory() simply increases by 4/8 per int/long I add to the array. And we can't pass properties by ref, so I haven't found a way to use Interlocked with the array's properties to see if they reside in the same cache line as an element. Any ideas? – Timo Feb 05 '16 at 08:40
1

In addition to the answer here: https://stackoverflow.com/a/1589806/543814

My tests indicated what I expected, on 32-bit [64-bit]:

  • Sync block: 4B [8B]
  • Array type pointer: 4B [8B]
  • Size of array: 4B [4B]
  • Element type pointer: 4B [8B] (reference arrays only)

In conclusion, there are 4 possibilities:

  • 12 bytes (32-bit value array)
  • 16 bytes (32-bit reference array)
  • 20 bytes (64-bit value array)
  • 28 bytes (64-bit reference array)

Something that I missed in the past: on a 64-bit machine with the project setting 'Prefer 32-bit' enabled (the default), the 32-bit sizes apply!
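
Purely as a restatement of the sizes listed above (not an official API), a tiny hypothetical helper that picks the expected header size for the current process; `IntPtr.Size` is 4 under 32-bit (including 'Prefer 32-bit') and 8 under 64-bit:

    using System;

    static class ArrayHeader
    {
        public static int ExpectedBytes(bool elementsAreReferences)
        {
            int ptr = IntPtr.Size;      // 4 on 32-bit, 8 on 64-bit
            int header = ptr            // sync block
                       + ptr            // array type pointer
                       + 4;             // array length
            if (elementsAreReferences)
                header += ptr;          // element type pointer
            return header;              // 12/16 on 32-bit, 20/28 on 64-bit
        }
    }

    // e.g. ArrayHeader.ExpectedBytes(false) == 12 in a 32-bit process,
    //      ArrayHeader.ExpectedBytes(true)  == 28 in a 64-bit process.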

Timo
  • 7,992
  • 4
  • 49
  • 67