
Summary

While processing a large text file, I came across the following (unexpected) performance degradation that I can't explain. My objectives for this question are:

  • Understand what causes the slowdown described below
  • Find out how to initialize large non-primitive arrays quickly

The Problem

  • Array contains non-primitive reference items
    • Not int[] but MyComplexType[]
    • MyComplexType is a class, not a struct
    • MyComplexType contains some string properties
  • Array is pre-allocated
  • Array is large
  • If an item is created and used without being assigned to the array, program is fast
  • If an item is created and then assigned to an array item, program slows down significantly
    • I expected array item assignment to be a simple reference assignment, but this does not seem to be the case based on my results from the test program below

Test program

Consider the following C# program:

namespace Test
{
    public static class Program
    {
        // Simple data structure
        private sealed class Item
        {
            public Item(int i)
            {
                this.Name = "Hello " + i;
                //this.Name = "Hello";
                //this.Name = null;
            }
            public readonly string Name;
        }

        // Test program
        public static void Main()
        {
            const int length = 1000000;
            var items = new Item[length];

            // Create one million items but don't assign to array
            var w = System.Diagnostics.Stopwatch.StartNew();
            for (var i = 0; i < length; i++)
            {
                var item = new Item(i);
                if (!string.IsNullOrEmpty(item.Name)) // reference the item and its Name property
                {
                    items[i] = null; // do not remember the item
                }
            }
            System.Console.Error.WriteLine("Without assignment: " + w.Elapsed);

            // Create one million items and assign to array
            w.Restart();
            for (var i = 0; i < length; i++)
            {
                var item = new Item(i);
                if (!string.IsNullOrEmpty(item.Name)) // reference the item and its Name property
                {
                    items[i] = item; // remember the item
                }
            }
            System.Console.Error.WriteLine("   With assignment: " + w.Elapsed);
        }
    }
}

It contains two almost-identical loops. Each loop creates one million instances of the Item class. The first loop uses the created item and then throws away the reference (not persisting it in the items array). The second loop uses the created item and then stores the reference in the items array. The array item assignment is the only difference between the two loops.

My Results

  • When I run Release build (optimizations turned on) on my machine, I get the following results:

    Without assignment: 00:00:00.2193348
       With assignment: 00:00:00.8819170
    

    The loop with array assignment is significantly slower than the one without (~4x slower).

  • If I change Item constructor to assign a constant string to Name property:

    public Item(int i)
    {
        //this.Name = "Hello " + i;
        this.Name = "Hello";
        //this.Name = null;
    }
    

    I get the following results:

    Without assignment: 00:00:00.0228067
       With assignment: 00:00:00.0718317
    

    The loop with assignment is still ~3x slower than the one without.

  • Finally if I assign null to the Name property:

    public Item(int i)
    {
        //this.Name = "Hello " + i;
        //this.Name = "Hello";
        this.Name = null;
    }
    

    I get the following result:

    Without assignment: 00:00:00.0146696
       With assignment: 00:00:00.0105369
    

    Once no string is allocated, the version without assignment is finally slightly slower (I assume because all those instances are released for garbage collection)

Questions

  • Why is array item assignment slowing the test program so much?

  • Is there an attribute/language construct/etc that will speed up the assignment?

PS: I tried investigating the slowdown using dotTrace, but it was inconclusive. One thing I saw was a lot more string copying and garbage collection overhead in the loop with assignment than in the loop without assignment (even though I expected the reverse).
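For what it's worth, one way to get a more direct signal than a profiler is to count garbage collections around each loop with GC.CollectionCount. This is a sketch only; the Item class and loop bodies mirror the test program above, and the `GcProbe`/`Run` names are made up for illustration:

```csharp
using System;
using System.Diagnostics;

public static class GcProbe
{
    // Same shape as the Item class in the question's test program.
    private sealed class Item
    {
        public readonly string Name;
        public Item(int i) { this.Name = "Hello " + i; }
    }

    public static void Main()
    {
        const int length = 1000000;
        var items = new Item[length];
        Run("Without assignment", items, keep: false);
        Run("   With assignment", items, keep: true);
    }

    private static void Run(string label, Item[] items, bool keep)
    {
        // Snapshot collection counts before the loop.
        int gen0 = GC.CollectionCount(0);
        int gen2 = GC.CollectionCount(2);
        var w = Stopwatch.StartNew();
        for (var i = 0; i < items.Length; i++)
        {
            var item = new Item(i);
            if (!string.IsNullOrEmpty(item.Name))
            {
                items[i] = keep ? item : null;
            }
        }
        Console.Error.WriteLine(label + ": " + w.Elapsed
            + " (Gen0 collections: " + (GC.CollectionCount(0) - gen0)
            + ", Gen2 collections: " + (GC.CollectionCount(2) - gen2) + ")");
    }
}
```

If the slowdown is GC-related, the two variants should show noticeably different collection counts rather than just different times.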

Milan Gardian
    What's the timing like if you switch the order of the loops? – NG. Nov 05 '13 at 20:10
  • @SB about the same. I ran loop without assignment first so that it absorbs any cold-start overhead. – Milan Gardian Nov 05 '13 at 20:11
  • What's the time difference between a loop that assigns an item to an array vs a loop that assigns each item to 2 different arrays? – mbeckish Nov 05 '13 at 20:11
  • @mbeckish I haven't measured two arrays, I only need to populate one (note that this is also related to other .NET collections like List, because they use arrays under the hood) – Milan Gardian Nov 05 '13 at 20:18
  • My point is that you seem to be implying that the array assignment operation is what takes too long. If you assign to 2 arrays, and it takes significantly longer than assigning to one array, then that might be the answer. However, if the time difference is small, which I expect it might be, then the answer is probably something else, as @SB. suggests. – mbeckish Nov 05 '13 at 20:21
  • 3
    @mbeckish: I tested this, and the time is roughly the same whether you assign to a single array, or to two arrays. So it does not seem to be related to array assignment. – mellamokb Nov 05 '13 at 20:24

6 Answers


I suspect most of the timing issues are related to memory allocation.

When you assign the items into the array, they never become eligible for garbage collection. When you have a string property that isn't a constant (interned) or null, this is going to drive your memory allocation requirements up.

In the first case, I suspect what's happening is that you're churning through the objects fast, so they stay in Gen0, and can be GCed quickly, and that memory segment can be reused. This means that you're never having to allocate more memory from the OS.

In the second case, you're creating a new string inside each object, so every iteration makes two allocations (the Item and its string), and then you're storing the items so they aren't eligible for GC. At some point the current heap segment fills up, and more memory has to be allocated from the OS.

As for your final check - when you set the Name to null, the if (!string.IsNullOrEmpty(item.Name)) check will prevent it from being added. As such, the two code paths, and therefore the timings, become (effectively) identical, though the first is marginally slower (most likely due to the JIT running the first time).
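One way to test this hypothesis is to keep the array write but immediately drop the reference again, so the only remaining difference is GC eligibility. This is a sketch meant as a drop-in replacement for the second loop in the question's test program (it relies on the `length`, `items`, and `Item` definitions from that program); if the theory is right, its timing should land close to the no-assignment loop:

```csharp
// Same loop as the "with assignment" case, except the stored
// reference is cleared right away, so every Item (and its string)
// stays eligible for a cheap Gen0 collection.
for (var i = 0; i < length; i++)
{
    var item = new Item(i);
    if (!string.IsNullOrEmpty(item.Name))
    {
        items[i] = item; // the array write still happens
        items[i] = null; // but the object remains collectible
    }
}
```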

Reed Copsey
    This is correct. If you add the line `items[i] = null;` immediately after the line `items[i] = item;` then suddenly the times become almost the same for both loops. I think that this shows that it is the GC and memory handling that is causing the slowdown. – Matthew Watson Nov 05 '13 at 20:28

My guess is that the compiler is being really smart and sees you don't need to do anything significant with Item in the case where you're not assigning it. It probably just reuses the Item object memory in the first loop since it can. In the second loop bits of heap need to be allocated since they're all independent and referenced later.

I guess this kind of agrees with what you saw related to garbage collection. One item is created in the first loop vs. many.

A quick note - the first loop is probably using object pooling as its optimization. This article may provide insight. As Reed is quick to point out, the article talks about app-level optimizations, but I imagine the allocator itself has a lot of optimizations that do similar things.

NG.
    There is no built-in object pooling - that article was talking about writing a manual pooling algorithm... The framework is going to be creating the object each time in both loops. – Reed Copsey Nov 05 '13 at 20:24
  • Under the covers there are probably a lot of memory re-use optimizations. Java has done similar things, so I imagine C# runtime has as well. – NG. Nov 05 '13 at 20:25
  • The GC will reuse memory segments, to prevent having to request memory from the OS, but it doesn't reuse the objects directly. This is one place where many of the Java runtimes do far more optimizations than the CLR... – Reed Copsey Nov 05 '13 at 20:27
  • So it sounds like the GC is pooling memory segments then? Alright so I misspoke, but the gist is still correct ;) – NG. Nov 05 '13 at 20:31

I don't believe this has anything to do (really) with array assignment. It has to do with how long the item and its contained objects have to be kept around, just in case you may later reference them. It comes down to heap allocation and garbage collection generations.

When first allocated, the item and its strings will be in generation 0. This generation is garbage collected often, and its memory is very hot, maybe even cached. It's very likely that within the next few iterations of the loop the whole of generation 0 will be GC'ed and the memory reused for new items and their strings. When we add the assignment to the array, the object cannot be garbage collected because there is still a reference to it. This causes increased memory consumption.

I believe you will see memory increasing during the execution of your code: the problem is heap allocation combined with cache misses, because the program always has to use "fresh" memory and cannot benefit from hardware memory caching.
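A rough way to observe that retention directly is to compare GC.GetTotalMemory before and after the loop. This is a sketch reusing the shape of the question's Item class (the `MemoryProbe` name is made up for illustration), and GetTotalMemory only gives an approximation:

```csharp
using System;

public static class MemoryProbe
{
    // Same shape as the Item class in the question's test program.
    private sealed class Item
    {
        public readonly string Name;
        public Item(int i) { this.Name = "Hello " + i; }
    }

    public static void Main()
    {
        const int length = 1000000;
        var items = new Item[length];

        long before = GC.GetTotalMemory(true); // settle the heap first
        for (var i = 0; i < length; i++)
        {
            items[i] = new Item(i); // every Item and string stays reachable
        }
        long after = GC.GetTotalMemory(false);

        Console.WriteLine("Retained roughly "
            + ((after - before) / (1024 * 1024)) + " MB");
    }
}
```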

dsz

To try to solve your actual problem (although this was an interesting puzzle), I would recommend a few things:

  1. Do not store the concatenated strings on construction. Use a get accessor to return the string values. That takes the string concatenation out of the picture when diagnosing array allocation. If you want to "cache" the computed value when you first get it that should be OK.
  2. Run dotTrace against your actual program to get a better idea of where the time is being spent. Since there's nothing you can do to speed up array assignment itself, you might look for other areas that you can change.
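A minimal sketch of suggestion 1, deferring the concatenation to a caching get accessor (the backing-field name is made up for illustration):

```csharp
private sealed class Item
{
    private readonly int i;
    private string name; // filled in on first access

    public Item(int i) { this.i = i; } // no string work in the ctor

    public string Name
    {
        // Concatenate (and cache) only when the value is actually read.
        get { return name ?? (name = "Hello " + i); }
    }
}
```

With this change the construction loop allocates only the Item objects, which separates the cost of the array writes from the cost of building the strings.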
D Stanley

In my opinion, you are a victim of branch prediction. Let's look in detail at what you are doing:

In the "Without assignment" case, you just assign null to every element of the array items; by doing so, the processor learns, after some iterations of the for loop, that you are always assigning the same value (even null) to the array items, so the branch becomes trivially predictable: your program will run faster.

In the "With assignment" case, the processor has no idea about the progression of the newly generated items: the if statement has to be evaluated on every iteration of the for loop; this leads to a program running slower...

This behavior relies on a part of the processor hardware called the Branch Prediction Unit (which consumes a significant proportion of the transistors on the chip...). A similar topic is well illustrated here: Why is it faster to process a sorted array than an unsorted array?

moose

Okay, I'm still looking, but MSDN suggests you use a collection (presumably List<T>, Dictionary<TKey, TValue>, or something similar) rather than an array. From the MSDN documentation:

Class library designers might need to make difficult decisions about when to use an array and when to return a collection. Although these types have similar usage models, they have different performance characteristics. In general, you should use a collection when Add, Remove, or other methods for manipulating the collection are supported.

Maybe there's something in the .NET specification docs.

AllenG