5

What is the special case with the foreach/for loop that eliminates bounds checking? Also which bounds checking is it?

leventov
Joan Venge

5 Answers

10

The standard

for(int i = 0; i < array.Length; i++) {
    ...
}

loop is the one that allows the JIT to safely remove array bounds checks (the checks that the index is within [0..length-1]).

The foreach loop over arrays is equivalent to that standard for loop over arrays.

EDIT: As Robert Jeppesen points out:

This will be optimized if the array is local. If the array is accessible from other locations, bounds checking will still be performed. Reference: Array Bounds Check Elimination in the CLR

Thanks! Didn't know that myself.
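The local-versus-field distinction from the edit above can be sketched as follows. This is only an illustration: the class and method names (`Samples`, `SumViaField`, `SumViaLocal`) are made up for the example, and whether the bounds check is actually removed depends on the runtime and JIT version, so inspect the generated code before relying on it.

```csharp
using System;

// Hypothetical class, used only to illustrate the field-versus-local distinction.
public sealed class Samples
{
    private readonly int[] data;

    public Samples(int[] data)
    {
        if (data == null)
        {
            throw new ArgumentNullException("data");
        }
        this.data = data;
    }

    // Indexes the field directly. According to the referenced article, the JIT
    // may keep the bounds check here because the field is reachable from
    // other methods and threads.
    public int SumViaField()
    {
        int sum = 0;
        for (int i = 0; i < data.Length; i++)
        {
            sum += data[i];
        }
        return sum;
    }

    // Copies the array reference into a local first, so the loop has the
    // "i < local.Length ... local[i]" shape on a local variable.
    public int SumViaLocal()
    {
        int[] local = data;
        int sum = 0;
        for (int i = 0; i < local.Length; i++)
        {
            sum += local[i];
        }
        return sum;
    }
}
```

Both methods return the same result; the only difference is which variable the loop condition and the indexer refer to.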

Glorfindel
Christian Klauser
  • Yes, that is correct. A 'for' loop is emitted for a foreach over an array. – leppie Mar 10 '09 at 17:17
  • Thanks. When you said "whether the index is within [0..length-1]", how does the compiler know I haven't used i*2 inside the for loop? Or is the bounds checking inside the "for" line? – Joan Venge Mar 10 '09 at 17:33
  • 1
    I guess that optimization is only applied to exact occurrences of `array[i]`. Consequently, `array[i+1]` will most likely result in an array bounds check. – Christian Klauser Mar 11 '09 at 14:09
  • "A 'for' loop is emitted for a 'foreach' over an array." Not in framework 1.0! :P – Lucas May 19 '09 at 17:45
  • 2
    This will be optimized if the array is local. If the array is accessible from other locations, bounds checking will still be performed. Reference: http://blogs.msdn.com/b/clrcodegeneration/archive/2009/08/13/array-bounds-check-elimination-in-the-clr.aspx – Robert Jeppesen Mar 21 '11 at 09:32
  • Does a private member-array count as "local," since it's not accessible from other locations? – BlueRaja - Danny Pflughoeft May 01 '13 at 12:20
  • @BlueRaja-DannyPflughoeft The article linked says "no" since the field is still accessible from other methods called directly or indirectly in your loop (or from a different thread). Maybe `readonly` would help. But you can always copy the array reference into a local variable. For bounds checking, only the reference needs to be local (array sizes don't change after allocation). – Christian Klauser May 01 '13 at 14:48
7

SealedSun is right. Don't optimize the way you would in C++. The JIT is smart enough to do the right thing for you. You can always code the loop in different ways and then inspect the code the JIT generates (the listings below are x86 disassembly, not IL).

        static void Main(string[] args)
        {
            int[] array = new int[100];
00000000  push        edi  
00000001  push        esi  
00000002  push        eax  
00000003  xor         eax,eax 
00000005  mov         dword ptr [esp],eax 
00000008  mov         edx,64h 
0000000d  mov         ecx,79174292h 
00000012  call        49E73198 
00000017  mov         esi,eax 
            int sum = 0;
00000019  xor         edx,edx 
0000001b  mov         dword ptr [esp],edx 
            for(int index = 0; index < array.Length; index++)
0000001e  mov         edi,dword ptr [esi+4] 
00000021  test        edi,edi 
00000023  jle         00000033 
            {
                sum += array[index];
00000025  mov         eax,dword ptr [esi+edx*4+8] 
00000029  add         dword ptr [esp],eax 
            for(int index = 0; index < array.Length; index++)
0000002c  add         edx,1 
0000002f  cmp         edi,edx 
00000031  jg          00000025 
            }

            Console.WriteLine(sum.ToString());
00000033  mov         esi,dword ptr [esp] 
00000036  call        493765F8 
0000003b  push        eax  
0000003c  mov         ecx,esi 
0000003e  xor         edx,edx 
00000040  call        49E83A8B 
00000045  mov         edi,eax 
00000047  mov         edx,88h 
0000004c  mov         ecx,1 
00000051  call        49E731B0 
00000056  mov         esi,eax 
00000058  cmp         dword ptr [esi+70h],0 
0000005c  jne         00000068 
0000005e  mov         ecx,1 
00000063  call        4936344C 
00000068  mov         ecx,dword ptr [esi+70h] 
0000006b  mov         edx,edi 
0000006d  mov         eax,dword ptr [ecx] 
0000006f  call        dword ptr [eax+000000D8h] 
00000075  pop         ecx  
        }
00000076  pop         esi  
00000077  pop         edi  
00000078  ret              

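Now if you optimize the code the way you would in C++, by caching the length in a local variable, you get the following. The compare and jump marked HERE are the bounds check that the first version did not need: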

        static void Main(string[] args)
        {
            int[] array = new int[100];
00000000  push        edi  
00000001  push        esi  
00000002  push        ebx  
00000003  push        eax  
00000004  xor         eax,eax 
00000006  mov         dword ptr [esp],eax 
00000009  mov         edx,64h 
0000000e  mov         ecx,79174292h 
00000013  call        49E73198 
00000018  mov         esi,eax 
            int sum = 0;
0000001a  xor         edx,edx 
0000001c  mov         dword ptr [esp],edx 

            int length = array.Length;
0000001f  mov         ebx,dword ptr [esi+4] 
            for (int index = 0; index < length; index++)
00000022  test        ebx,ebx 
00000024  jle         0000003B 
00000026  mov         edi,dword ptr [esi+4] 
            {
                sum += array[index];
00000029  cmp         edx,edi                  <-- HERE
0000002b  jae         00000082                 <-- HERE
0000002d  mov         eax,dword ptr [esi+edx*4+8] 
00000031  add         dword ptr [esp],eax 
            for (int index = 0; index < length; index++)
00000034  add         edx,1 
00000037  cmp         edx,ebx 
00000039  jl          00000029 
            }

            Console.WriteLine(sum.ToString());
0000003b  mov         esi,dword ptr [esp] 
0000003e  call        493765F8 
00000043  push        eax  
00000044  mov         ecx,esi 
00000046  xor         edx,edx 
00000048  call        49E83A8B 
0000004d  mov         edi,eax 
0000004f  mov         edx,88h 
00000054  mov         ecx,1 
00000059  call        49E731B0 
0000005e  mov         esi,eax 
00000060  cmp         dword ptr [esi+70h],0 
00000064  jne         00000070 
00000066  mov         ecx,1 
0000006b  call        4936344C 
00000070  mov         ecx,dword ptr [esi+70h] 
00000073  mov         edx,edi 
00000075  mov         eax,dword ptr [ecx] 
00000077  call        dword ptr [eax+000000D8h] 
0000007d  pop         ecx  
        }
0000007e  pop         ebx  
0000007f  pop         esi  
00000080  pop         edi  
00000081  ret              
00000082  call        4A12746B 
00000087  int         3    

By the way, here is the same with the foreach statement:

        static void Main(string[] args)
        {
            int[] array = new int[100];
00000000  push        edi  
00000001  push        esi  
00000002  push        eax  
00000003  xor         eax,eax 
00000005  mov         dword ptr [esp],eax 
00000008  mov         edx,64h 
0000000d  mov         ecx,79174292h 
00000012  call        49E73198 
00000017  mov         esi,eax 
            int sum = 0;
00000019  xor         edx,edx 
0000001b  mov         dword ptr [esp],edx 
            for(int index = 0; index < array.Length; index++)
0000001e  mov         edi,dword ptr [esi+4] 
00000021  test        edi,edi 
00000023  jle         00000033 
            {
                sum += array[index];
00000025  mov         eax,dword ptr [esi+edx*4+8] 
00000029  add         dword ptr [esp],eax 
            for(int index = 0; index < array.Length; index++)
0000002c  add         edx,1 
0000002f  cmp         edi,edx 
00000031  jg          00000025 
            }

            Console.WriteLine(sum.ToString());
00000033  mov         esi,dword ptr [esp] 
00000036  call        493765F8 
0000003b  push        eax  
0000003c  mov         ecx,esi 
0000003e  xor         edx,edx 
00000040  call        49E83A8B 
00000045  mov         edi,eax 
00000047  mov         edx,88h 
0000004c  mov         ecx,1 
00000051  call        49E731B0 
00000056  mov         esi,eax 
00000058  cmp         dword ptr [esi+70h],0 
0000005c  jne         00000068 
0000005e  mov         ecx,1 
00000063  call        4936344C 
00000068  mov         ecx,dword ptr [esi+70h] 
0000006b  mov         edx,edi 
0000006d  mov         eax,dword ptr [ecx] 
0000006f  call        dword ptr [eax+000000D8h] 
00000075  pop         ecx  
        }
00000076  pop         esi  
00000077  pop         edi  
00000078  ret              

Don't try to optimize your code without numbers. As you can see, the JIT will do a lot for you if you don't stand in its way. Use a profiler before you optimize. ALWAYS.
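To repeat the comparison yourself, it may help to isolate each loop shape in its own method and then look at the disassembly or timings for each. The following is a minimal sketch; the class and method names are invented for the example, and the listings above come from one particular 32-bit JIT, so other runtime versions may generate different code.

```csharp
using System;

// Hypothetical helper class: one method per loop shape discussed above.
public static class LoopShapes
{
    // Shape 1: array.Length appears directly in the loop condition.
    public static int SumLengthInCondition(int[] array)
    {
        int sum = 0;
        for (int index = 0; index < array.Length; index++)
        {
            sum += array[index];
        }
        return sum;
    }

    // Shape 2: the length is cached in a local first (the "C++ style" version).
    public static int SumCachedLength(int[] array)
    {
        int sum = 0;
        int length = array.Length;
        for (int index = 0; index < length; index++)
        {
            sum += array[index];
        }
        return sum;
    }

    // Shape 3: foreach over the array.
    public static int SumForeach(int[] array)
    {
        int sum = 0;
        foreach (int value in array)
        {
            sum += value;
        }
        return sum;
    }
}
```

All three return the same sum, so any difference you measure comes from the code the JIT generates for each shape.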

David Pokluda
6

See this for details:

http://codebetter.com/blogs/david.hayden/archive/2005/02/27/56104.aspx

Basically, if you have a for loop whose condition explicitly refers to IList.Count or Array.Length, the JIT will recognize that and skip the bounds checking. That makes it faster than caching the length in a local variable beforehand.

foreach on a list or array will do the same thing internally, I believe.

Reed Copsey
  • What happens if you refer to IList.Count but do an off-by-one error and end up accessing stuff outside the bounds? Or is it smarter than that? – rjh Mar 10 '09 at 16:47
  • 1
    It only optimizes if it's in the for loop. If you refer to list.Count inside the for block, then that bounds checking won't be optimized. ie: for (int i=0;i – Reed Copsey Mar 10 '09 at 17:07
  • I think List bounds are still checked in 'for' loops because its Count can change at anytime ('foreach' throws an exception if collection changes). – Lucas May 19 '09 at 17:44
  • @Lucas: If the collection isn't touched inside the loop, the JIT optimizes this. There are quite a few articles showing this, but you can do the test yourself. – Reed Copsey May 19 '09 at 17:51
  • @Reed: "If the collection isn't touched inside the loop", but what if it's touched *outside* the loop? The List could be passed in as an argument and another thread could Add() or Remove() items. The JIT can't make any assumptions or optimizations because the List's Count can change (unlike an array's). – Lucas May 19 '09 at 19:37
  • @Reed: Thanks, but every article I've found talks about optimizing *array* bounds checking, not Lists. List's indexed property Item/get_Item() checks the index before returning this._list[index], so there is an explicit bounds check anyway even if the internal array access is somehow optimized. And array fields are not bounds-check-optimized anyway, only local or passed-in arrays are. Unless of course List get special treatment from the JIT compiler, and even then it can't apply to *any* IList. – Lucas May 19 '09 at 19:40
0

A foreach loop uses an enumerator, which is a class or structure that handles the looping. The enumerator has a Current property that returns the current item from the collection. That eliminates the use of an index to access the item in the collection, so the extra step to get the item, including bounds checking, is not needed.
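As a rough sketch of the enumerator pattern described here, the two methods below show a foreach over a `List<int>` next to an approximate hand-written equivalent. The class and method names are made up for the example. Note that, as the comments on this answer point out, a foreach over an array is compiled to an indexed loop instead, so this sketch applies to collections such as `List<T>`, not to arrays.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class showing foreach and its approximate expansion.
public static class ForeachExpansion
{
    public static int SumWithForeach(List<int> list)
    {
        int sum = 0;
        foreach (int item in list)
        {
            sum += item;
        }
        return sum;
    }

    // Approximately what the foreach above turns into: get the enumerator,
    // call MoveNext() until it returns false, read Current, then dispose.
    public static int SumWithEnumerator(List<int> list)
    {
        int sum = 0;
        List<int>.Enumerator enumerator = list.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                sum += enumerator.Current;
            }
        }
        finally
        {
            enumerator.Dispose();
        }
        return sum;
    }
}
```

The calling code never uses an index, but the enumerator still has to reach the underlying storage somehow, which is where the comments below say the checks end up.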

Guffa
  • Since framework 1.1, 'foreach' loops over *arrays* become simple 'for' loops to avoid IEnumerator overhead (creating enumerator instance, method-calling, disposing), so bounds-checking can still be optimized away. – Lucas May 19 '09 at 17:54
  • "bounds checking is not needed" Bounds checking is still done inside the enumerator's MoveNext() method, such as when List accesses its internal array through an index. – Lucas May 19 '09 at 17:55
-3

What? I'm not sure if it is even possible to eliminate bounds checking in C#. If you want unsafe (pointer-based) code, then use:

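int[] array = new int[100];
unsafe // requires compiling with the /unsafe option
{
    fixed (int* start = array)
    {
        int* p = start; // a fixed pointer is read-only, so advance a copy
        while (true)    // no bounds check: this walks past the end of the array
            Console.WriteLine("{0}", *p++);
    }
}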

for example - it doesn't check bounds, and dies terribly. :-)

nothrow