Looks like you're looking for an explanation of the algorithm so I'll try to explain it.
You're "rewriting" the array in place. Think of having a cursor traversing the array reading the data, while simultaneously having a cursor to write values behind it. The "read" cursor in this case could be a foreach
loop or a for
loop with an indexer.
Test data:
[12,34,56,34,78]
and we want to remove 34
, we start with both cursors at position 0 (i.e. values[0]
) with the "newLength = 0" representing the length of the "new" array and thus where the "write" cursor currently resides:
[12,34,56,34,78]
^r
^w
newLength: 0
The first element read, 12
, does not match, so we include that element in the "new" array by writing it to the position in the array signified by the current length of the "new" array (to start, the length is 0, so that's values[0]
. In this case, we're writing the same value to that element, so nothing changes and we move the write cursor to the next position by incrementing the length of the "new" array
[12,34,56,34,78]
^r
^w
newLength: 1
Now the next element does match the value to remove so we don't want to include it in the new array. We do this by not increasing the length of the "new" array and leaving that write cursor where it is while the read cursor moves on to the next element:
[12,34,56,34,78]
^r
^w
newLength: 1
If this array was only two elements, we're done, and no values in the array have changed but the returned length is only 1
. As we have more elements, let's continue to see what happens. We read 56
, which does not match the value to remove, so we "write" it at the position specified by the new "length", after which we increment the length:
[12,56,56,34,78]
^r
^w
newLength: 2
Now we read a value that matches, so we skip writing it:
[12,56,56,34,78]
^r
^w
newLength: 2
And the final value doesn't match, so we write it at the position specified by "length" and increment the length:
[12,56,78,34,78]
^r
^w
newLength: 3
The "read" cursor now exceeds the length of the array so we're done. The new "length" value is now 3
since we "wrote" three values to the array. Using the array in conjunction with the newLength
value functionally results in the "new" array [12,56,78]
.
Here's a safe implementation of @TheGeneral's functionally correct but unsafe implementation using pointers:
public static int DoStuffSafely( int[] values, int valueToRemove, int length )
{
var newLength = 0;
// ~13.5% slower than @TheGeneral's unsafe implementation
foreach( var v in values )
{
// if the value matches the value to remove, skip it
if( valueToRemove != v )
{
// newLength tracks the length of the "new" array
// we'll set the value at that index
values[ newLength ] = v;
// then increase the length of the "new" array
++newLength;
// I never use post-increment/decrement operators
// it requires a temp memory alloc that we toss immediately
// which is expensive for this simple loop, adds about 11% in execution time
}
}
return newLength;
}
Edit: for @Richardissimo
values[ newLength ] = v;
00007FFBC1AC8993 cmp eax,r10d
00007FFBC1AC8996 jae 00007FFBC1AC89B0
00007FFBC1AC8998 movsxd rsi,eax
00007FFBC1AC899B mov dword ptr [r8+rsi*4+10h],r11d
++newLength;
00007FFBC1AC89A0 inc eax
values[ newLength++ ] = v;
00007FFBC1AD8993 lea esi,[rax+1]
00007FFBC1AD8996 cmp eax,r10d
00007FFBC1AD8999 jae 00007FFBC1AD89B3
00007FFBC1AD899B movsxd rax,eax
00007FFBC1AD899E mov dword ptr [r8+rax*4+10h],r11d
00007FFBC1AD89A3 mov eax,esi