4

It's a common situation to copy a range from an array. C# supports this operation in various ways, for instance using Array.Copy, but also by Linq's Skip and Take combination.

Starting at .NET 4.0, do the Skip and Take operations still add considerable overhead, or do they recognize (either at compile time or at run time) their target is an array?

To clarify: I am referring to a) the time taken to skip the initial bytes and b) to the combination of Skip-Take-ToArray, which does not suffer from side effects. For instance, new byte[10000].Skip(9999).Take(1).ToArray() could be optimized this way and would (for large n) be noticeably faster.

mafu
  • 31,798
  • 42
  • 154
  • 247
  • 1
    They can't work in O(1) for arrays. How do you expect a lazily evaluated sequence to ever be O(1)? Just imagine if I did a `.Skip(1)` on a 10,000,000 element array - are you expecting a 9,999,999 element memory copy? That would be very inefficient if I then immediately did a `.Take(1)`. – Enigmativity Nov 01 '14 at 02:56
  • @Enigmativity: There's the size of the array... and then there's the argument passed to `Skip`. Anyway, neither `Skip` nor `Take` copy the source collection. – Ben Voigt Nov 01 '14 at 03:53
  • I have reworded the question to remove the misleading statement of O(1). – mafu Nov 01 '14 at 22:50

3 Answers3

3

Skip/Take still operate in O(n) due to how LINQ has to maintain knowledge of how elements even in arrays can be changed and how the objects have to stay aware of the changes.Therefore, while the optimizations to O(1) seems trivial, it is not currently realized.

See: https://edulinq.googlecode.com/hg/posts/40-Optimization.html

Jon Skeet had an excellent blog a few years back discussing these and more in "reimplementing LINQ" (http://codeblog.jonskeet.uk/2011/01/02/reimplementing-linq-to-objects-part-23-take-skip-takewhile-skipwhile/). It's a great read.

Jason W
  • 13,026
  • 3
  • 31
  • 62
2

Since they are not optimized in 4.5 I can confidently say that they are not optimized in 4.0 either

http://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs#83ec6a20321060a1#references

BTW there is no way Take is O(1) even on arrays since it has an output of N (assuming it is iterated of course)

Stilgar
  • 22,354
  • 14
  • 64
  • 101
2

The optimisation that would be nice is if on Skip(k) the first .MoveNext just sets the internal index to k, and I thought the JITter could do this, but it seems not to, for the versions I've tried.

The code to be optimised is:

while (count > 0 && e.MoveNext()) count--;

from SkipIterator and:

      public bool MoveNext() {
            if (_index < _endIndex) {
                _index++;
                return (_index < _endIndex);
            }
            return false;
        }

from SZGenericArrayEnumerator.

(The reference source is 4.5.1, but reflector shows very similar code in 4.0.)

Mark Hurd
  • 10,665
  • 10
  • 68
  • 101
  • That's the question! *Does* the JITter do it? ;) – mafu Nov 01 '14 at 22:55
  • @mafu Short answer, no: [My LinqPad Test](http://share.linqpad.net/fhb96l.linq). It is possible that some JIT optimisations are still off but this is with the LinqPad /o+ option on. (You'll need my [`.Time` My Extension](http://share.linqpad.net/jkuqrc.linq) to replicate the test.) – Mark Hurd Nov 02 '14 at 05:02
  • 1
    Just an update: A number of answers to similar questions eventually link to the same Jon Skeet blog post as Jason W. or [this one](http://stackoverflow.com/a/7288572/256431), but more directly suggest loops cannot (or will not) be removed by optimisation; and, since .NET 4.5, `ArraySegment` becomes an option when you know you're dealing with an array. – Mark Hurd Jul 02 '15 at 02:02