13

There are times when it's helpful to check a non-repeatable IEnumerable to see whether or not it's empty. LINQ's Any doesn't work well for this, since it consumes the first element of the sequence, e.g.

if(input.Any())
{
    foreach(int i in input)
    {
        // Will miss the first element for non-repeatable sequences!
    }
}

(Note: I'm aware that there's no need to do the check in this case - it's just an example! The real-world example is performing a Zip against a right-hand IEnumerable that can potentially be empty. If it's empty, I want the result to be the left-hand IEnumerable as-is.)
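To make the failure concrete, here is a minimal sketch. `Drain` and `CheckThenEnumerate` are hypothetical names, and the queue is only an assumed stand-in for a stream or user input that can be read once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyConsumesFirstDemo
{
    // Hypothetical one-shot source: every element that is yielded is
    // removed from the queue, so it cannot be read a second time.
    public static IEnumerable<int> Drain(Queue<int> queue)
    {
        while (queue.Count > 0)
        {
            yield return queue.Dequeue();
        }
    }

    // Mirrors the check-then-loop pattern above and records what the loop sees.
    public static List<int> CheckThenEnumerate(IEnumerable<int> input)
    {
        var seen = new List<int>();
        if (input.Any())   // pulls the first element from the source and discards it
        {
            foreach (int i in input)
            {
                seen.Add(i);
            }
        }
        return seen;
    }
}
```

Calling `CheckThenEnumerate(Drain(queue))` with a queue holding 1, 2, 3 leaves only 2 and 3 for the loop, because `Any` has already dequeued the 1.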

I've come up with a potential solution that looks like this:

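// Yields the elements of an enumerator that the caller has already advanced
// to its first element, and disposes it once enumeration ends.
// The result can be enumerated only once, since every pass shares e.
private static IEnumerable<T> NullifyIfEmptyHelper<T>(IEnumerator<T> e)
{
    using(e)
    {
        do
        {
            yield return e.Current;
        } while (e.MoveNext());
    }
}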

public static IEnumerable<T> NullifyIfEmpty<T>(this IEnumerable<T> source)
{
    IEnumerator<T> e = source.GetEnumerator();
    if(e.MoveNext())
    {
        return NullifyIfEmptyHelper(e);
    }
    else
    {
        e.Dispose();
        return null;
    }
}

This can then be used as follows:

input = input.NullifyIfEmpty();
if(input != null)
{
    foreach(int i in input)
    {
        // Will include the first element.
    }
}
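For the Zip case mentioned in the note, the same idea might be wrapped up as follows. This is only a sketch: `ZipOrLeft` and `ZipOrLeftSketch` are hypothetical names, and the two `NullifyIfEmpty` methods are repeated from the code above so that the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipOrLeftSketch
{
    // Repeated from the question: yields from an already-advanced enumerator.
    private static IEnumerable<T> NullifyIfEmptyHelper<T>(IEnumerator<T> e)
    {
        using (e)
        {
            do
            {
                yield return e.Current;
            } while (e.MoveNext());
        }
    }

    // Repeated from the question: returns null for an empty sequence.
    public static IEnumerable<T> NullifyIfEmpty<T>(this IEnumerable<T> source)
    {
        IEnumerator<T> e = source.GetEnumerator();
        if (e.MoveNext())
        {
            return NullifyIfEmptyHelper(e);
        }
        e.Dispose();
        return null;
    }

    // Hypothetical helper: zips left with right, or returns left unchanged
    // when right turns out to be empty.
    public static IEnumerable<TLeft> ZipOrLeft<TLeft, TRight>(
        this IEnumerable<TLeft> left,
        IEnumerable<TRight> right,
        Func<TLeft, TRight, TLeft> selector)
    {
        IEnumerable<TRight> nonEmptyRight = right.NullifyIfEmpty();
        return nonEmptyRight == null ? left : left.Zip(nonEmptyRight, selector);
    }
}
```

Note that the emptiness check runs as soon as `ZipOrLeft` is called, so the first element of `right` is read eagerly; everything after that stays lazy.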

I have two questions about this:

1) Is this a reasonable thing to do? Is it likely to be problematic from a performance point of view? (I'd guess not, but worth asking.)

2) Is there a better way of achieving the same end goal?


EDIT #1:

Here's an example of a non-repeatable IEnumerable, to clarify:

private static IEnumerable<int> ReadNumbers()
{
    for(;;)
    {
        int i;
        if (int.TryParse(Console.ReadLine(), out i) && i != -1)
        {
            yield return i;
        }
        else
        {
            yield break;
        }
    }
}

Basically, things which come from user input or a stream, etc.

EDIT #2:

I need to clarify that I'm looking for a solution that preserves the lazy nature of the IEnumerable - converting it to a list or an array can be an answer in certain circumstances, but isn't what I'm after here. (The real-world reason is that the number of items in the IEnumerable may be huge in my case, and it's important not to store them all in memory at once.)

Stuart Golodetz
  • 1
    That's a clever approach. I'll be interested to hear others' feedback on it, but to me it seems like a fairly elegant way to solve your problem. – Joe White Feb 08 '12 at 14:26
  • Not sure what you mean by "non-repeatable ienumerable". Do you have an example? The foreach in your first code sample should not miss the first element, just read it for a second time. I don't understand in what situation that wouldn't be possible. – Meta-Knight Feb 08 '12 at 14:36
  • @Meta-Knight: I've added an example, hope that helps. – Stuart Golodetz Feb 08 '12 at 14:41
  • @M.Babcock: `Count` has to consume the entire sequence to work (otherwise how do you know how many elements there are?). Also wouldn't work in the case of `ReadNumbers` even if the sequence was repeatable, since you'd have to wait until the user had finished typing in numbers before you could do the check. – Stuart Golodetz Feb 08 '12 at 14:44
  • @Meta-Knight: I mentioned my real-world use case in the question actually (see the comment in brackets). – Stuart Golodetz Feb 08 '12 at 14:49
  • 3
    `IEnumerator<T>` inherits from `IDisposable`, but your code isn't disposing `e`. This can be a significant correctness issue for some enumerators. – Bradley Grainger Feb 08 '12 at 15:47
  • @BradleyGrainger: Good spot, hopefully fixed now. – Stuart Golodetz Feb 08 '12 at 15:50
  • 5
    @M.Babcock: You have a jar entirely full of pennies. Someone asks you "are there any pennies in that jar?". **Do you count them and then compare the answer to zero, or do you see if there is at least one penny in the jar?** – Eric Lippert Feb 08 '12 at 16:16
  • @EricLippert - Point taken. I wasn't sure if the implementation of `Count` would handle his case since every other method I know of won't work. – M.Babcock Feb 08 '12 at 16:21
  • 1
    @StuartGolodetz: It would still need to be disposed in the case that you _don't_ call `NullifyIfEmpty`. – Bradley Grainger Feb 08 '12 at 20:34
  • @Bradley: D'oh :) Thanks again - better now? – Stuart Golodetz Feb 09 '12 at 01:42
  • @Bradley: Thinking about it, I'm slightly concerned about the case when you don't fully enumerate the `IEnumerable` actually - it occurs to me that this works fine for full enumeration but might not `Dispose` properly for partial enumeration. – Stuart Golodetz Feb 09 '12 at 09:29
  • @StuartGolodetz: The C# compiler takes care of this when it implements the state machine for the `NullifyIfEmptyHelper` method. (It creates an `IEnumerable` class with a `GetEnumerator` method that returns an `IEnumerator` object. The `Dispose` method on that object will run the `finally` clause generated from the `using` statement in your code. See http://csharpindepth.com/Articles/Chapter6/IteratorBlockImplementation.aspx for much greater detail.) – Bradley Grainger Feb 09 '12 at 15:24
  • @StuartGolodetz: Regarding partial enumeration, if you're thinking of something like "breaking out of a foreach loop", the C# compiler also generates the right code there. When the flow of execution leaves a `foreach` loop (via `break` or by reaching the end of the enumerated sequence), the enumerator is automatically disposed. – Bradley Grainger Feb 09 '12 at 15:30
  • @Bradley: I guess I was partly concerned about the case where you don't use the enumerable at all, i.e. you just write `input = input.NullifyIfEmpty();` and then never use `input` again. – Stuart Golodetz Feb 09 '12 at 16:32
  • @EricLippert - See [I learned something](http://stackoverflow.com/questions/9287454/check-the-existence-of-a-record-before-inserting-a-new-record/9287497#comment11712220_9287497) (hopefully it still applies). – M.Babcock Feb 15 '12 at 04:18
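The point made in the comments about partial enumeration can be checked with a small sketch. `DisposeOnBreakDemo`, `Numbers`, `FirstOnly` and the `FinallyRan` flag are hypothetical names used only for illustration; the `finally` block stands in for the cleanup that a `using` statement inside an iterator would generate.

```csharp
using System;
using System.Collections.Generic;

public static class DisposeOnBreakDemo
{
    // Set by the iterator's finally block so the caller can observe cleanup.
    public static bool FinallyRan;

    public static IEnumerable<int> Numbers()
    {
        try
        {
            yield return 1;
            yield return 2;
            yield return 3;
        }
        finally
        {
            // Runs when the enumerator is disposed, even if the
            // sequence was not read to the end.
            FinallyRan = true;
        }
    }

    // Reads only the first element, then leaves the loop early.
    public static int FirstOnly()
    {
        FinallyRan = false;
        int first = 0;
        foreach (int i in Numbers())
        {
            first = i;
            break;   // foreach disposes the enumerator on the way out
        }
        return first;
    }
}
```

After `FirstOnly()` returns, `FinallyRan` is `true`, which shows that breaking out of the loop still triggers the iterator's cleanup.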

3 Answers

3

You could also just read the first element and, if it isn't the type's default value (null for reference and nullable types, 0 for int), concatenate it with the rest of your input:

var input = ReadNumbers();
var first = input.FirstOrDefault();
if (first != default(int)) //Assumes input doesn't contain zeroes
{
    var firstAsArray = new[] {first};
    foreach (int i in firstAsArray.Concat(input))
    {
        // Will include the first element.
        Console.WriteLine(i);
    }
}

For a normal enumerable, the first element would appear twice, but for a non-repeatable enumerable it would work, unless iterating twice is not allowed. Also, if you had an enumerator like this:

private readonly static List<int?> Source = new List<int?>(){1,2,3,4,5,6};

private static IEnumerable<int?> ReadNumbers()
{
    while (Source.Count > 0) {
        yield return Source.ElementAt(0);
        Source.RemoveAt(0);
    }
}

Then it would print: 1, 1, 2, 3, 4, 5, 6. The reason is that the first element is consumed only AFTER it has been returned, so the first enumerator, which stops at the first element, never gets the chance to consume it. That would be a badly written enumerator, though. If the element is consumed first and then returned...

while (Source.Count > 0) {
    var returnElement = Source.ElementAt(0);
    Source.RemoveAt(0);
    yield return returnElement;
}

...you get the expected output of: 1, 2, 3, 4, 5, 6.

Meta-Knight
  • 4
    And if the first element of the input == 0? – Oded Feb 08 '12 at 15:09
  • 1
    Yeah, this would only really work for reference types. However, it does suggest a more concise version of the OP's solution... – Dan Puzey Feb 08 '12 at 15:11
  • @Oded: If it's a value type, and the default value can appear in the sequence, you could use a nullable type for the enumerable. – Meta-Knight Feb 08 '12 at 15:53
  • In this example it would be as simple as changing my condition to `first != null`, and changing ReadNumbers' return type to `IEnumerable<int?>` – Meta-Knight Feb 08 '12 at 15:56
  • More importantly, you're assuming for the non-repeatable enumerable that reading the first element, then reading the whole enumerable, is equivalent to reading the whole enumerable. This is true for the one in the question, but is usually not true. –  Feb 08 '12 at 16:00
  • @hvd: See my edit. Any comments, or examples where this pattern would fail with a correctly implemented non-repeatable enumerable? – Meta-Knight Feb 08 '12 at 17:35
  • @Meta-Knight For any normal enumerable (enumerables normally should be repeatable), you'll include the first element twice. For some other non-repeatable enumerables, trying to enumerate them a second time will cause an exception to be thrown. –  Feb 08 '12 at 18:07
  • @hvd: OK, I get your point. Yes, it would work only for non-repeatable enumerables. – Meta-Knight Feb 08 '12 at 18:39
2

You don't need to complicate it. A regular foreach loop with a single extra bool variable will do the trick.

If you have

if(input.Any())
{
    A
    foreach(int i in input)
    {
        B
    }
    C
}

and you don't want to read input twice, you can change this to

bool seenItem = false;
foreach(int i in input)
{
    if (!seenItem)
    {
        seenItem = true;
        A
    }
    B
}
if (seenItem)
{
    C
}

Depending on what B does, you may be able to avoid the seenItem variable entirely.
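As a concrete instance of the pattern, here is a minimal sketch in which A adds a header, B adds one line per item, and C adds a footer. `HeaderFooterSketch`, `Render` and the "begin"/"end" markers are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class HeaderFooterSketch
{
    // Reads input exactly once; the header and footer are emitted
    // only when at least one item was seen.
    public static List<string> Render(IEnumerable<int> input)
    {
        var lines = new List<string>();
        bool seenItem = false;
        foreach (int i in input)
        {
            if (!seenItem)
            {
                seenItem = true;
                lines.Add("begin");      // A: runs once, before the first item
            }
            lines.Add(i.ToString());     // B: runs for every item
        }
        if (seenItem)
        {
            lines.Add("end");            // C: runs once, after the last item
        }
        return lines;
    }
}
```

An empty input produces an empty list, and `input` is never enumerated more than once, so this works for one-shot sequences too.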

In your case, Enumerable.Zip is a fairly basic function that is easily reimplemented, and your replacement function can use something similar to the above.

Edit: You might consider

public static class MyEnumerableExtensions
{
    public static IEnumerable<TFirst> NotReallyZip<TFirst, TSecond>(this IEnumerable<TFirst> first, IEnumerable<TSecond> second, Func<TFirst, TSecond, TFirst> resultSelector)
    {
        using (var firstEnumerator = first.GetEnumerator())
        using (var secondEnumerator = second.GetEnumerator())
        {
            if (secondEnumerator.MoveNext())
            {
                if (firstEnumerator.MoveNext())
                {
                    do yield return resultSelector(firstEnumerator.Current, secondEnumerator.Current);
                    while (firstEnumerator.MoveNext() && secondEnumerator.MoveNext());
                }
            }
            else
            {
                while (firstEnumerator.MoveNext())
                    yield return firstEnumerator.Current;
            }
        }
    }
}
  • +1 this is quite helpful. I did actually write a version of `Zip` that did something like this when I was trying out ideas, but I ended up thinking that it might be better to come up with a more generally reusable solution. – Stuart Golodetz Feb 08 '12 at 18:54
  • 1
    @StuartGolodetz In that case, I don't see how you can come up with a better general solution than what you have already. :) –  Feb 08 '12 at 18:59
1

This is not an efficient solution if the enumeration is long, but it is an easy one:

var list = input.ToList();
if (list.Count != 0) {
    foreach (var item in list) {
        // ...
    }
}
Olivier Jacot-Descombes
  • It's unfortunately not a lazy solution - if you use it with `ReadNumbers`, it will wait until the user has stopped entering numbers before doing anything. +1 though all the same because I didn't make that clear in the question. – Stuart Golodetz Feb 08 '12 at 16:05