1

I have a read-only list of DataPoint objects in which some entries have a Value and others are null. I would like to produce a new list of DataPoint objects in which every null entry is replaced by the closest preceding non-null value (to its left). If no non-null value precedes a null entry, it defaults to 0.

In the example below, the first two nulls become 0 because no non-null value precedes them, and the last two nulls become 5 because 5 is the closest non-null value to their left.

    public class DataPoint
    {
        public DataPoint(int inputValue)
        {
            this.Value = inputValue;
        }
        
        public int Value {get;}
    }

Input:

    List<DataPoint> inputList = new List<DataPoint>
            {null, 
             null, 
             new DataPoint(1), 
             new DataPoint(2), 
             new DataPoint(3), 
             null, 
             null, 
             new DataPoint(4), 
             new DataPoint(5), 
             null, 
             null};

Expected Output:

    foreach (var item in outputList)
    {
        Console.WriteLine(item.Value);
    }

    {0, 0, 1, 2, 3, 3, 3, 4, 5, 5, 5}

Can I get some ideas on how to achieve this in an elegant way with LINQ? Thanks.

UPDATE: To avoid ambiguity, I've updated inputList to contain null entries instead of DataPoint instances with a null value.

buntuoba
  • 1
    The last two nulls converting to 5 doesn't follow your rules, they have no previous non-null value. edit - Wait I might have read that wrong. Yeah. Never mind. – asawyer Oct 28 '20 at 19:17
  • You should add what you tried so far. – Omar Abdel Bari Oct 28 '20 at 19:18
  • Why does it have to be done using linq? It doesn't seem like a scenario to use it – Omar Abdel Bari Oct 28 '20 at 19:24
  • ^ +1, why do you want LINQ? Do you need deferred execution or are you just curious on how to bend LINQ to this use case? – V0ldek Oct 28 '20 at 19:43
  • Can you change your input and output to be valid C# code? – NetMage Oct 28 '20 at 20:20
  • 1
    There is no way to elegantly achieve this in LINQ, and by this I mean existing built-in LINQ methods or syntax. – Lasse V. Karlsen Oct 28 '20 at 20:52
  • @LasseV.Karlsen I'd tend to agree. I was able to get the correct output but it requires extra inner iterations per item. Something like worst case O(n*n). If I actually had this requirement I'd go with NetMage's answer. – asawyer Oct 28 '20 at 21:07
  • @OmarAbdelBari, first, thanks for the edit suggestion. I asked for a LINQ solution simply because I want it to be consistent with the rest of the code in the same section; I just wondered whether others might have better ideas for handling this in LINQ than my brute-force approach, which is neither elegant nor efficient. – buntuoba Oct 28 '20 at 21:39
  • @NetMage, thanks for your input. Yes, I originally wrote it that way just for illustration without thinking about validity; it could be improved. – buntuoba Oct 28 '20 at 21:43
  • @LasseV.Karlsen Depending on how you feel about elegance, I think the outside state variable isn't too bad (though I would use my `Scan` method). – NetMage Oct 29 '20 at 00:27

3 Answers

2

Use a helper extension method: a variation of my LINQ implementation of the APL scan operator (like Aggregate, but it returns the intermediate results) that uses a seed function to start the result stream:

    // Extension methods must be declared in a non-generic static class.
    // seedFn produces the first result from the first item: TRes seedFn(T firstValue)
    // combineFn produces each later result: TRes combineFn(TRes prevResult, T curValue)
    public static IEnumerable<TRes> Scan<T, TRes>(this IEnumerable<T> items, Func<T, TRes> seedFn, Func<TRes, T, TRes> combineFn) {
        using (var itemsEnum = items.GetEnumerator()) {
            if (itemsEnum.MoveNext()) {
                var prev = seedFn(itemsEnum.Current);

                while (itemsEnum.MoveNext()) {
                    yield return prev;
                    prev = combineFn(prev, itemsEnum.Current);
                }
                yield return prev;
            }
        }
    }

You can scan along the initial List<DataPoint> and return the previous result for any null, seeding with the first item, or with a DataPoint of 0 if the first item is null:

    var ans = inputList.Scan(firstDP => firstDP ?? new DataPoint(0), (prevRes, curDP) => curDP ?? prevRes).ToList();

NOTE: If you don't want to use a helper method, and are willing to abuse LINQ a little by having outside state (e.g. a helper variable), you can simply do:

    var prevNonNull = new DataPoint(0);
    var ans2 = inputList.Select(n => prevNonNull = n ?? prevNonNull).ToList();
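
One caveat with the outside-state version is worth a sketch: it is only safe because of the `.ToList()`. The following is an illustration under assumptions, not part of the original answer. `FillForward`, `Demo`, and `Sample` are hypothetical names, and `DataPoint` is repeated from the question so the block stands alone. It shows that a deferred query over a captured variable carries state from one enumeration into the next, whereas an iterator method keeps its state local.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class DataPoint
{
    public DataPoint(int inputValue) { Value = inputValue; }
    public int Value { get; }
}

public static class FillExtensions
{
    // Hypothetical helper: the "previous" state lives inside the iterator,
    // so every enumeration starts again from defaultPoint.
    public static IEnumerable<DataPoint> FillForward(
        this IEnumerable<DataPoint> items, DataPoint defaultPoint)
    {
        var prev = defaultPoint;
        foreach (var item in items)
        {
            prev = item ?? prev;   // keep the last non-null item seen so far
            yield return prev;
        }
    }
}

public static class Demo
{
    static List<DataPoint> Sample() =>
        new List<DataPoint> { null, new DataPoint(1), null };

    // Captured-variable query WITHOUT ToList(), enumerated twice.
    public static (List<int> first, List<int> second) CapturedState()
    {
        var prevNonNull = new DataPoint(0);
        var query = Sample().Select(n => prevNonNull = n ?? prevNonNull);
        var first = query.Select(d => d.Value).ToList();   // 0, 1, 1
        var second = query.Select(d => d.Value).ToList();  // 1, 1, 1 (state left over from the first pass)
        return (first, second);
    }

    // Iterator-based query, enumerated twice: both passes agree.
    public static (List<int> first, List<int> second) IteratorState()
    {
        var query = Sample().FillForward(new DataPoint(0));
        var first = query.Select(d => d.Value).ToList();   // 0, 1, 1
        var second = query.Select(d => d.Value).ToList();  // 0, 1, 1
        return (first, second);
    }
}
```

As long as the query is materialized once with `.ToList()`, as in the snippet above, the two approaches give the same result.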
NetMage
  • Your `ScanPairWithHelper` is very nice and perfect for this sort of application. Thumbs up. The `.Take(i)` in my answer is going to scale terribly. – asawyer Oct 28 '20 at 20:45
  • @asawyer Simplified my answer :) After thinking about it, I realized that a helper was overkill - the standard `Scan` that carries the previous result along (much like `Aggregate`) is all that was needed. – NetMage Oct 29 '20 at 00:26
  • 1
    We all need more APL in our lives. – asawyer Oct 29 '20 at 01:24
  • @NetMage, I was initially also looking for a way to keep track of the previous item (didn't realize it can be simple as your answer 2, thanks), and could you help explain why that isn't your first choice? Performance degradation? – buntuoba Oct 29 '20 at 04:09
  • 1
    @buntuoba it's considered "a bad practice" to have linq methods which mutate state, it's kind of against linq design philosophy. But no performance degradation or anything like that here. – Evk Oct 29 '20 at 14:39
  • @Evk, seems like [this](https://stackoverflow.com/a/807816/7033323) is also a 'bad practice', right? And Eric answered [here](https://stackoverflow.com/a/47398235/7033323) as well probably for the same regard – buntuoba Oct 29 '20 at 16:02
  • 1
    @buntuoba yeah, the example from the first link is really a bad practice, without quotes. I agree with Eric: if you want to mutate state, use a loop. It's not as if loops suddenly became obsolete when LINQ was introduced. However, your example is not that straightforward, since you want to produce a new collection, in the spirit of LINQ; it just so happens that the selection algorithm needs a temporary variable (at least in the most straightforward approach). – Evk Oct 29 '20 at 16:26
1

You could try something like this:

    static void Main(string[] args)
    {
        List<int?> inputList = new List<int?>() { null, null, 1, 2, 3, null, null, 4, 5, null, null };
        // Enumerable.Range takes (start, count), so pass Count to cover every index.
        var result = Enumerable.Range(0, inputList.Count)
            .Select(i => inputList[i] ?? GetPrevious(i))
            .ToList();

        // Walks left until it finds a non-null value; returns 0 if there is none.
        int GetPrevious(int index)
            => index == 0 ? 0 : inputList[index - 1] ?? GetPrevious(index - 1);
    }
Connell.O'Donnell
1

Assuming that the actual property type of DataPoint.Value is int? instead of int, something like this should work:

    var outputList = inputList.Select((l,i)=> new DataPoint()
    {
        Value = l?.Value ?? inputList.Take(i).LastOrDefault(t=>t?.Value.HasValue ?? false)?.Value ?? 0
    });

I haven't measured it, but I'm sure the performance characteristics are terrible: Take(i) is rescanned for every null element, so the worst case is something like O(n*n).
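
For comparison, here is a minimal single-pass sketch that uses only the built-in `Aggregate`. It is one possible alternative, not a measured or recommended one. `AggregateFill` and its `FillForward` method are hypothetical names, and the input is assumed to be a sequence of `int?` to keep the block short. The accumulator is the output list itself, so each element only needs to look at the last value appended so far.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AggregateFill
{
    // Hypothetical helper: folds the input into a new list in one pass.
    // A null entry repeats the last value already in the list, or 0 if the list is still empty.
    public static List<int> FillForward(IEnumerable<int?> input) =>
        input.Aggregate(new List<int>(), (acc, cur) =>
        {
            acc.Add(cur ?? (acc.Count > 0 ? acc[acc.Count - 1] : 0));
            return acc;
        });
}
```

For a list of DataPoint objects with an int Value, the input could be projected first with `inputList.Select(dp => dp?.Value)`, which yields `int?` values. Note that the lambda still mutates its accumulator, so this trades the quadratic rescan for a small amount of local state.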

Full LINQPad script:

    void Main()
    {
        var inputList = new List<DataPoint>()
        {
            null, null, 1, 2, 3, null, null, 4, 5, null, null
        };
        var outputList = inputList.Select((l,i)=> new DataPoint()
        {
            Value = l?.Value ?? inputList.Take(i).LastOrDefault(t=>t?.Value.HasValue ?? false)?.Value ?? 0
        });
        outputList.Dump();
    }

    public class DataPoint
    {
        public int? Value { get; set; }
        //added to make building the inputList easier
        public static implicit operator DataPoint(int? value) => 
            new DataPoint(){ Value = value };
    }

Output:

    IEnumerable<DataPoint> (11 items)
    0
    0
    1
    2
    3
    3
    3
    4
    5
    5
    5

If DataPoint.Value is actually int and the inputList contains nulls, not DataPoint instances with null values, it needs a small tweak:

    var outputList = inputList.Select((l,i)=> new DataPoint()
    {
        Value = l?.Value ?? inputList.Take(i).LastOrDefault(t=>t!=null)?.Value ?? 0
    });

    ...
    // Value is int here, so unwrap the nullable before assigning.
    public static implicit operator DataPoint(int? value) 
        => value.HasValue ? new DataPoint(){ Value = value.Value } : (DataPoint)null;
    ...
asawyer