I'm trying to prepare data for a graph using LINQ.

The problem I can't solve is how to calculate the "difference to previous" value.

The result I expect is:

ID= 1, Date= Now, DiffToPrev= 0;

ID= 1, Date= Now+1, DiffToPrev= 3;

ID= 1, Date= Now+2, DiffToPrev= 7;

ID= 1, Date= Now+3, DiffToPrev= -6;

etc...

Can you help me create such a query?

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication1
{
    public class MyObject
    {
        public int ID { get; set; }
        public DateTime Date { get; set; }
        public int Value { get; set; }
    }

    class Program
    {
        static void Main()
        {
            var list = new List<MyObject>
            {
                new MyObject { ID = 1, Date = DateTime.Now, Value = 5 },
                new MyObject { ID = 1, Date = DateTime.Now.AddDays(1), Value = 8 },
                new MyObject { ID = 1, Date = DateTime.Now.AddDays(2), Value = 15 },
                new MyObject { ID = 1, Date = DateTime.Now.AddDays(3), Value = 9 },
                new MyObject { ID = 1, Date = DateTime.Now.AddDays(4), Value = 12 },
                new MyObject { ID = 1, Date = DateTime.Now.AddDays(5), Value = 25 },
                new MyObject { ID = 2, Date = DateTime.Now, Value = 10 },
                new MyObject { ID = 2, Date = DateTime.Now.AddDays(1), Value = 7 },
                new MyObject { ID = 2, Date = DateTime.Now.AddDays(2), Value = 19 },
                new MyObject { ID = 2, Date = DateTime.Now.AddDays(3), Value = 12 },
                new MyObject { ID = 2, Date = DateTime.Now.AddDays(4), Value = 15 },
                new MyObject { ID = 2, Date = DateTime.Now.AddDays(5), Value = 18 }
            };

            Console.WriteLine(list);

            Console.ReadLine();
        }
    }
}
Marty

7 Answers

73

One option (for LINQ to Objects) would be to create your own LINQ operator:

// I don't like this name :(
public static IEnumerable<TResult> SelectWithPrevious<TSource, TResult>
    (this IEnumerable<TSource> source,
     Func<TSource, TSource, TResult> projection)
{
    using (var iterator = source.GetEnumerator())
    {
        if (!iterator.MoveNext())
        {
             yield break;
        }
        TSource previous = iterator.Current;
        while (iterator.MoveNext())
        {
            yield return projection(previous, iterator.Current);
            previous = iterator.Current;
        }
    }
}

This enables you to perform your projection using only a single pass of the source sequence, which is always a bonus (imagine running it over a large log file).

Note that it will project a sequence of length n into a sequence of length n-1 - you may want to prepend a "dummy" first element, for example. (Or change the method to include one.)

Here's an example of how you'd use it:

var query = list.SelectWithPrevious((prev, cur) =>
     new { ID = cur.ID, Date = cur.Date, DateDiff = (cur.Date - prev.Date).Days });

Note that this will include the final result of one ID with the first result of the next ID... you may wish to group your sequence by ID first.
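To make the grouping and "dummy first element" remarks concrete, here is a minimal sketch, assuming the goal is the question's `DiffToPrev` (difference in `Value`, with 0 for the first row of each ID). `DiffPerId` and `PreviousExtensions` are hypothetical names introduced for illustration; `SelectWithPrevious` is repeated only so that the block is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyObject
{
    public int ID { get; set; }
    public DateTime Date { get; set; }
    public int Value { get; set; }
}

public static class PreviousExtensions
{
    // Same operator as above, repeated so the sketch compiles on its own.
    public static IEnumerable<TResult> SelectWithPrevious<TSource, TResult>(
        this IEnumerable<TSource> source,
        Func<TSource, TSource, TResult> projection)
    {
        using (var iterator = source.GetEnumerator())
        {
            if (!iterator.MoveNext())
            {
                yield break;
            }
            TSource previous = iterator.Current;
            while (iterator.MoveNext())
            {
                yield return projection(previous, iterator.Current);
                previous = iterator.Current;
            }
        }
    }

    // Hypothetical helper (an assumption, not part of the original answer):
    // groups by ID, orders each group by Date, emits 0 for the first item of
    // each group, then Value minus the previous Value for the remaining items.
    public static IEnumerable<(int ID, DateTime Date, int DiffToPrev)> DiffPerId(
        this IEnumerable<MyObject> source)
    {
        return source
            .GroupBy(x => x.ID)
            .SelectMany(g =>
            {
                var ordered = g.OrderBy(x => x.Date).ToList();
                var first = ordered.Take(1)
                    .Select(x => (ID: x.ID, Date: x.Date, DiffToPrev: 0));
                var rest = ordered.SelectWithPrevious((prev, cur) =>
                    (ID: cur.ID, Date: cur.Date, DiffToPrev: cur.Value - prev.Value));
                return first.Concat(rest);
            });
    }
}
```

With the question's data, `list.DiffPerId()` should yield 0, 3, 7, -6, 3, 13 for ID 1 and start again at 0 for ID 2. Materializing each group with `ToList()` gives up the single-pass property within a group; that trade-off is a choice made for this sketch, not a requirement.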

Mitch Wheat
Jon Skeet
  • This seems like the right answer, but I can't figure out how to use it. – Marty Sep 10 '10 at 08:54
  • I guess this one would be more efficient than Branimir's answer, right? – Marty Sep 10 '10 at 09:03
  • @Martynas: It's more general than Branimir's answer, and more efficient than Felix's. – Jon Skeet Sep 10 '10 at 09:33
  • Cool :) It seems I need to deepen my LINQ knowledge more than I thought. Thank you. – Marty Sep 10 '10 at 10:09
  • Hi. I actually have data like "ID", "Color", "Date", "Value", where ID, Color and Date form the key by which the data should be grouped, but I can't figure out how to group the data and use your function. Can you help add a group-by clause to this query? – Marty Nov 19 '10 at 09:28
  • @Martynas: If you want to group by multiple values, just use `group x by new { x.Id, x.Color, x.Date }` – Jon Skeet Nov 19 '10 at 09:40
  • @John - I've updated the list so that it has 12 elements, and the query by adding `.GroupBy(g => new {g.ID, g.Date})`, but there's a problem, as I suspected. When the query starts calculating the first element with ID=2, it takes the last element of the ID=1 sequence and subtracts the value of the element with ID=2. In this particular case it's 25 - 10 = 15. Is it possible to omit this element somehow? I guess there will be nested queries in this, right? – Marty Nov 19 '10 at 18:27
  • @Martynas: I'm afraid it's hard to tell exactly what's going on at this point - it feels like it would be better served by asking a new question. – Jon Skeet Nov 19 '10 at 19:47
  • @John Thanks for Your help. Will prep a new example then :) – Marty Nov 20 '10 at 12:58
  • @John, I've put a new example here - http://stackoverflow.com/q/4237480/444149 Would be cool to hear Your Opinion on this. – Marty Nov 21 '10 at 12:18
  • 2
    That's a nice little function Jon; sweet and simple. – Doctor Jones Dec 16 '10 at 12:09
  • Added modified version to not skip first item... see my answer for code - [stackoverflow.com/q/3683105/26307521#26307521](http://stackoverflow.com/q/3683105/26307521#26307521) – Edyn Oct 10 '14 at 20:44
  • @JonSkeet Since you don't like the function name, can I suggest SelectAdjacent? – Millie Smith Feb 03 '17 at 21:14
  • @MillieSmith: I'm not sure that's any better, to be honest... but let's keep thinking :) – Jon Skeet Feb 04 '17 at 08:13
  • `ByFormer` or `ByAdjacent` :) – M.kazem Akhgary Feb 18 '18 at 20:33
  • I call mine `Scan`, based on the APL scan operator (I have many variations). Why are you doing `using` with `IEnumerator` that doesn't implement `IDisposable`? – NetMage Mar 15 '18 at 19:43
  • 1
    @NetMage: `IEnumerator<T>` *does* implement `IDisposable`, and you should always use it - just like `foreach` does implicitly. The non-generic version doesn't. – Jon Skeet Mar 15 '18 at 20:14
  • @JonSkeet Thanks - I see that now. My Reference Source spelunking is still weak. – NetMage Mar 15 '18 at 20:22
  • clean, nice, and most important is O(1) – iojancode Nov 23 '18 at 23:55
21

Use the index to get the previous object:

   var LinqList = list.Select( 
       (myObject, index) => 
          new { 
            ID = myObject.ID, 
            Date = myObject.Date, 
            Value = myObject.Value, 
            DiffToPrev = (index > 0 ? myObject.Value - list[index - 1].Value : 0)
          }
   );
Branimir
  • @Martynas: Note that this isn't very general purpose though - it only works in scenarios where you can index into the collection. – Jon Skeet Sep 10 '10 at 09:34
  • 3
    @JonSkeet The OP has a list and didn't ask for general purpose, so this a superior answer. – Jim Balter Jun 21 '13 at 04:55
  • 2
    @JimBalter: The purpose of Stack Overflow is to serve more than just the OP's question. Sometimes it makes sense to stick strictly to the bounds of what's required (although I'd at least have formatted this code to avoid scrolling), but other times I think it's helpful to give more generally-useful approaches. – Jon Skeet Jun 21 '13 at 05:47
  • I like it: nice and simple, as the LINQ is supposed to be! @JonSkeet, Your custom operator has enriched my skills, and also provided good example of operating iterator. But myself and my fellow team members would like to have the code as simple and readable as possible. – Michael G Dec 09 '19 at 15:49
  • @MichaelG: Note that this only works when you have random access to the list. It doesn't work with arbitrary `IEnumerable`. (I also personally think it takes longer to understand that this is trying to do something with adjacent elements than a method called `SelectWithPrevious`. But readability is at least somewhat subjective.) – Jon Skeet Dec 09 '19 at 15:51
  • Hi @JonSkeet, I used _BenchmarkDotNet_ and _.NET 4.8_ to compare both algorithms: _SelectWithPrevious_ and _SelectWithIndex_. Interestingly, there is indeed a favorable performance ratio of **1 vs 1.43** for SelectWithPrevious when there are only 5 elements in the list. However, starting from a list of 10 items and all the way up to lists of 1000000 :D, the ratio becomes about the same, and even better for the _SelectWithIndex_ solution: **1 vs 0.9**. I am very curious about that! – Michael G Dec 11 '19 at 09:47
  • 1
    @MichaelG: I wouldn't particularly expect a significant performance difference - but SelectWithIndex requires the source to be accessible *by index*, whereas SelectWithPrevious doesn't. – Jon Skeet Dec 11 '19 at 09:49
8

In C# 4 (.NET 4) you can use the Zip method to process two items at a time. Like this:

        var list1 = list.Take(list.Count() - 1);
        var list2 = list.Skip(1);
        var diff = list1.Zip(list2, (item1, item2) => ...);
Felix Ungman
7

Modification of Jon Skeet's answer to not skip the first item:

public static IEnumerable<TResult> SelectWithPrev<TSource, TResult>
    (this IEnumerable<TSource> source, 
    Func<TSource, TSource, bool, TResult> projection)
{
    using (var iterator = source.GetEnumerator())
    {
        var isfirst = true;
        var previous = default(TSource);
        while (iterator.MoveNext())
        {
            yield return projection(iterator.Current, previous, isfirst);
            isfirst = false;
            previous = iterator.Current;
        }
    }
}

A few key differences: it passes a third bool parameter to indicate whether the current element is the first one in the enumerable, and I also switched the order of the current/previous parameters.

Here's the matching example:

var query = list.SelectWithPrev((cur, prev, isfirst) =>
    new { 
        ID = cur.ID, 
        Date = cur.Date, 
        // prev is default(TSource) for the first element, so guard with isfirst
        DateDiff = isfirst ? 0 : (cur.Date - prev.Date).Days
    });
Edyn
3

Further to Felix Ungman's post above, below is an example of how you can achieve the data you need making use of Zip():

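        // Subtract the dates themselves: Date.Day is only the day of the month,
        // so curr.Date.Day - prev.Date.Day would be wrong across month boundaries.
        var diffs = list.Skip(1).Zip(list,
            (curr, prev) => new { CurrentID = curr.ID, PreviousID = prev.ID, CurrDate = curr.Date, PrevDate = prev.Date, DiffToPrev = (curr.Date - prev.Date).Days })
            .ToList();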

        diffs.ForEach(fe => Console.WriteLine(string.Format("Current ID: {0}, Previous ID: {1} Current Date: {2}, Previous Date: {3} Diff: {4}",
            fe.CurrentID, fe.PreviousID, fe.CurrDate, fe.PrevDate, fe.DiffToPrev)));

Basically, you are zipping two versions of the same list, but the first version (the current list) begins at the second element in the collection. Otherwise each element would be paired with itself, and the difference would always be zero.
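The pairing is easier to see on plain integers. The following is a minimal sketch, assuming only the difference in values is needed; `ZipDiffSketch` and `DiffToPrevious` are hypothetical names used for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipDiffSketch
{
    // Hypothetical helper: pairs element i+1 (from Skip(1)) with element i
    // (from the original sequence) and returns current minus previous.
    // Zip stops at the shorter sequence, so n inputs give n-1 differences.
    // Note that the source is enumerated twice.
    public static List<int> DiffToPrevious(IEnumerable<int> values)
    {
        return values.Skip(1)
                     .Zip(values, (curr, prev) => curr - prev)
                     .ToList();
    }
}
```

For the first four values of ID 1 in the question (5, 8, 15, 9), `ZipDiffSketch.DiffToPrevious` should return 3, 7, -6, which matches the expected `DiffToPrev` column apart from the leading 0.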

I hope this makes sense,

Dave

2

Yet another mod on Jon Skeet's version (thanks for your solution +1). Except this is returning an enumerable of tuples.

public static IEnumerable<Tuple<T, T>> Intermediate<T>(this IEnumerable<T> source)
{
    using (var iterator = source.GetEnumerator())
    {
        if (!iterator.MoveNext())
        {
            yield break;
        }
        T previous = iterator.Current;
        while (iterator.MoveNext())
        {
            yield return new Tuple<T, T>(previous, iterator.Current);
            previous = iterator.Current;
        }
    }
}

This does NOT return a result for the first item, because it yields the pairs between adjacent items.

Use it like this:

public class MyObject
{
    public int ID { get; set; }
    public DateTime Date { get; set; }
    public int Value { get; set; }
}

var myObjectList = new List<MyObject>();

// don't forget to order on `Date`

foreach (var deltaItem in myObjectList.Intermediate())
{
    // Tuple<T, T> exposes Item1 (previous) and Item2 (current)
    var delta = deltaItem.Item2.Date - deltaItem.Item1.Date;
    // ..
}

OR

var newList = myObjectList.Intermediate().Select(item => item.Item2.Date - item.Item1.Date);

OR (like Jon shows)

var newList = myObjectList.Intermediate().Select(item => new 
{ 
    ID = item.Item2.ID, 
    Date = item.Item2.Date, 
    DateDiff = (item.Item2.Date - item.Item1.Date).Days
});
Jeroen van Langen
2

Here is the refactored code with C# 7.2 using the readonly struct and the ValueTuple (also struct).

I use Zip() to create (CurrentID, PreviousID, CurrDate, PrevDate, DiffToPrev) tuple of 5 members. It is easily iterated with foreach:

foreach(var (CurrentID, PreviousID, CurrDate, PrevDate, DiffToPrev) in diffs)

The full code:

public readonly struct S
{
    public int ID { get; }
    public DateTime Date { get; }
    public int Value { get; }

    public S(S other) => this = other;

    public S(int id, DateTime date, int value)
    {
        ID = id;
        Date = date;
        Value = value;
    }

    public static void DumpDiffs(IEnumerable<S> list)
    {
        // Zip (or compare) list with offset 1 - Skip(1) - vs the original list
        // this way the items compared are i[j+1] vs i[j]
        // Note: the resulting enumeration will include list.Count-1 items
        var diffs = list.Skip(1)
                        .Zip(list, (curr, prev) => 
                                    (CurrentID: curr.ID, PreviousID: prev.ID, 
                                    CurrDate: curr.Date, PrevDate: prev.Date, 
                                    // subtract the dates, not the day-of-month values
                                    DiffToPrev: (curr.Date - prev.Date).Days));

        foreach(var (CurrentID, PreviousID, CurrDate, PrevDate, DiffToPrev) in diffs)
            Console.WriteLine($"Current ID: {CurrentID}, Previous ID: {PreviousID} " +
                              $"Current Date: {CurrDate}, Previous Date: {PrevDate} " +
                              $"Diff: {DiffToPrev}");
    }
}

Unit test output:

// the list:

// ID   Date
// ---------------
// 233  17-Feb-19
// 122  31-Mar-19
// 412  03-Mar-19
// 340  05-May-19
// 920  15-May-19

// CurrentID PreviousID CurrentDate PreviousDate Diff (days)
// ---------------------------------------------------------
//    122       233     31-Mar-19   17-Feb-19      42
//    412       122     03-Mar-19   31-Mar-19      -28
//    340       412     05-May-19   03-Mar-19      63
//    920       340     15-May-19   05-May-19      10

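Note: a small struct (especially a readonly one) avoids a heap allocation per item and can avoid defensive copies, so it may perform better than a class here; measure before relying on it.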

Thanks @FelixUngman and @DavidHuxtable for their Zip() ideas!

Michael G