10
  1. I have a series of objects with from and to dates.
  2. Using something like:

    IList<DateTime> dates =
        this.DateRanges
            .SelectMany(r => new [] { r.From, r.To })
            .Distinct()
            .OrderBy(d => d)
            .ToList();
    

    I can get all dates without any of them being duplicated. Ranges may fully overlap, partially overlap (upper or lower overlapping), touch or they may not overlap at all.

  3. Now I need to convert this list to a different one so that each consecutive date pair forms a new generated DateTime instance right in the middle of pair

    D1      D2      D3              D4  D5
        G1      G2          G3        G4
    

    Where Dn are my distinct dates from the list and Gm dates are ones I'd like to generate in the middle of them.

Question

How do I convert an ordered list of individual dates to pairs so that I get pairs as shown in the following example? I would like to form these using LINQ instead of for loop which can accomplish the same thing. Using LINQ may result in and more efficient code due to delayed expression tree execution.

Additional explanation using a real-world example

Suppose this is my example of such ranges:

D1             D2     D3     D4   D5     D6          D11    D12
|--------------|      |------|    |------|           |------|
       D7                         D8
       |--------------------------|
D9                                              D10
|-----------------------------------------------|

First step of getting distinct dates would result in these dates:

D1     D7      D2     D3     D4   D5     D6     D10  D11    D12

D9 and D8 would fall off because they're duplicates.

Next step is to form pairs (I don't know how to do this using LINQ):

D1-D7, D7-D2, D2-D3, D3-D4, D4-D5, D5-D6, D6-D10, (D10-D11), D11-D12

Last step has to calculate a date for each pair using:

Dnew = Dfrom + (Dto - Dfrom)/2

Empty ranges issue

Range D10-D11 should preferably be omitted. But if omitting it results in over-complicates code it can be kept and excluded with a separate check afterwards. But if it can be excluded initially then that's what should be done. So if you also provide information of how to form pairs that exclude empty ranges, you're welcome to add that info as well.

Robert Koritnik
  • 103,639
  • 52
  • 277
  • 404
  • AAhhh I can't comprehend! What happen to dates that aren't consecutive? 1-Jan 5-Jan and 8-Jan 25-Jan... I think you should try to explain what problem you are trying to solve... As it's, it's quite complex because you have overlapping ranges and a range could be a subset of another range. – xanatos Oct 03 '11 at 12:23
  • Nitpick: instead of `....OrderBy(d => d).ToList()` I'd use `....ToList(); dates.Sort();` – sehe Oct 03 '11 at 12:25
  • @Robert (with corrections) How will you handle disconnected sets of dates? Your example has fully overlapping sets of date. There isn't a single day that isn't "covered" between D1 and D10. Try for example adding a D11-D12 to the right of D10, with D11 > D10. – xanatos Oct 03 '11 at 12:35
  • @xanatos: Very good point. I changed my post a bit. – Robert Koritnik Oct 03 '11 at 13:17
  • @KirkBroadhurst I've obviously not stated my case too good, because that's not what I want. I want to form pairs from consecutive items... Calculating middle is too simple to ask here, don't you think? – Robert Koritnik Oct 03 '11 at 13:25
  • @RobertKoritnik have you seen some of the questions asked here? ;) sorry for assuming, I'm happy to be corrected. – Kirk Broadhurst Oct 03 '11 at 13:31
  • @KirkBroadhurst :))) You're right. Some of them really are written by people that took them longer to write the question than to search and find the answer on the net... – Robert Koritnik Oct 04 '11 at 05:24
  • My before/after idea was on the right track, but incomplete - I've now implemented something that works! – AakashM Oct 04 '11 at 09:24

4 Answers4

5

You can use Zip():

var middleDates = dates.Zip(dates.Skip(1), 
                            (a, b) => (a.AddTicks((b - a).Ticks / 2)))
                       .ToList();
BrokenGlass
  • 158,293
  • 28
  • 286
  • 335
  • This is very clever and does exactly what I need. And when not calling `ToList()` I can delay looping ranges until `Skip`. This is very clever. Could you also take a look at the first part of non overlapping ranges because it would be preferable to exclude empty ranges (between D10 and D11) – Robert Koritnik Oct 03 '11 at 13:21
  • @RobertKoritnik: Yes - you could write a custom extension method (or regular iterative code) that only enumerates once - Zip in this case will always enumerate twice - I can take a stab in a little while. Also what exactly do you mean by "empty" date ranges? – BrokenGlass Oct 03 '11 at 13:32
  • Bummer. `Zip` is only available in .net 4+... I need something in 3.5 so I found out [something useful](http://stackoverflow.com/questions/7110762/how-do-i-create-a-single-list-of-object-pairs-from-two-lists-in-c/7110841#7110841)... – Robert Koritnik Oct 03 '11 at 16:04
  • @Robert If you really need the `Zip`, you can look at the source code of Mono :-) – xanatos Oct 04 '11 at 07:43
  • @xanatos I looked at it in .Net directly (Reflector and such...). But I ended up using my own solution that does it all in one go. Check the **final solution** answer and propose improvements if you have any. – Robert Koritnik Oct 04 '11 at 11:03
  • @Robert There is an important point in the fact that I pointed you to Mono instead of telling you to reflect the .NET. You can't simply copy code from the .NET, and simply looking at it "taints" your mind. Looking at the source of Mono and copying its code doesn't taint your code very much (because the license is quite permissive) and doesn't taint very much your mind. – xanatos Oct 04 '11 at 11:39
3

Final solution

Based on the idea of @DavidB and interesting idea by @AakashM's original answer I've come up with my own solution that extracts ranges from a set of dates (while also omitting empty ranges) and calculating range middle dates.

If you have any improvement suggestions or comments on this solution you're warmly welcome to comment on it. Anyway this is the final code I'm using now (inline comments explain its functionality):

// counts range overlaps
int counter = 0;

// saves previous date to calculate midrange date
DateTime left = DateTime.Now;

// get mid range dates
IList<DateTime> dates = this.DateRanges

    // select range starts and ends
    .SelectMany(r => new[] {
        new {
            Date = r.From,
            Counter = 1
        },
        new {
            Date = r.To,
            Counter = -1
        }
    })

    // order dates because they come out mixed
    .OrderBy(o => o.Date)

    // convert dates to ranges; when non-empty & non-zero wide get mid date
    .Select(o => {

        // calculate middle date if range isn't empty and not zero wide
        DateTime? result = null;
        if ((counter != 0) && (left != o.Date))
        {
            result = o.Date.AddTicks(new DateTime((o.Date.Ticks - left.Ticks) / 2).Ticks);
        }

        // prepare for next date range
        left = o.Date;
        counter += o.Counter;

        // return middle date when applicable otherwise null
        return result;
    })

    // exclude empty and zero width ranges
    .Where(d => d.HasValue)

    // collect non nullable dates
    .Select(d => d.Value)
    .ToList();
Robert Koritnik
  • 103,639
  • 52
  • 277
  • 404
1

Next step is to form pairs (I don't know how to do this using LINQ):

        List<DateTime> edges = bucketOfDates
            .Distinct()
            .OrderBy(date => date)
            .ToList();

        DateTime rangeStart = edges.First(); //ps - don't forget to handle empty
        List<DateRange> ranges = edges
            .Skip(1)
            .Select(rangeEnd =>
            {
              DateRange dr = new DateRange(rangeStart, rangeEnd);
              rangeStart = rangeEnd;
              return dr;
            })
            .ToList();
Amy B
  • 108,202
  • 21
  • 135
  • 185
  • +1 This is a nice way to combine LINQ with some regular `foreach` looping code. But it's simple which is great. Thanks for this. It may as well happen, that I'll do it differently and combine some steps together to make the code perform better. – Robert Koritnik Oct 04 '11 at 05:27
  • I added my own answer with my final solution that partly uses your idea of combining a set of dates into ranges. – Robert Koritnik Oct 04 '11 at 11:05
1

OK my previous idea wouldn't work. But this one will. And it's O(n) on the number of inputs.

To solve the D10-D11 problem, we need the process to be aware of how many of the original intervals are 'in effect' at any given date. We can then iterate throw the transition points in order, and emit midpoints whenever we are between two transitions and the current state is ON. Here is complete code.

Data classes:

// The input type
class DateRange
{
    public DateTime From { get; set; }
    public DateTime To { get; set; }
}

// Captures details of a transition point
// along with how many ranges start and end at this point
class TransitionWithCounts
{
    public DateTime DateTime { get; set; }
    public int Starts { get; set; }
    public int Finishes { get; set; }
}

Processing code:

class Program
{
    static void Main(string[] args)
    {
        // Inputs as per question
        var d1 = new DateTime(2011, 1, 1);
        var d2 = new DateTime(2011, 3, 1);
        var d3 = new DateTime(2011, 4, 1);
        var d4 = new DateTime(2011, 5, 1);
        var d5 = new DateTime(2011, 6, 1);
        var d6 = new DateTime(2011, 7, 1);
        var d11 = new DateTime(2011, 9, 1);
        var d12 = new DateTime(2011, 10, 1);
        var d7 = new DateTime(2011, 2, 1);
        var d8 = d5;
        var d9 = d1;
        var d10 = new DateTime(2011, 8, 1);

        var input = new[]
        {
            new DateRange { From = d1, To = d2 },
            new DateRange { From = d3, To = d4 },
            new DateRange { From = d5, To = d6 },
            new DateRange { From = d11, To = d12 },
            new DateRange { From = d7, To = d8 },
            new DateRange { From = d9, To = d10 },
        };

The first step is to capture the starts and finishes of the inputs as transition points. Each original range becomes two transition points, each with a count of 1.

        // Transform into transition points
        var inputWithBeforeAfter = input.SelectMany(
            dateRange => new[]
                {
                    new TransitionWithCounts { DateTime = dateRange.From, Starts = 1 },
                    new TransitionWithCounts { DateTime = dateRange.To, Finishes = 1 }
                });

Now we group these by date, summing up how many of the original ranges started and finished at this date

        // De-dupe by date, counting up how many starts and ends happen at each date
        var deduped = (from bdta in inputWithBeforeAfter
                      group bdta by bdta.DateTime
                      into g
                      orderby g.Key
                      select new TransitionWithCounts
                                 {
                                     DateTime = g.Key,
                                     Starts = g.Sum(bdta => bdta.Starts),
                                     Finishes = g.Sum(bdta => bdta.Finishes)
                                 }
                      );

To process this we could use Aggregate (probably), but it's (for me) far quicker to read and write a manual iteration:

        // Iterate manually since we want to keep a current count
        // and emit stuff
        var output = new List<DateTime>();
        var state = 0;
        TransitionWithCounts prev = null;

        foreach (var current in deduped)
        {
            // Coming to a new transition point
            // If we are ON, we need to emit a new midpoint
            if (state > 0)
            {
                // Emit new midpoint between prev and current
                output.Add(prev.DateTime.AddTicks((current.DateTime - prev.DateTime).Ticks / 2));
            }

            // Update state
            state -= current.Finishes;
            state += current.Starts;

            prev = current;
        }

We could assert that state == 0 at the end, if we felt like it.

        // And we're done
        foreach (var dateTime in output)
        {
            Console.WriteLine(dateTime);
        }

        // 16/01/2011 12:00:00
        // 15/02/2011 00:00:00
        // 16/03/2011 12:00:00
        // 16/04/2011 00:00:00
        // 16/05/2011 12:00:00
        // 16/06/2011 00:00:00
        // 16/07/2011 12:00:00
        // 16/09/2011 00:00:00

        // Note: nothing around 15/08 as that is between D10 and D11,
        // the only midpoint where we are OFF

        Console.ReadKey();
AakashM
  • 62,551
  • 17
  • 151
  • 186
  • This is actually similar to my final solution with less code that I've written based on the idea I got reading your first answer with left and right booleans. Let me put my solution in an answer. – Robert Koritnik Oct 04 '11 at 10:45
  • Actaully my final solution uses @DavidB idea of range generation and counter that works similar to your `Starts` and `Finishes` but in much less and simpler code. I'm not grouping dates (as you do), because there's no need for that, I just make sure I calculate mid points in non-empty and non-zero-width ranges. And that's it. I still reward you with +1 for the effort you've put into this answer as well as the working solution. – Robert Koritnik Oct 04 '11 at 11:13