0

I'm having a bit of trouble trying to process a list of objects which have simple From and To properties which are both DateTimes where I want the result to be a list of the same type of objects which show the ranges where there are overlaps, tbh, I think I've gone a bit code/logic blind now!

For example (please note, dates are in ddMMyyyy format):

TS1: 01/01/2020 to 10/01/2020 
TS2: 08/01/2020 to 20/01/2020 

So in this case I would expect to get 2 objects, both containing the same data:

TSA: 08/01/2020 to 10/01/2020
TSB: 08/01/2020 to 10/01/2020

A more complex example:

TS1: 01/01/2020 to 10/01/2020 
TS2: 08/01/2020 to 20/01/2020 
TS3: 18/01/2020 to 22/01/2020 

So in this case I would expect to get 4 objects, two sets of two containing the same data:

TSA: 08/01/2020 to 10/01/2020
TSB: 08/01/2020 to 10/01/2020
TSC: 18/01/2020 to 20/01/2020
TSD: 18/01/2020 to 20/01/2020

One more example:

TS1: 01/01/2020 to 01/10/2020 
TS2: 01/02/2020 to 01/09/2020 
TS3: 01/03/2020 to 01/04/2020 

So in this case I would expect to get 3 objects, all containing the same data:

TSA: 01/03/2020 to 01/04/2020
TSB: 01/03/2020 to 01/04/2020
TSC: 01/03/2020 to 01/04/2020

I've tried researching an algorithm online, but without any luck to get exactly what I want, or they are SQl based answers.

Any suggestions would be very welcome.

Edit: Just to explain what this is going to be used for so it might make it a bit clearer for some of the commenters below. Each of these date ranges denote a room which in use. This system is meant to report back date ranges when there are no rooms available at all. As I already know the quantity of rooms I can determine if there is any availability from these results and return the no availability date ranges.

I've also edited the expected results after trying some of the answers below

XN16
  • 5,679
  • 15
  • 48
  • 72
  • Related: https://stackoverflow.com/questions/325933/determine-whether-two-date-ranges-overlap – Colin Emonds Oct 01 '20 at 13:17
  • Note that the answers so far are all O(n^2), although this can be solved in O(n) – Colin Emonds Oct 01 '20 at 13:17
  • @cemper93 Could you please explain how? – Ehssan Oct 01 '20 at 13:32
  • @Ehssan: I was wrong. I now realized that you might get n input ranges that all overlap, in which case you get n^2 overlaps. Therefore, in the worst case, it's always O(n^2). However, I will post an answer that has a much better expected case soon. – Colin Emonds Oct 01 '20 at 13:35
  • @Ehssan: I posted my solution as an answer. EDIT: Note that my algorithm as posted is O(n log(n)) even in the best case, however, it could be optimized to O(n) by using a linear-time sorting algorithm such as radixsort. – Colin Emonds Oct 01 '20 at 14:55
  • @XN16 In general, though, while assembling my answer, I noticed that your question is also slightly underspecified. In the case where there is one long date range (say, all of the year 2020) and there's a smaller ranges contained in it (say, March 1st 05:00 to March 2nd 05:00), do you want the entire small range returned? What if there's another, even smaller range (March 1st 06:00 to 07:00) contained in that, should the even smaller range be returned twice, once for each collision? I just assumed so in my answer, but it's not clear from your question. – Colin Emonds Oct 01 '20 at 15:01
  • @cemper93 In response to your two questions, the first one should return the entire smaller range once, for the second one, it should return the even smaller range once. I've added an explanation for the implementation of this question that hopefully will make things a bit clearer. – XN16 Oct 02 '20 at 14:45
  • @XN16: your post says you want to know "the ranges where there are overlaps", but now you say you want to "report back date ranges when there are no rooms available at all". Doesn't the latter imply you want to know the ranges where _all_ inputs overlap, rather than the ranges where there is _any_ overlap? – Colin Emonds Oct 02 '20 at 15:06

3 Answers3

2

The following algorithm calculates the result in O(n log(n)) in the common case, although it is still O(n^2) in the worst case.

First, a record class.

public class DateRange
{
    public DateRange(DateTime from, DateTime to)
    {
        From = from;
        To = to;
    }

    public DateTime From { get; set; }
    public DateTime To { get; set; }
}

My algorithm is as follows. I added some comments to the algorithm, so I hope it is comprehensible. In principle, it exploits the fact that most ranges do (hopefully) not overlap with more than a few other ranges, by processing the input in sorted order, dropping older input entries from consideration once the current input has moved past their end time.

public static IEnumerable<DateRange> FindOverlaps(IList<DateRange> dateRanges)
{
    if (dateRanges.Count < 2)
    {
        return Enumerable.Empty<DateRange>();
    }

    // Sort the input ranges by start time, in ascending order, to process them in that order.
    var orderedRanges = dateRanges.OrderBy(x => x.From).ToList();
    // Keep a list of previously processed values.
    var previousRanges = new List<DateRange>
    {
        orderedRanges.First(),
    };

    var result = new List<DateRange>();
    foreach (var value in orderedRanges.Skip(1))
    {
        var toDelete = new List<DateRange>();
        // Go through all ranges that start before the current one, and pick those among
        // them that end after the current one starts as result values, and also, delete all
        // those that end before the current one starts from the list -- given that the input
        // is sorted, they will never overlap with future input values.
        foreach (var dateRange in previousRanges)
        {
            if (value.From >= dateRange.To)
            {
                toDelete.Add(dateRange);
            }
            else
            {
                result.Add(new DateRange(value.From, value.To < dateRange.To ? value.To : dateRange.To));
            }
        }
        foreach (var candidate in toDelete)
        {
            previousRanges.Remove(candidate);
        }
        previousRanges.Add(value);
    }

    return result;
}

Note that it is possible that all the n values in the input overlap. In this case, there are n*(n-1) overlaps, so the algorithm will necessarily run in O(n^2). However, in the well-formed case where each date range has a low number of overlaps with other date ranges, the complexity will be roughly O(n log(n)), with the expensive operation being the .OrderBy() calls on the input.

One more consideration. Consider you have a list of input values like so:

var example = new[]
{
    new DateRange(new DateTime(2000, 1, 1), new DateTime(2010, 1, 10)),
    new DateRange(new DateTime(2000, 2, 1), new DateTime(2000, 10, 10)),
    new DateRange(new DateTime(2000, 3, 11), new DateTime(2000, 9, 12)),
    new DateRange(new DateTime(2000, 4, 11), new DateTime(2000, 8, 12)),
};

In this case, not only do all the values overlap, they are also contained within one another. My algorithm as posted above will report such regions multiple times (for example, it will return the range from 2000-04-11 to 2000-08-12 three times, because it overlaps three other date ranges). In case you don't want overlapping regions to be reported multiple times like that, you can feed the output of the above function to the following function to filter them down:

public static IEnumerable<DateRange> MergeRanges(IList<DateRange> dateRanges)
{
    var currentOverlap = dateRanges.First();
    var r = new List<DateRange>();
    foreach (var dateRange in dateRanges.Skip(1))
    {
        if (dateRange.From > currentOverlap.To)
        {
            r.Add(currentOverlap);
            currentOverlap = dateRange;
        }
        else
        {
            currentOverlap.To = currentOverlap.To > dateRange.To ? currentOverlap.To : dateRange.To;
        }
    }
    r.Add(currentOverlap);
    return r;
}

This does not affect overall algorithmic complexity, as it's obviously O(n)-ish.

Colin Emonds
  • 765
  • 4
  • 18
  • `Note that it is possible that all the n values in the input overlap` - if that was the case, you'd also have to look for overlaps in your result. – Ehssan Oct 01 '20 at 15:14
  • 1
    Why do you have an `Id` in the record class? – Ehssan Oct 01 '20 at 15:14
  • @Ehssan: thank you, that was from an earlier version of the code. I edited it out. What do you mean by looking for overlaps in the result? I just tested the code with such a pathological case (a Matroshka-like nesting of date ranges), and it gives what I consider to be the correct result (it returns the ranges themselves, repeated as often as there are Matroshka layerings around them). – Colin Emonds Oct 01 '20 at 15:21
  • 1
    I think in that case the method should return one overlap (the one where all ranges overlap, the smallest interval) – Ehssan Oct 01 '20 at 15:24
  • 1
    @Ehssan: I see. I have now provided a solution for that interpretation as well. – Colin Emonds Oct 01 '20 at 15:34
  • This merges the ranges, but I think it should be the intersection of all ranges ;) Take these ranges for example: 2020-01-01 > 2020-01-10 2020-01-08 > 2020-01-20 2020-01-06 > 2020-01-14 It should give 2020-01-08 > 2020-01-10 but actually gives 2020-01-06 > 2020-01-14 – Ehssan Oct 01 '20 at 15:45
  • Well, why would it only give 2020-01-08 > 2020-01-10? That makes no sense to me on a domain level. There is a lot of days outside that range where there is an overlap, and you probably want to know that. Say, you're trying to match up two schedules, for example: you would always want to know all the date ranges where there is a conflict between the schedules, not just some random intersection of problems. – Colin Emonds Oct 01 '20 at 16:23
  • Or maybe you want to know the date ranges, when all members of the team were out of office, or when to schedule a meeting so that all people are free, ... – Ehssan Oct 01 '20 at 16:24
  • How does your solution give any of that? – Colin Emonds Oct 01 '20 at 16:25
  • It does not, I did not extend my implementation to handle that – Ehssan Oct 01 '20 at 16:26
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/222408/discussion-between-cemper93-and-ehssan). – Colin Emonds Oct 02 '20 at 15:02
0

Let's assume you defined a type to store the date ranges like this:

public class DataObject
{
    public DateTime From { get; set; }
    public DateTime To { get; set; }
}

Then you can compare the items in your list to each other to determine if they overlap, and if so return the overlapping period of time (just to point you in the right direction, I did not thoroughly test this algorithm)

public DataObject[] GetOverlaps(DataObject[] objects)
{
    var result = new List<DataObject>();

    if (objects.Length > 1)
    {
        for (var i = 0; i < objects.Length - 1; i++)
        {
            var pivot = objects[i];

            for (var j = i + 1; j < objects.Length; j++)
            {
                var other = objects[j];

                // Do both ranges overlap?
                if (pivot.From > other.To || pivot.To < other.From)
                {
                    // No
                    continue;
                }

                result.Add(new DataObject
                {
                    From = pivot.From >= other.From ? pivot.From : other.From,
                    To = pivot.To <= other.To ? pivot.To : other.To,
                });
            }
        }
    }

    return result.ToArray();
}
Ehssan
  • 739
  • 6
  • 15
0

Some of the requirements around nested ranges and other corner cases are not exactly clear. But here's another algorithm that seems to do what you're after. The usual caveats of limited testing apply of course - could not work at all for corner cases I didn't test.

The algorithm relies on sorting the data first. If you can't do that then this won't work. The algorithm itself is as follows:

    static IEnumerable<DateRange> ReduceToOverlapping(IEnumerable<DateRange> source)
    {
        if (!source.Any())
            yield break;

        Stack<DateRange> stack = new Stack<DateRange>();

        foreach (var r in source)
        {
            while (stack.Count > 0 && r.Start > stack.Peek().End)
                stack.Pop();

            foreach (var left in stack)
            {
                if (left.GetOverlap(r) is DateRange overlap)
                    yield return overlap;
            }

            stack.Push(r);
        }
    }

The DateRange is a simple class to hold the dates you presented. It looks like this:

class DateRange 
{
    public DateRange(DateRange other)
    { this.Start = other.Start; this.End = other.End; }

    public DateRange(DateTime start, DateTime end)
    { this.Start = start; this.End = end; }

    public DateRange(string start, string end)
    {
        const string format = "dd/MM/yyyy";

        this.Start = DateTime.ParseExact(start, format, CultureInfo.InvariantCulture);
        this.End = DateTime.ParseExact(end, format, CultureInfo.InvariantCulture);
    }

    public DateTime Start { get; set; }
    public DateTime End { get; set; }

    public DateRange GetOverlap(DateRange next)
    {
        if (this.Start <= next.Start && this.End >= next.Start)
        {
            return new DateRange(next.Start, this.End < next.End ? this.End : next.End);
        }

        return null;
    }
}

As I mentioned this is used by sorting it first. Example of sorting and calling the method on some test data is here:

    static void Main(string[] _)
    {
        foreach (var (inputdata, expected) in TestData)
        {
            var sorted = inputdata.OrderBy(x => x.Start).ThenBy(x => x.End);
            var reduced = ReduceToOverlapping(sorted).ToArray();

            if (!Enumerable.SequenceEqual(reduced, expected, new CompareDateRange()))
                throw new ArgumentException("failed to produce correct result");
        }

        Console.WriteLine("all results correct");
    }

To Test you'll need a equality comparer and the test data which is here:

class CompareDateRange : IEqualityComparer<DateRange>
{
    public bool Equals([AllowNull] DateRange x, [AllowNull] DateRange y)
    {
        if (null == x && null == y)
            return true;

        if (null == x || null == y)
            return false;

        return x.Start == y.Start && x.End == y.End;
    }

    public int GetHashCode([DisallowNull] DateRange obj)
    {
        return obj.Start.GetHashCode() ^ obj.End.GetHashCode();
    }
}

    public static (DateRange[], DateRange[])[] TestData = new (DateRange[], DateRange[])[]
    {
        (new DateRange[]
            {
                new DateRange("01/01/2020", "18/01/2020"),
                new DateRange("08/01/2020", "17/01/2020"),
                new DateRange("09/01/2020", "15/01/2020"),
                new DateRange("14/01/2020", "20/01/2020"),
            },
         new DateRange[]
            {
                new DateRange("08/01/2020", "17/01/2020"),
                new DateRange("09/01/2020", "15/01/2020"),
                new DateRange("09/01/2020", "15/01/2020"),
                new DateRange("14/01/2020", "15/01/2020"),
                new DateRange("14/01/2020", "17/01/2020"),
                new DateRange("14/01/2020", "18/01/2020"),
            }),
        (new DateRange[]
            {
                new DateRange("01/01/2020", "10/01/2020"),
                new DateRange("08/01/2020", "20/01/2020"),
            },
         new DateRange[]
            {
                new DateRange("08/01/2020", "10/01/2020"),
            }),
        (new DateRange[]
            {
                new DateRange("01/01/2020", "10/01/2020"),
                new DateRange("08/01/2020", "20/01/2020"),
                new DateRange("18/01/2020", "22/01/2020"),
            },
         new DateRange[]
            {
                new DateRange("08/01/2020", "10/01/2020"),
                new DateRange("18/01/2020", "20/01/2020"),
            }),
        (new DateRange[]
            {
                new DateRange("01/01/2020", "18/01/2020"),
                new DateRange("08/01/2020", "10/01/2020"),
                new DateRange("18/01/2020", "22/01/2020"),
            },
         new DateRange[]
            {
                new DateRange("08/01/2020", "10/01/2020"),
                new DateRange("18/01/2020", "18/01/2020"),
            }),
    };
MikeJ
  • 1,299
  • 7
  • 10