2

In my .NET 4.5.2 C# console app, I have a List<SomeItem> called someItems with SomeItem being:

public class SomeItem
{
    public int A { get; set; }
    public int B { get; set; }
    public DateTime C { get; set; }
}

I then group these items into a new list based on A and B:

var groupedItems = someItems.GroupBy(x => new { x.A, x.B });

I now want to include C in the GroupBy, but with a catch: SomeItems whose C values are within 2 seconds of each other (a 2-second tolerance in either direction) should end up in the same group. So, for example, two SomeItems, one with C set to 2016-03-28 17:58:01.000 and the other with 2016-03-28 17:58:03.000, would be grouped together. Is that possible?

Edit: For all intents and purposes let's assume there won't be items in the list which could cause 'overlapping' groups (like the question by @J. Steen in the comments)

Edit 2: An example

Assuming the following data:

  • 15/06/2017 00:00:02
  • 15/06/2017 00:00:04
  • 15/06/2017 00:00:06
  • 15/06/2017 00:00:09
  • 15/06/2017 00:00:11
  • 15/06/2017 00:00:15

...and an offset of 2 seconds, I would expect them to be grouped in the following manner:

Group 1:

  • 15/06/2017 00:00:02
  • 15/06/2017 00:00:04
  • 15/06/2017 00:00:06

Group 2:

  • 15/06/2017 00:00:09
  • 15/06/2017 00:00:11

Group 3:

  • 15/06/2017 00:00:15
globetrotter
  • 3
    What about `2016-03-28 17:58:05`? Doesn't that deserve to be grouped with the one that is 3 seconds past the minute? It's 2 seconds offset from that one, after all. – J. Steen Jun 15 '17 at 13:42
  • That was my initial thought as well, but for all intents and purposes let's assume there won't be items in the list which could cause 'overlapping' groups. – globetrotter Jun 15 '17 at 13:46
  • Let's say my inputs included the following entries, all with the same hour and minute but with different seconds. Their second values are as follows 1,3,8,10,15,17. Based on the question I'd assume the 1 and 3 should be grouped together, the 8 and 10 should be grouped and the 15 and 17 should be grouped. Is that the intent? – mjwills Jun 15 '17 at 14:30
  • @mjwills see my update. – globetrotter Jun 15 '17 at 16:33
  • You may be able to create an implementation of `IEqualityComparer` – Scrobi Jun 15 '17 at 17:37

4 Answers

2

Your example input and output are a little confusing.

Your first 3 items (2, 4, and 6) overlap: 4 can belong with both 2 and 6. This means that to decide on the grouping you need to decide which item the groups will be based on. If you start the groupings with 2, then the results would be:

  • Group 1: 2, 4
  • Group 2: 6
  • Group 3: 9, 11
  • Group 4: 15

It seems that you have applied your human brain to see that 6 could actually join the first group if the starting point was 4 seconds. The groups then become:

  • Group 1: 2, 4, 6
  • Group 2: 9, 11
  • Group 3: 15

You can create an IEqualityComparer<DateTime> to do this; however, your grouping is then dependent on the order of the collection:

public class GroupingComparer : IEqualityComparer<DateTime>
{
    private readonly int _offset;

    public GroupingComparer(int offset)
    {
        _offset = offset;
    }

    public bool Equals(DateTime x, DateTime y)
    {
        // Compares only the Second component (0-59); minutes, hours and the date are ignored.
        return y.Second >= x.Second - _offset && y.Second <= x.Second + _offset;
    }

    public int GetHashCode(DateTime obj)
    {
        //Should most probably look at a better way to get the hashcode.
        return obj.ToShortDateString().GetHashCode();
    }
}

Use it like this:

GroupingComparer comparer = new GroupingComparer(offset:2);
var result2 = dates.GroupBy(x => x, comparer).ToList(); 

So now it all depends on what you want to do. You can get either of the above outputs by changing the order of the collection. However this could mean strange behaviour in an application if you use different orders at different parts of the application. Maybe an extension method OrderAndGroup could resolve this.
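As a rough sketch of what such an OrderAndGroup extension could look like: the method name comes from the suggestion above, but the body is only an assumption about how it might behave. It sorts the dates first and then starts a new group whenever a date is more than the offset away from the first item of the current group, which reproduces the first of the two outputs above regardless of the input order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DateTimeGroupingExtensions
{
    // Hypothetical helper (not part of LINQ): sorts the dates, then starts a new
    // group whenever a date is more than `offset` away from the group's first item.
    public static List<List<DateTime>> OrderAndGroup(
        this IEnumerable<DateTime> source, TimeSpan offset)
    {
        var groups = new List<List<DateTime>>();
        foreach (var date in source.OrderBy(d => d))
        {
            var current = groups.LastOrDefault();
            if (current == null || date - current[0] > offset)
            {
                // Too far from the group's anchor (or no group yet): open a new group.
                current = new List<DateTime>();
                groups.Add(current);
            }
            current.Add(date);
        }
        return groups;
    }
}
```

Calling dates.OrderAndGroup(TimeSpan.FromSeconds(2)) on the seconds 2, 4, 6, 9, 11, 15 (in any order) should give 2, 4 / 6 / 9, 11 / 15. Because the input is always sorted first, different parts of an application get the same groups.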

Scrobi
  • Hi @Scrobi, "4 can belong with both 2 and 6": yes, this is why I added an example in the OP - I never said we were only talking about pairs of two, but I should have clarified better. – globetrotter Jun 16 '17 at 08:16
  • You state in the question "there won't be items in the list which could cause 'overlapping' groups". Either way you should be able to use the above to create specific `IEqualityComparer` for your `SomeItem`. But as I state in the answer above to get the desired output you would have to reorder the list so that `15/06/2017 00:00:04` was first item in the list, or maybe create some kind of seed time. – Scrobi Jun 16 '17 at 08:37
  • Another point - the above will not group seconds across different minutes. So `15/06/2017 23:59:59` would not be grouped with `16/06/2017 00:00:00` as I am working with the seconds and not subtracting seconds from the `DateTime` – Scrobi Jun 16 '17 at 08:40
  • Thanks for the help @Scrobi, I'll check the solution in VS and report back. – globetrotter Jun 16 '17 at 08:58
  • 1
    The main issue with using IEqualityComparer is that https://msdn.microsoft.com/en-us/library/ms132154(v=vs.110).aspx states that implementations must be `reflexive, symmetric, and transitive. That is, it returns true if used to compare an object with itself; true for two objects x and y if it is true for y and x; and true for two objects x and z if it is true for x and y and also true for y and z.` This is very hard to do, since you need 2 and 6 to be equal, but only if 4 is there (and a IEqualityComparer implementation doesn't know if 4 is there since it looks at only two values at once). – mjwills Jun 16 '17 at 10:57
  • @mjwills good point, what about if the `GroupingComparer` took a seed number in the constructor to build the groups? e.g. seed 4 would group 2 - 6 and a seed 2 would group 0 - 4. In the `Equals` method you can check the two `datetimes` are part of the same group – Scrobi Jun 16 '17 at 11:12
  • That could **work** - although it would be somewhat of a confusing twist on the `IEqualityComparer` concept . :) – mjwills Jun 16 '17 at 11:23
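To make the transitivity point from the comments concrete, here is a small sketch. ToleranceComparer is an illustrative class written for this example (it is not the GroupingComparer from the answer, and it compares whole DateTime values rather than just the seconds). With a 2-second tolerance, 2 equals 4 and 4 equals 6, yet 2 does not equal 6.

```csharp
using System;
using System.Collections.Generic;

// Illustrative comparer (an assumption, not the answer's exact code): two values
// are "equal" when they are at most `offset` apart.
public class ToleranceComparer : IEqualityComparer<DateTime>
{
    private readonly TimeSpan _offset;

    public ToleranceComparer(TimeSpan offset) { _offset = offset; }

    public bool Equals(DateTime x, DateTime y)
    {
        return (x - y).Duration() <= _offset;
    }

    // Constant hash so every pair is decided by Equals (slow, but keeps the demo simple).
    public int GetHashCode(DateTime obj) { return 0; }
}

public static class TransitivityDemo
{
    public static bool[] Check()
    {
        var t = new DateTime(2017, 6, 15, 0, 0, 0);
        var comparer = new ToleranceComparer(TimeSpan.FromSeconds(2));
        return new[]
        {
            comparer.Equals(t.AddSeconds(2), t.AddSeconds(4)), // true
            comparer.Equals(t.AddSeconds(4), t.AddSeconds(6)), // true
            comparer.Equals(t.AddSeconds(2), t.AddSeconds(6))  // false: transitivity is broken
        };
    }
}
```

Since GroupBy only ever sees two values at a time through the comparer, the groups it produces with such a comparer depend on which element happens to become each group's key.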
1

You can create an extension method to round the seconds. I found this example: Have datetime.now return to the nearest second

The answer from that question is:

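public static class DateTimeExtensions {
    // Rounds the date down to a multiple of the given number of ticks.
    public static DateTime Trim(this DateTime date, long ticks) {
        return new DateTime(date.Ticks - (date.Ticks % ticks), date.Kind);
    }
}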

You can then do something like this in the group by:

var result = dates.GroupBy(x => x.Trim(TimeSpan.FromSeconds(3).Ticks));

So given the inputs:

  • 15/06/2017 00:00:01
  • 15/06/2017 00:00:02
  • 15/06/2017 00:00:03
  • 15/06/2017 00:00:04
  • 15/06/2017 00:00:05
  • 15/06/2017 00:00:06

You would have 3 groups:

Group A

  1. 15/06/2017 00:00:01
  2. 15/06/2017 00:00:02

Group B

  1. 15/06/2017 00:00:03
  2. 15/06/2017 00:00:04
  3. 15/06/2017 00:00:05

Group C:

  1. 15/06/2017 00:00:06
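For comparison, here is a sketch that runs the same fixed 3-second buckets over the sample data from the question (2, 4, 6, 9, 11, 15). TrimDemo and GroupQuestionData are names made up for this example. The buckets are fixed to the clock rather than to the data, so the result is 2 / 4 / 6 / 9, 11 / 15 rather than the three groups the question expects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TrimDemo
{
    // Same idea as the Trim extension above: round down to a multiple of `ticks`.
    public static DateTime Trim(this DateTime date, long ticks)
    {
        return new DateTime(date.Ticks - (date.Ticks % ticks), date.Kind);
    }

    // Returns the seconds of each group's members for the question's sample data.
    public static List<int[]> GroupQuestionData()
    {
        var midnight = new DateTime(2017, 6, 15, 0, 0, 0);
        var dates = new[] { 2, 4, 6, 9, 11, 15 }.Select(s => midnight.AddSeconds(s));
        return dates
            .GroupBy(d => d.Trim(TimeSpan.FromSeconds(3).Ticks))
            .Select(g => g.Select(d => d.Second).ToArray())
            .ToList();
    }
}
```

The bucket boundaries here fall at seconds 0, 3, 6, 9, 12, 15, so 2, 4 and 6 land in three different buckets even though each is only 2 seconds from its neighbour.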
Scrobi
  • This trim would not care about the content of the collection. It rounds the seconds down to the nearest 3 seconds. So 0,1 & 2 would be one group 3,4,5 the second so on and so forth. So if your collection changes the groups would not. – Scrobi Jun 15 '17 at 14:17
  • @Scrobi see my update on the OP where I provide an example of what I expect as output – globetrotter Jun 15 '17 at 16:34
1

The below code should do the trick.

Basically it does the grouping in multiple passes - first on A and B and then it uses TakeWhile and Skip to group the dates in the way that you'd like them to be grouped.

using System;
using System.Collections.Generic;
using System.Linq;

namespace Test
{
    public class SomeItem
    {
        public int A { get; set; }
        public int B { get; set; }
        public DateTime C { get; set; }

        public override string ToString()
        {
            return $"{A} - {B} - {C}";
        }
    }


    public class Program
    {
        static void Main(string[] args)
        {
            // Assuming 2 second margin
            var margin = TimeSpan.FromSeconds(2);

            var input = new List<SomeItem>
            {
                new SomeItem() {A = 1, B = 2, C = new DateTime(2017, 6, 15, 0, 0, 0)},
                new SomeItem() {A = 1, B = 2, C = new DateTime(2017, 6, 15, 0, 0, 2)},
                new SomeItem() {A = 1, B = 2, C = new DateTime(2017, 6, 15, 0, 0, 4)},
                new SomeItem() {A = 1, B = 2, C = new DateTime(2017, 6, 15, 0, 0, 6)},
                new SomeItem() {A = 1, B = 2, C = new DateTime(2017, 6, 15, 0, 0, 9)},
                new SomeItem() {A = 1, B = 2, C = new DateTime(2017, 6, 15, 0, 0, 11)},
                new SomeItem() {A = 1, B = 2, C = new DateTime(2017, 6, 15, 0, 0, 15)},
                new SomeItem() {A = 1, B = 3, C = new DateTime(2017, 6, 15, 0, 0, 2)},
                new SomeItem() {A = 1, B = 3, C = new DateTime(2017, 6, 15, 0, 0, 4)},
                new SomeItem() {A = 1, B = 3, C = new DateTime(2017, 6, 15, 0, 0, 6)},
                new SomeItem() {A = 1, B = 3, C = new DateTime(2017, 6, 15, 0, 0, 9)},
                new SomeItem() {A = 1, B = 3, C = new DateTime(2017, 6, 15, 0, 0, 11)},
                new SomeItem() {A = 1, B = 3, C = new DateTime(2017, 6, 15, 0, 0, 13)}
            };

            var firstGrouping = input.GroupBy(x => new { x.A, x.B });
            var readyForGrouping = new List<Tuple<SomeItem, int>>();

            foreach (var grouping in firstGrouping)
            {
                var data = grouping.OrderBy(z => z.C).ToList();
                var lastDate = default(DateTime?);
                var count = 0;
                var groupingCount = data.Count;
                var groupID = 0;

                while (groupingCount > count)
                {
                    groupID++;
                    readyForGrouping.AddRange(data.Skip(count).TakeWhile(z =>
                    {
                        var old = lastDate;
                        lastDate = z.C;
                        if (old == null)
                        {
                            count++;
                            return true;
                        }
                        if (z.C <= old.Value.Add(margin))
                        {
                            count++;
                            return true;
                        }
                        return false;
                    }).Select(z => new Tuple<SomeItem, int>(z, groupID)).ToList());
                }
            }

            var groupedItems = readyForGrouping.GroupBy(x => new { x.Item1.A, x.Item1.B, x.Item2 },
                x => x.Item1);

            foreach (var grouping in groupedItems)
            {
                Console.WriteLine("Start Of Group");
                foreach (var entry in grouping)
                {
                    Console.WriteLine(entry);

                }
            }
            Console.ReadLine();
        }
    }
}
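The same "new group when the gap to the previous item exceeds the margin" idea can also be written as a single loop. This is only a sketch under that assumption; GroupByGap and ChainGrouping are hypothetical names, not an existing LINQ API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChainGrouping
{
    // Hypothetical helper: sorts by the selected time, then starts a new group
    // whenever the gap to the previous item exceeds `margin`.
    public static List<List<T>> GroupByGap<T>(
        this IEnumerable<T> source, Func<T, DateTime> timeSelector, TimeSpan margin)
    {
        var groups = new List<List<T>>();
        var previous = default(DateTime);
        foreach (var item in source.OrderBy(timeSelector))
        {
            var time = timeSelector(item);
            if (groups.Count == 0 || time - previous > margin)
            {
                groups.Add(new List<T>());
            }
            groups[groups.Count - 1].Add(item);
            previous = time; // the next item is compared with this one, not with the group's first item
        }
        return groups;
    }
}
```

Combined with the first grouping it could be used as input.GroupBy(x => new { x.A, x.B }).SelectMany(g => g.GroupByGap(x => x.C, margin)). For the seconds 0, 2, 4, 6, 9, 11, 15 and a 2-second margin this gives 0, 2, 4, 6 / 9, 11 / 15.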
mjwills
  • Hi @mjwills, thanks for this. I added `new SomeItem() {A = 1, B = 2, C = new DateTime(2017, 6, 15, 0, 0, 0)},` as a test and was expecting it to fall in the first group (so 4 results in the first group) – globetrotter Jun 16 '17 at 08:42
  • hey @mjwills, no problem, I haven't explained the problem clearly. In your example input, Isn't it the same for "why were 2 and 6 grouped together, if they are 4 seconds apart?" though? Perhaps I didn't understand the answer correctly. Essentially what I'm trying to achieve is, for an offset of e.g. `2` seconds, group all the DateTimes which are *at most* 2 seconds away from each other. (0-2-4-6 qualify, 0-3-6 don't) – globetrotter Jun 16 '17 at 08:51
0

This is not possible, as GroupBy groups identical items together. You could add a new property that "flattens" times.

So say something like public string FlatDate { get { return C.ToString("yyyy-MM-dd HH:mm"); } }

You could alter the return value depending on exactly how you want to group, so if the second is < 3 set it to 0, if it is < 6 set it to 3, etc.

Not the most elegant, but will do what you want.
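A minimal sketch of that idea, assuming 3-second slots and the SomeItem class from the question: the flattened time goes straight into the GroupBy key next to A and B, so no extra property is needed. FlattenDemo, Flatten and Slot are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SomeItem
{
    public int A { get; set; }
    public int B { get; set; }
    public DateTime C { get; set; }
}

public static class FlattenDemo
{
    // Rounds the seconds down to a multiple of 3 (0, 3, 6, ...) and drops any fraction.
    public static DateTime Flatten(DateTime c)
    {
        return new DateTime(c.Year, c.Month, c.Day, c.Hour, c.Minute,
            c.Second - c.Second % 3, c.Kind);
    }

    // Groups on A, B and the flattened time in a single GroupBy call.
    public static List<List<SomeItem>> Group(IEnumerable<SomeItem> items)
    {
        return items
            .GroupBy(x => new { x.A, x.B, Slot = Flatten(x.C) })
            .Select(g => g.ToList())
            .ToList();
    }
}
```

As with any fixed slot, two items that are only 1 second apart (say seconds 2 and 3) can still fall into different groups.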

KeithN
  • I'm aware that some normalization would be needed to group these together - what `C` becomes by normalizing is not important as I'm only interested in the grouping, not the value of `C` for each grouping (after the grouping takes place). I'm just contemplating whether it's possible for this to happen in the `GroupBy` itself, or if a better solution exists. – globetrotter Jun 15 '17 at 13:55
  • 1
    It can happen in the LINQ rather than adding a property. Just use `Select` to do a projection of an anonymous type that includes C.ToString("yyyy-MM-dd HH:mm") - then `GroupBy` on that. – mjwills Jun 15 '17 at 14:08