2

I've got an IEnumerable<T> containing a list of data elements with consistent intervals in one of the properties:

List<Interval> list = new List<Interval>
            { 
                new Interval{ TIME_KEY = 600},
                new Interval{ TIME_KEY = 605},
                new Interval{ TIME_KEY = 615},
                new Interval{ TIME_KEY = 620},
                new Interval{ TIME_KEY = 630}
            };

How can I query this list (using Linq, preferably), to get a List that looks like this:

 List<Interval> list = new List<Interval>
                { 
                    new Interval{ TIME_KEY = 610},
                    new Interval{ TIME_KEY = 625}
                };

?

EDIT: I will probably know what the interval distance is supposed to be, but if there's a way to determine it by examing the data, that would be a huge bonus!

EDIT: changed to numeric values

Chris McCall
  • 10,317
  • 8
  • 49
  • 80
  • 3
    Do you know what the interval is going to be or is the program supposed to figure that out? – NullUserException Sep 14 '10 at 18:17
  • 1
    Is there a numeric representation of those numbers in the Interval class? – Steve Danner Sep 14 '10 at 18:18
  • 1
    I might be in the minority on this, but this seems to be a good example of how not to use linq. A simple concise `foreach` loop would be easily understood at a glance, but the linq solutions just don't seem to pass the "glance test" to me. Sure, you *can* do it on one line, but *should* you? – Brian Gideon Sep 14 '10 at 20:07

6 Answers6

3

An efficient and simple way would be just to go through that list with foreach and detect the gaps.
I assume that 5 minute tact is fixed?

To use LINQ you could create the full list and find the difference, but that seems overkill.


Considering the 2nd part, determining the interval:

From your example a sample of 3 or 4 values would probably do. But you can not be absolutely sure even after examining all the values. Your example data does not exclude a 1 minute frequency with a lot of missing values.

So you need very good specifications regarding this part.

H H
  • 263,252
  • 30
  • 330
  • 514
  • I was about to suggest the second approach... will edit my answer with code. – Jon Skeet Sep 14 '10 at 18:23
  • +1 for not suggesting going directly to linq for something that is tricky and hard to grasp in linq, but simple and easy to understand with a normal iteration. – Jimmy Hoffa Sep 14 '10 at 18:56
3

Have a look at this question for an extension method which selects consecutive values. From there, you could do something like:

// I'd probably rename SelectBetween to SelectConsecutive
list.SelectConsecutive((x, y) => new { Original = x, Interval = y - x})
    .Where(pair => pair.Interval != 5)
    .Select(pair => new Interval(pair.Original + 5));

(Somewhat pseudocode, but I hope you see where I'm going.)

However, that would only generate one element when it's missing... if you went from 0 to 20, it wouldn't generate 5, 10, 15.

To put some meat on Henk's second suggestion:

var missing = Enumerable.Range(0, expectedElementCount)
                        .Select(x => new Interval(baseInterval + 5 * x)
                        .Except(list);
Community
  • 1
  • 1
Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • Isn't SelectConsecutive much like scan from Haskell? If so can't it implemented with Zip and Skip: `xs.Zip(xs.Skip(1), f)`? – R. Martinho Fernandes Sep 14 '10 at 18:29
  • In your second code snippet, I'm having a hard time figuring out what "base" is supposed to represent (it's also a reserved keyword). Can you please clarify a litte? – Chris McCall Sep 14 '10 at 18:39
  • I guess `base` is the first element in the list, aka `var @base = list.First()`. – R. Martinho Fernandes Sep 14 '10 at 18:43
  • @Martinho: Yes, the difference being that SelectConsecutive will only scan the list once. I don't like going through it twice which Zip/Skip does. – Jon Skeet Sep 14 '10 at 18:56
  • @Chris: Yes, base was the "base" interval from which everything else is built. Renamed it to avoid the keyword clash :) – Jon Skeet Sep 14 '10 at 18:56
  • This has to be the first time I don't like your answer, maybe it's my feeble mind but I think it can be done more readably/simply. It's like watching your favorite superhero get beaten by a drunken mugger, a sad day.. – Jimmy Hoffa Sep 14 '10 at 19:04
  • @Jimmy: I think the second approach is reasonably simple, isn't it? I must admit I don't quite understand your answer :( – Jon Skeet Sep 14 '10 at 19:23
  • @Jon Skeet: Aye, my answer may not be the easiest to understand either, it's just what came to mind immediately after realizing Henk already mentioned doing a non-linq foreach which I would think the simplest/most readable way of identifying the missing interval elements. As for the second part of your answer, I really don't understand either parts of your answer – Jimmy Hoffa Sep 14 '10 at 19:43
  • @Jimmy: I believe the second part of my answer is broadly like yours - generate the complete list, then remove existing ones. The first part is meant to spot missing elements just by comparing consecutive elements. – Jon Skeet Sep 15 '10 at 14:04
3
var newList = 
     Enumerable.Range(0, 6)
               .Select(r=> new Interval() {TIME_KEY = ((r*5)+600) })
               .Except(list )
James Curran
  • 101,701
  • 37
  • 181
  • 258
2

This would work if the interval is known, if you have access to the Zip method (comes with .NET 4):

list.Zip(list.Skip(1), (x,y) => new { x, delta = y - x })
    .SelectMany(a => Enumerable.Range(1, a.delta/interval - 1)
                               .Select(i => a.x + i*interval));

Note that this iterates the list twice so in case the source is a lazy enumerable, you need to buffer it first. That construction with Zip and Skip is a quick and dirty way of projecting consecutive elements together. Reactive Extensions' System.Interactive library has a Scan method for that and Jon showed a possible implementation in another answer. Neither of those iterates the list twice, so they would be a much better choice.

If the interval is to be determined you can get the minimum delta:

var deltas = list.Zip(list.Skip(1), (x,y) => y - x );
var interval = deltas.Min();
list.Zip(deltas, (x, delta) => new { x, delta })
    .SelectMany(a => Enumerable.Range(1, a.delta/interval - 1)
                               .Select(i => a.x + i*interval));

There are some assumptions I made though:

  • all differences between the elements are multiples of the interval;
  • the input is sorted.

How it works:

  1. First it builds a sequence of pairs with each element but the last and the interval to the element that follows it;
  2. Then for each of those pairs, it produces all the missing values within the delta: within each delta there are exactly a.delta/interval - 1 values, and each of these is a certain number of intervals away from the element store in the pair, hence a.x + i*interval.
  3. SelectMany takes care of flattening all those sequences of missing values together into one.
Community
  • 1
  • 1
R. Martinho Fernandes
  • 228,013
  • 71
  • 433
  • 510
  • Your solution is good, but it will only work if the list is sorted (as in the example). – Gabe Sep 14 '10 at 18:45
0

Try this:

private static IEnumerable<Interval> CalculateMissingIntervals(IEnumerable<Interval> list, int step) {
  return list.Zip(list.Skip(1), 
    (i1, i2) => IntervalRange(i1.TIME_KEY + step, i2.TIME_KEY, step)).
    SelectMany(x => x);
}
private static IEnumerable<Interval> IntervalRange(int start, int end, int step) {
  for (var i = start; i < end; i += step) {
    yield return new Interval { TIME_KEY = i };
  }
}

The initial list is assumed to be sorted.

Jordão
  • 55,340
  • 13
  • 112
  • 144
0
IEnumerable<Interval> missingIntervals =
    Enumerable.Range(list.Min(listMember => listMember.TIME_KEY), list.Max(listMember => listMember.TIME_KEY))
              .Where(enumMember => enumMember % intervalDistance == 0)
              .Select(enumMember => new Interval { TIME_KEY = enumMember}
              .Except(list);
Jimmy Hoffa
  • 5,909
  • 30
  • 53