1

I have a class like this:

public class TestResults
{
public String TestName {get;set;}
public Int32 StudentID {get;set;}
public Decimal Score {get;set;}
public Date TestTaken {get;set;}
}

So some objects mike look like this:

test.TestName = "Big Important Test";
test.StudentID = 17;
test.Score = 0.75M;
test.TestTaken = "1/1/2015";

tests.add(test);

test.TestName = "Big Important Test";
test.StudentID = 12;
test.Score = 0.89M;
test.TestTaken = "1/1/2015";

tests.add(test);

test.TestName = "Sneaky Pop Quiz in Chemistry";
test.StudentID = 17;
test.Score = 0.97M;
test.TestTaken = "2/1/2015";

tests.add(test);

test.TestName = "Sneaky Pop Quiz in Chemistry";
test.StudentID = 17;
test.Score = 0.97M;
test.TestTaken = "2/1/2015";

tests.add(test);

What I'm trying to determine is something like "For every student, show me students with large jumps in their scores?" I asked a similar question a while back in the dba.stackexchange.com world and have used the LEAD function, but now I'd like to move the logic into C#.

So a concrete question I'd want to code for would be (as an example):

Show me students who've jumped from the 60 and 70 percent range to the 90 range.

I know I can write a rat's nest of loops and branching logic, but was wondering if there are any more elegant and more comprehensive ways of identifying sequences of patterns in LINQ / C# land.

I've heard people talk about F#, but have no practical experience with that. Additionally, I think the "pattern matching" I'm talking about is a bit more involved than some of the simple string-pattern-matching I keep running across.

Community
  • 1
  • 1
Eric
  • 2,273
  • 2
  • 29
  • 44

2 Answers2

1

You can do something like this:

const decimal differenceLimit = 0.05M;

var studentIdsWithJump = tests.GroupBy (g => g.StudentID)
    .Where(g => g.OrderBy(c => c.Score)
                .GroupAdjacentBy((first, second) => 
                        first.Score + differenceLimit < second.Score
                ).Count() > 1
    )
    .Select(g => g.Key);

With the helper method taken from here:

public static class LinqExtensions
{
   public static IEnumerable<IEnumerable<T>> GroupAdjacentBy<T>(this IEnumerable<T> source, Func<T, T, bool> predicate)
   {
       using (var e = source.GetEnumerator())
       {
           if (e.MoveNext())
           {
               var list = new List<T> { e.Current };
               var pred = e.Current;
               while (e.MoveNext())
               {
                   if (predicate(pred, e.Current))
                   {
                       list.Add(e.Current);
                   }
                   else
                   {
                       yield return list;
                       list = new List<T> { e.Current };
                   }
                   pred = e.Current;
               }
               yield return list;
           }
       }
   }
}

This gives you the jumps for all ranges. If you want to narrow it down, you could add a further .Where() for scores > 60, and adjust the differenceLimit accordingly

Community
  • 1
  • 1
Rob
  • 26,989
  • 16
  • 82
  • 98
1

You could use LINQ to get the answer. Here is an example of a way you could do it:

var scores = tests.GroupBy(t => t.StudentID)
    .Select(g => new { StudentID = g.Key, Min = g.Min(i => i.Score), Max = g.Max(i => i.Score) })
    .Where(s => s.Max - s.Min > .20M);

foreach(var score in scores)
    Console.WriteLine("Student: {0} Jump: {1}", score.StudentID, score.Max - score.Min);

The LINQ statement first groups by StudentID. Next it projects the StudentID and Min and Max scores from each group to a new anonymous type. Finally, apply a where condition that only returns items with a "large jump in score". I define "large jump in score" as the difference between max score and min score is greater than .20.

Note: this code will work even when a student has more than 2 scores in the list.

UPDATE:

Since you have updated your post I understand your question better. Here is an updated answer:

var scores = tests.GroupBy(t => t.StudentID)
    .Select(g => new { StudentID = g.Key, Min = g.OrderBy(i => i.Score).First(), Max = g.OrderByDescending(i => i.Score).First() })
    .Where(s => (s.Min.Score >= .60M & s.Min.Score < .80M) & s.Max.Score >= .90M & s.Min.TestTaken < s.Max.TestTaken);

foreach(var score in scores)
    Console.WriteLine("Student: {0} Jump: {1}", score.StudentID, score.Max.Score - score.Min.Score);

This uses a similar approach, but instead of recording the min and max scores in the anonymous type, I record the TestResults instance having the min score and max score. In ther where clause we check that the TestResults having the min score is in the 60-80 range. We check that the TestResults having the max score is in the 90+ range. Finally, we check that the min score occurred on a date before the max one occurred.

Mike Hixson
  • 5,071
  • 1
  • 19
  • 24