5

Currently, this is just something I am curious about, I don't have any code I am working on but I am wondering how this could be achieved...

Lets say for example that I have an application that tracks the results of all the football teams in the world. What I want to be able to do is to identify the longest "win" streak for any given team.

I imagine I would most likely have some sort of data table like so:

  • MatchDate datetime
  • TeamA string
  • TeamB string
  • TeamAGoals int
  • TeamBGoals int

So what I would want to do for example is find the longest win streak where TeamA = "My Team" and obviously this would mean TeamAGoals must be greater than TeamBGoals.

As I have said, this is all just for example. It may be better for a different DB design for something like this. But the root question is how to calculate the longest streak/run of matching results.

musefan
  • 47,875
  • 21
  • 135
  • 185
  • 2
    If you want the "most efficient" you probably don't want to use LINQ in the first place. Personally I value ease of understanding and maintainability over efficiency, unless performance becomes an issue. There's no point in breaking your back improving the performance if you don't need it. – Doctor Jones Oct 16 '12 at 10:45
  • @DoctaJonez: A fair point, I just wanted to try to encourage answers to consider performance. Imagine if I wanted something like a grid of teams ordered by win streak - performance might then become an issue. However, I do want to stick with LINQ on this one, so any efficiency lost by that will have to be accepted on my part ;) – musefan Oct 16 '12 at 10:48
  • Don't forget to add Where TeamB = "My Team" and TeamBGoals > TeamAGoals, or are you after Longest home games streak? or would you're database contain 2 entries for each match? – James Oct 16 '12 at 11:02
  • @JamesB: Yeah good point, I guess getting both all wins, and home wins would be useful but I imagine it should be simply to work one out once I know how to do the other – musefan Oct 16 '12 at 11:43

4 Answers4

4

This is an old question now, but I just had to solve the same problem myself, and thought people might be interested in a fully LINQ implementation of Rawling's LongestStreak extension method. This uses Aggregate with a seed and result selector to run through the list.

    public static int LongestStreak<TSource>(
        this IEnumerable<TSource> source,
        Func<TSource, bool> predicate)
    {
        return source.Aggregate(
            new {Longest = 0, Current = 0},
            (agg, element) => predicate(element) ? 
                new {Longest = Math.Max(agg.Longest, agg.Current + 1), Current = agg.Current + 1} : 
                new {agg.Longest, Current = 0},
            agg => agg.Longest);
    }
NickN
  • 61
  • 4
3

There's no out-of-the-box LINQ method to count streaks, so you'll need a custom LINQy method such as

public static int LongestStreak<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
    int longestStreak = 0;
    int currentStreak = 0;
    foreach (TSource s in source)
    {
        if (predicate(s))
            currentStreak++;
        else
        {
            if (currentStreak > longestStreak) longestStreak = currentStreak;
            currentStreak = 0;
        }
    }
    if (currentStreak > longestStreak) longestStreak = currentStreak;
    return longestStreak;
}

Then, to use this, first turn each "match result" into a pair of "team results".

var teamResults = matches.SelectMany(m => new[] {
        new {
            MatchDate = m.MatchDate,
            Team = m.TeamA,
            Won = m.TeamAGoals > m.TeamBGoals },
        new {
            MatchDate = m.MatchDate,
            Team = m.TeamB,
            Won = m.TeamBGoals > m.TeamAGoals }
    });

Group these by team.

var groupedResults = teamResults.GroupBy(r => r.Team);

Then calculate the streaks.

var streaks = groupedResults.Select(g => new
    {
        Team = g.Key,
        StreakLength = g
            // unnecessary if the matches were ordered originally
            .OrderBy(r => r.MatchDate)
            .LongestStreak(r => r.Won)
    });

If you want the longest streak only, use MoreLinq's MaxBy; if you want them all ordered, you can use OrderByDescending(s => s.StreakLength).

Alternatively, if you want to do this in one pass, and assuming matches is already ordered, using the following class

class StreakAggregator<TKey>
{
    public Dictionary<TKey, int> Best = new Dictionary<TKey, int>();
    public Dictionary<TKey, int> Current = new Dictionary<TKey, int>();

    public StreakAggregator<TKey> UpdateWith(TKey key, bool success)
    {
        int c = 0;
        Current.TryGetValue(key, out c);
        if (success)
        {
            Current[key] = c + 1;
        }
        else
        {
            int b = 0;
            Best.TryGetValue(key, out b);
            if (c > b)
            {
                Best[key] = c;
            }
            Current[key] = 0;
        }
        return this;
    }

    public StreakAggregator<TKey> Finalise()
    {
        foreach (TKey k in Current.Keys.ToArray())
        {
            UpdateWith(k, false);
        }
        return this;
    }
}

you can then do

var streaks = teamResults.Aggregate(
    new StreakAggregator<string>(),
    (a, r) => a.UpdateWith(r.Team, r.Won),
    (a)    => a.Finalise().Best.Select(kvp => 
        new { Team = kvp.Key, StreakLength = kvp.Value }));

and OrderBy or whatever as before.

Rawling
  • 49,248
  • 7
  • 89
  • 127
  • You could in theory do this in one pass (assuming the source is ordered) rather than this "group by, then iterate groups" approach, but I've chosen to write a more general "get longest streak" method and use that. – Rawling Oct 16 '12 at 11:17
  • Added a single-pass version, although it's more opaque. – Rawling Oct 16 '12 at 11:34
  • this is a great answer. I will get round to testing this at some point and will accept once I have done so (assuming nobody else beats you by then :P). It would appear that using this with Linq to SQL would not permit translation to SQL and it would be required to first query all the relevant results first (i.e. work with a list in memory)... but perhaps I am wrong? – musefan Oct 16 '12 at 11:51
  • Nah, no way will this translate to SQL, sadly. I think "streak" is a pretty tough concept in SQL - I wouldn't want to code this by hand. – Rawling Oct 16 '12 at 11:52
2

You can get all results of team with single query:

var results = from m in Matches
            let homeMatch = m.TeamA == teamName
            let awayMatch = m.TeamB == teamName
            let hasWon = (homeMatch && m.TeamAGoals > m.TeamBGoals) || 
                         (awayMatch && m.TeamBGoals > m.TeamAGoals)
            where homeMatch || awayMatch
            orderby m.MatchDate
            select hasWon;

Then just do simple calculation of longest streak:

int longestStreak = 0;
int currentStreak = 0;

foreach (var hasWon in results)
{
    if (hasWon)
    {
        currentStreak++;
        if (currentStreak > longestStreak)
            longestStreak = currentStreak;

        continue;
    }

    currentStreak = 0;
}

You can use it as is, extract to method, or create IEnumerable extension for calculating longest sequence in results.

Community
  • 1
  • 1
Sergey Berezovskiy
  • 232,247
  • 41
  • 429
  • 459
2

You could make use of string.Split. Something like this:

int longestStreak = 
    string.Concat(results.Select(r => (r.ours > r.theirs) ? "1" : "0"))
          .Split(new[] { '0' })
          .Max(s => s.Length);

Or, better, create a Split extension method for IEnumerable<T> to avoid the need to go via a string, like this:

public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> items, Predicate<T> p)
{
    while (true)
    {
        items = items.SkipWhile(i => !p(i));
        var trueItems = items.TakeWhile (i => p(i)).ToList();
        if (trueItems.Count > 0)
        {
            yield return trueItems;
            items = items.Skip(trueItems.Count);
        }
        else
        {
            break;
        }
    }   
}

You can then simply do this:

int longestStreak = results.Split(r => r.ours > r.theirs).Max(g => g.Count());
Matthew Strawbridge
  • 19,940
  • 10
  • 72
  • 93