239

What scenarios would warrant the use of the "Map and Reduce" algorithm?


Is there a .NET implementation of this algorithm?

Developer
  • 17,809
  • 26
  • 66
  • 92

3 Answers3

489

Linq equivalents of Map and Reduce: If you’re lucky enough to have linq then you don’t need to write your own map and reduce functions. C# 3.5 and Linq already has it albeit under different names.

  • Map is Select:

    Enumerable.Range(1, 10).Select(x => x + 2);
    
  • Reduce is Aggregate:

    Enumerable.Range(1, 10).Aggregate(0, (acc, x) => acc + x);
    
  • Filter is Where:

    Enumerable.Range(1, 10).Where(x => x % 2 == 0);
    

https://www.justinshield.com/2011/06/mapreduce-in-c/

jpmc26
  • 28,463
  • 14
  • 94
  • 146
Tony
  • 16,527
  • 15
  • 80
  • 134
  • 1
    The translation is correct but it misses a key point. The shuffle step in map reduce is critical in map-reduce but doesn't show up in the name an one does not have to write any code for it. It is solely driven by the Key that is extracted in the map step. Joel Martinez answer highlights that in my opinion better. – xtofs May 30 '17 at 15:58
  • 56
    **Why oh why** don't they just call it `Reduce` instead of `Aggregate`... MS just likes to annoy programmers – John Henckel Mar 28 '19 at 20:27
  • 45
    @JohnHenckel, I'm definetely not an authoritative source, but I'm pretty sure this comes from SQL. I believe linq was original bought in as a way of making interacting with sql easier in C#. When you're naming functions in that world, aggregate starts to sound a bit more familiar than "reduce" when compared with things like Select and Group By. I'm not saying it's right, it annoys me to no ends but I imagine that's the reason for it. – Elliot Blackburn May 17 '19 at 11:49
  • 26
    @ElliotBlackburn +1, but not only that, I'm pretty sure there was no such function as "reduce" in the standard of JavaScript back when LINQ was first introduced (2007). https://caniuse.com/?search=reduce So we'd better complain about the authors of JavaScript choosing "reduce" over "aggregate" – Kostiantyn Ko Jan 31 '21 at 18:41
  • 8
    @KostiantynKolesnichenko the concept of map / reduce functions and programming model pre-date JavaScript by a long shot. I'm struggling to find a canonical source but they've been in functional programming for many many decades now. So, for once it's not JavaScript's fault and it's actually more standard than C#! – Elliot Blackburn Feb 06 '21 at 12:45
  • If that's really the case then I totally agree the LINQ designers made a terrible mistake and now we have this yet-another-but-different way of doing the same thing, pity :-( probably they just wanted to be unique snowflakes, or they envisioned C# as the bright and only future of programming – Kostiantyn Ko Feb 10 '21 at 17:36
  • For simple total Aggregate, use Sum – Peter Kerr Apr 27 '21 at 08:55
  • Both Reduce and Aggregate are not great name choices. There are many use cases for javascript reduce that are better described using other names. – Ryan Haney May 28 '21 at 06:22
  • @KostiantynKo Please read a very good article that every programmer should read: https://www.joelonsoftware.com/2006/08/01/can-your-programming-language-do-this/ – zumalifeguard May 09 '22 at 02:09
  • 1
    Before taking a jab at the name Aggregate, you may want to do a little research into how much research and development went into creating LINQ, and what the goals of LINQ are. First off, it stands for Language INtegrated Query. So the entire purpose of LINQ was to give programming languages the same flexibility as SQL, and to give it the same feel. The goal was not to introduce functional programming, though, functional programming was used to implement these goals. In SQL there is a function called AGGREGATE(), but there are also SUM() AVG() MIN() MAX() - and those are also available in LINQ – Gregor Jun 25 '22 at 03:37
  • Well, it's Microsoft so backward compatibility trumps progress, I guess. The problem that I have with `Aggregate` is that it doesn't aggregate anything like in `Enumerable.Range(1, 10).Aggregate(0, (x, y) => x > y ? x : y);`. – Michael Jul 04 '23 at 23:25
  • They should at least create aliases for these called `.Map()` and `.Reduce()`, since that's what people are used to in many other languages – Anomaly Jul 07 '23 at 14:48
30

The classes of problem that are well suited for a mapreduce style solution are problems of aggregation. Of extracting data from a dataset. In C#, one could take advantage of LINQ to program in this style.

From the following article: http://codecube.net/2009/02/mapreduce-in-c-using-linq/

the GroupBy method is acting as the map, while the Select method does the job of reducing the intermediate results into the final list of results.

var wordOccurrences = words
                .GroupBy(w => w)
                .Select(intermediate => new
                {
                    Word = intermediate.Key,
                    Frequency = intermediate.Sum(w => 1)
                })
                .Where(w => w.Frequency > 10)
                .OrderBy(w => w.Frequency);

For the distributed portion, you could check out DryadLINQ: http://research.microsoft.com/en-us/projects/dryadlinq/default.aspx

Joel Martinez
  • 46,929
  • 26
  • 130
  • 185
13

Since I never can remember that LINQ calls it Where, Select and Aggregate instead of Filter, Map and Reduce so I created a few extension methods you can use:

IEnumerable<string> myStrings = new List<string>() { "1", "2", "3", "4", "5" };
IEnumerable<int> convertedToInts = myStrings.Map(s => int.Parse(s));
IEnumerable<int> filteredInts = convertedToInts.Filter(i => i <= 3); // Keep 1,2,3
int sumOfAllInts = filteredInts.Reduce((sum, i) => sum + i); // Sum up all ints
Assert.Equal(6, sumOfAllInts); // 1+2+3 is 6

Here are the 3 methods (from https://github.com/cs-util-com/cscore/blob/master/CsCore/PlainNetClassLib/src/Plugins/CsCore/com/csutil/collections/IEnumerableExtensions.cs ):

public static IEnumerable<R> Map<T, R>(this IEnumerable<T> self, Func<T, R> selector) {
    return self.Select(selector);
}

public static T Reduce<T>(this IEnumerable<T> self, Func<T, T, T> func) {
    return self.Aggregate(func);
}

public static IEnumerable<T> Filter<T>(this IEnumerable<T> self, Func<T, bool> predicate) {
    return self.Where(predicate);
}

Some more details from https://github.com/cs-util-com/cscore#ienumerable-extensions :

enter image description here

CsUtil.com
  • 153
  • 1
  • 7