19

First off, I know my title can be formulated better, but my math classes are so far gone I can't remember the correct words anymore..

I need to do something like this (pseudo c#)

int[] digits1 = new int[10]{0,1,2,3,4,5,6,7,8,9};
int[] digits2 = new int[10]{0,1,2,3,4,5,6,7,8,9};
int result = digits1*digits2

This would be the sum of the product of element[i] of each array.

This obviously doesn't work. Any suggestions towards either a better title or the solution?

EDIT clarification: I know I could loop them both and do the math. Basically I would think there is a better way to do this and I'm looking for it purely out of personal curiousity.

Boris Callens
  • 90,659
  • 85
  • 207
  • 305

6 Answers6

39

With LINQ:

int dotProduct = digits1.Zip(digits2, (d1, d2) => d1 * d2)
                        .Sum();

Zipwill produce a streaming sequence containing the products of corresponding elements from both arrays, which is then summed into an integer with Sum.

Note that this will not fail like it should when the arrays of unequal length, so you probably need to validate the input:

//null checks here

if(digits1.Length != digits2.Length)
   throw new ArgumentException("...");

EDIT: As Jeff M points out,Enumerable.Zipwas only added to the framework in .NET 4.0. In .NET 3.5, you can do this (the idea is only efficient for collections that expose fast indexers):

int dotProduct = Enumerable.Range(0, digits1.Length)
                           .Sum(i => digits1[i] * digits2[i]);

//from Jeff M's comment:
int dotProduct = digits1.Select((n, i) => n * digits2[i])
                        .Sum();
Ani
  • 111,048
  • 26
  • 262
  • 307
11

Solutions with LINQ

int[] digits1 = new int[10]{0,1,2,3,4,5,6,7,8,9};
int[] digits2 = new int[10]{0,1,2,3,4,5,6,7,8,9};

int result1 = digits1.Zip(digits2, (x, y) => x * y).Sum();

int result2 = digits1.Select((x, y) => x * digits2.ElementAt(y)).Sum();

int result3 = digits1.Select((n, i) => n * digits2[i]).Sum();

// Ani answer
int result4 = Enumerable.Range(0, digits1.Length)
    .Sum(i => digits1[i] * digits2[i]);

Performance test 100000 iterations:

Queries
Fn: Result 1       Ticks 135306
Fn: Result 2       Ticks 2470614
Fn: Result 3       Ticks 130034
Fn: Result 4       Ticks 123374

-------------

Fastest
Fn: Result 4       Ticks 123374
Fn: Result 3       Ticks 130034
Fn: Result 1       Ticks 135306
Fn: Result 2       Ticks 2470614
BrunoLM
  • 97,872
  • 84
  • 296
  • 452
9

I wrote a test bench to compare the these methods' times on my machine.

Specs:
Windows 7 Professional 64-bit
Intel Core 2 Quad Q9550 @ 2.83GHz
4x1GiB Corsair Dominator DDR2 1066 (PC2-8500)

using System;
using System.Linq;

namespace Testbench
{
    class Program
    {
        static void Main(string[] args)
        {
            var digits1 = Enumerable.Range(0, 500).ToArray();
            var digits2 = digits1.ToArray(); // create a copy

            Test("Regular Loop", () =>
            {
                int result = 0;
                for (int i = 0; i < digits1.Length; i++)
                {
                    result += digits1[i] * digits2[i];
                }
                return result;
            });

            // using LINQ
            Test("Enumerable \"Loop\"", () => Enumerable.Range(0, digits1.Length).Sum(i => digits1[i] * digits2[i]));
            Test("Using Zip", () => digits1.Zip(digits2, (x, y) => x * y).Sum());
            Test("Using Indexed Select", () => digits1.Select((n, i) => n * digits2[i]).Sum());
            Test("Using Indexed Select with ElementAt", () => digits1.Select((n, i) => n * digits2.ElementAt(i)).Sum());

            // using PLINQ
            Test("Parallel Enumerable \"Loop\"", () => ParallelEnumerable.Range(0, digits1.Length).Sum(i => digits1[i] * digits2[i]));
            Test("Using Parallel Zip", () => digits1.AsParallel().Zip(digits2.AsParallel(), (x, y) => x * y).Sum());
            Test("Using Parallel Indexed Select", () => digits1.AsParallel().Select((n, i) => n * digits2[i]).Sum());
            Test("Using Parallel Indexed Select with ElementAt", () => digits1.AsParallel().Select((n, i) => n * digits2.ElementAt(i)).Sum());

            Console.Write("Press any key to continue . . . ");
            Console.ReadKey(true);
            Console.WriteLine();
        }

        static void Test<T>(string testName, Func<T> test, int iterations = 1000000)
        {
            Console.WriteLine(testName);
            Console.WriteLine("Iterations: {0}", iterations);
            var results = Enumerable.Repeat(0, iterations).Select(i => new System.Diagnostics.Stopwatch()).ToList();
            var timer = System.Diagnostics.Stopwatch.StartNew();
            for (int i = 0; i < results.Count; i++)
            {
                results[i].Start();
                test();
                results[i].Stop();
            }
            timer.Stop();
            Console.WriteLine("Time(ms): {0,3}/{1,10}/{2,8} ({3,10})", results.Min(t => t.ElapsedMilliseconds), results.Average(t => t.ElapsedMilliseconds), results.Max(t => t.ElapsedMilliseconds), timer.ElapsedMilliseconds);
            Console.WriteLine("Ticks:    {0,3}/{1,10}/{2,8} ({3,10})", results.Min(t => t.ElapsedTicks), results.Average(t => t.ElapsedTicks), results.Max(t => t.ElapsedTicks), timer.ElapsedTicks);
            Console.WriteLine();
        }
    }
}

32-bit target:

Regular Loop
Iterations: 1000000
Time(ms):   0/         0/       0 (      1172)
Ticks:      3/  3.101365/     526 (   3244251)

Enumerable "Loop"
Iterations: 1000000
Time(ms):   0/   4.3E-05/      25 (      9054)
Ticks:     24/  24.93989/   69441 (  25052172)

Using Zip
Iterations: 1000000
Time(ms):   0/   2.4E-05/      16 (     16282)
Ticks:     41/ 44.941406/   45395 (  45052491)

Using Indexed Select
Iterations: 1000000
Time(ms):   0/   5.3E-05/      32 (     13473)
Ticks:     34/ 37.165088/   89602 (  37280177)

Using Indexed Select with ElementAt
Iterations: 1000000
Time(ms):   0/   1.5E-05/       6 (    160215)
Ticks:    405/443.154147/   17821 ( 443306156)

Parallel Enumerable "Loop"
Iterations: 1000000
Time(ms):   0/  0.000103/      29 (     17194)
Ticks:     38/ 47.412312/   81906 (  47576133)

Using Parallel Zip
Iterations: 1000000
Time(ms):   0/   9.4E-05/      19 (     21703)
Ticks:     49/ 59.859005/   53200 (  60051081)

Using Parallel Indexed Select
Iterations: 1000000
Time(ms):   0/  0.000114/      27 (     20579)
Ticks:     45/ 56.758491/   75455 (  56943627)

Using Parallel Indexed Select with ElementAt
Iterations: 1000000
Time(ms):   0/   8.1E-05/      19 (     61137)
Ticks:    144/ 168.97909/   53320 ( 169165086)

64-bit target:

Regular Loop
Iterations: 1000000
Time(ms):   0/         0/       0 (       506)
Ticks:      1/  1.254137/    1491 (   1401969)

Enumerable "Loop"
Iterations: 1000000
Time(ms):   0/   2.9E-05/      15 (     10118)
Ticks:     27/ 27.850086/   41954 (  27995994)

Using Zip
Iterations: 1000000
Time(ms):   0/   2.2E-05/      13 (     17089)
Ticks:     45/ 47.132834/   38506 (  47284608)

Using Indexed Select
Iterations: 1000000
Time(ms):   0/   3.1E-05/      12 (     14057)
Ticks:     37/ 38.740923/   33846 (  38897274)

Using Indexed Select with ElementAt
Iterations: 1000000
Time(ms):   0/   3.8E-05/      29 (    117412)
Ticks:    304/324.711279/   82726 ( 324872753)

Parallel Enumerable "Loop"
Iterations: 1000000
Time(ms):   0/   9.9E-05/      28 (     24234)
Ticks:     38/  66.79389/   77578 (  67054956)

Using Parallel Zip
Iterations: 1000000
Time(ms):   0/  0.000111/      24 (     30193)
Ticks:     46/ 83.264037/   69029 (  83542711)

Using Parallel Indexed Select
Iterations: 1000000
Time(ms):   0/   6.5E-05/      20 (     28417)
Ticks:     45/ 78.337831/   56354 (  78628396)

Using Parallel Indexed Select with ElementAt
Iterations: 1000000
Time(ms):   0/   9.2E-05/      16 (     65233)
Ticks:    112/180.154663/   44799 ( 180496754)
Jeff Mercado
  • 129,526
  • 32
  • 251
  • 272
  • Wow, I figured the loop would be faster, but not by that much. The LINQ methods are definitely shorter and arguable simpler to write, but create a lot of overhead. Thanks for running the test. – Alan Geleynse Oct 22 '10 at 02:27
  • Wouldn't the linq methods be faster in a multi threading scenario? – Boris Callens Oct 22 '10 at 13:01
  • 1
    @boris: No it wouldn't. While there are optimization opportunities, it will only use a single thread. If changed to use PLINQ, then yes definitely. I'll make modifications to include them and repost my results. – Jeff Mercado Oct 22 '10 at 17:30
  • @boris: I said "yes definitely," it does have some caveats. If the arrays were sufficiently large, then "yes definitely." Otherwise, it's unnecessary overhead and might not benefit. – Jeff Mercado Oct 22 '10 at 17:46
  • @boris: I'll be updating one more time. (I need to find a better way to measure the times) Restructuring the test function improved the times all around (due to an overhead I didn't anticipate). The numbers are more reasonable. – Jeff Mercado Oct 22 '10 at 20:22
  • Wow. Thanks for all the effort. It seems the regular for loop still is more interesting for my scenario. – Boris Callens Oct 23 '10 at 13:40
8

Very simply, do a loop.

int sum = 0;
for(int i = 0; i < digits1.length && i < digits2.length; i++)
{
    sum += digits1[i] * digits2[i];
}

Boom.

JoshD
  • 12,490
  • 3
  • 42
  • 53
4
int result = 0;
for(int i = 0; i < digits1.length; i++)
{
    result += digits1[i] * digits2[i];
}
Alan Geleynse
  • 24,821
  • 5
  • 46
  • 55
  • Well yes, but isn't there a better algorithm for this? – Boris Callens Oct 21 '10 at 22:09
  • 1
    I don't think so, you need to do at least n multiplications and n additions, and this does that. I believe it is the least amount of work possible. There may be ways to do it with less code, but the work the code does will be the same. – Alan Geleynse Oct 21 '10 at 22:10
  • Well looks like there is a better way to do it, its been too long since I have done C#. But I believe the runtime should be similar for Zip and the loop since the same work needs to be done. – Alan Geleynse Oct 21 '10 at 22:13
  • I agree with the above. The short, fancy ways of doing this work, and they're concise, but they may not be as clear to someone else reading the code later. – JoshD Oct 21 '10 at 22:15
  • Mathematically you are right, but I would think an average processor would have optimizations for working with arrays (matrices) that could be used by the .net framework. I know it is optimization circus, but I'm looking into this purely out of curiosity. – Boris Callens Oct 21 '10 at 22:17
  • 1
    Can someone run a performance test? @BrunoLM ran one for the LINQ solutions, but I would be interested to compare it to the loop and I do not have a C# environment available right now. – Alan Geleynse Oct 21 '10 at 22:20
  • @Alan: I ran a test. [Link](http://stackoverflow.com/questions/3992363/sum-of-products-of-two-arrays-dotproduct/3992840#3992840) – Jeff Mercado Oct 21 '10 at 23:35
  • Just because you *can* do it with LINQ doesn't mean you have to. I prefer loops because it is more readable in this case. In fact you can make a static method, or an extension method called `Dot` and that would make the code even _more_ readable. – John Alexiou Oct 26 '10 at 05:02
  • This **is** the best. Easily translatable into any other language and with best performance since linq gives you some overhead. – Bitterblue Mar 21 '14 at 10:52
0

Even faster is to unroll the loop

Test("Regular Loop", () =>
            {
                int result = 0;
                for (int i = 0; i < digits1.Length; i++)
                {
                    result += digits1[i] * digits2[i];
                }
                return result;
            });
            // This will fail if vectors are not a multiple of 4 in length.
            Test("Unroll 4x", () =>
            {
                int result = 0;
                for (int i = 0; i < digits1.Length; i+=4)
                {
                    result += digits1[i] * digits2[i];
                    result += digits1[i+1] * digits2[i+1];
                    result += digits1[i+2] * digits2[i+2];
                    result += digits1[i+3] * digits2[i+3];
                }
                return result;
            });

            Test("Dynamic unroll", () =>
            {
                int result = 0;
                int limit = (digits1.Length/8)*8;
                int reminderLimit = digits1.Length;

                if (digits1.Length >= 8)
                {
                    for (int i = 0; i < limit; i+=8)
                    {
                        result += digits1[i] * digits2[i];
                        result += digits1[i+1] * digits2[i+1];
                        result += digits1[i+2] * digits2[i+2];
                        result += digits1[i+3] * digits2[i+3];
                        result += digits1[i+4] * digits2[i+4];
                        result += digits1[i+5] * digits2[i+5];
                        result += digits1[i+6] * digits2[i+6];
                        result += digits1[i+7] * digits2[i+7];
                    }

                    reminderLimit = digits1.Length % 8;
                }

                switch(reminderLimit)
                {
                    case 7: {
                                result += digits1[limit] * digits2[limit];
                                result += digits1[limit+1] * digits2[limit+1];
                                result += digits1[limit+2] * digits2[limit+2];
                                result += digits1[limit+3] * digits2[limit+3];
                                result += digits1[limit+4] * digits2[limit+4];
                                result += digits1[limit+5] * digits2[limit+5];
                                result += digits1[limit+6] * digits2[limit+6];
                                break;
                            }

                    case 6: {
                                result += digits1[limit] * digits2[limit];
                                result += digits1[limit+1] * digits2[limit+1];
                                result += digits1[limit+2] * digits2[limit+2];
                                result += digits1[limit+3] * digits2[limit+3];
                                result += digits1[limit+4] * digits2[limit+4];
                                result += digits1[limit+5] * digits2[limit+5];      
                                break;
                            }

                    case 5: {
                                result += digits1[limit] * digits2[limit];
                                result += digits1[limit+1] * digits2[limit+1];
                                result += digits1[limit+2] * digits2[limit+2];
                                result += digits1[limit+3] * digits2[limit+3];
                                result += digits1[limit+4] * digits2[limit+4];
                                break;
                            }

                    case 4: {
                                result += digits1[limit] * digits2[limit];
                                result += digits1[limit+1] * digits2[limit+1];
                                result += digits1[limit+2] * digits2[limit+2];
                                result += digits1[limit+3] * digits2[limit+3];
                                break;
                            }

                    case 3: {
                                result += digits1[limit] * digits2[limit];
                                result += digits1[limit+1] * digits2[limit+1];
                                result += digits1[limit+2] * digits2[limit+2];
                                break;
                            }

                    case 2: {
                                result += digits1[limit] * digits2[limit];
                                result += digits1[limit+1] * digits2[limit+1];
                                break;
                            }

                    case 1: {
                                result += digits1[limit] * digits2[limit];
                                break;
                            }
                    default :
                            {
                                break;
                            }
                }

                return result;
            });

There is a huge difference is running time between debug and release mode of C# code, this is run with release mode:

Regular Loop
Iterations: 1000000
Time(ms):   0/         0/       0 (       596)
Ticks:      1/  2,071213/     455 (   2154248)

Unroll 4x
Iterations: 1000000
Time(ms):   0/     2E-06/       1 (       575)
Ticks:      1/  1,984301/    3876 (   2076105)

Dynamic unroll
Iterations: 1000000
Time(ms):   0/         0/       0 (       430)
Ticks:      1/    1,4635/    3228 (   1554830)

Debug mode:

Regular Loop
Iterations: 1000000
Time(ms):   0/     1E-06/       1 (      1296)
Ticks:      4/  4,529916/    3907 (   4678354)

Unroll 4x
Iterations: 1000000
Time(ms):   0/         0/       0 (       871)
Ticks:      2/  3,048466/     701 (   3145277)

Dynamic unroll
Iterations: 1000000
Time(ms):   0/         0/       0 (       819)
Ticks:      2/  2,858588/    1398 (   2957179)
  • Am I reading the results correctly that in Release the best case difference is 160ms for 1M iterations? Seems like a nice optimization for a super niche situation, but wouldn't recommend this as a go-to solution :) – Boris Callens Jan 02 '19 at 17:28