Is there a fast product operation for PackedArrays?

Question

In Mathematica a vector (or rectangular array) containing all machine size integers or floats may be stored in a packed array. These objects take less memory, and some operations are much faster on them.

RandomReal produces a packed array when possible. A packed array can be unpacked with the Developer function FromPackedArray

Consider these timings

lst = RandomReal[1, 5000000];

Total[lst] // Timing
Plus @@ lst // Timing

lst = Developer`FromPackedArray[lst];

Total[lst] // Timing
Plus @@ lst // Timing

Out[1]= {0.016, 2.50056*10^6}

Out[2]= {0.859, 2.50056*10^6}

Out[3]= {0.625, 2.50056*10^6}

Out[4]= {0.64, 2.50056*10^6}

Therefore, in the case of a packed array, Total is many times faster than Plus @@ but about the same for a non-packed array. Note that Plus @@ is actually a little slower on the packed array.

Now consider

lst = RandomReal[100, 5000000];
Times @@ lst // Timing

lst = Developer`FromPackedArray[lst];
Times @@ lst // Timing

Out[1]= {0.875, 5.8324791357*10^7828854}

Out[1]= {0.625, 5.8324791357*10^7828854}

Finally, my question: is there a fast method in Mathematica for the list product of a packed array, analogous to Total?

I suspect that this may not be possible because of the way that numerical errors compound with multiplication. Also, the function will need to be able to return non-machine floats to be useful.

score 9 · Accepted Answer · edited May 23 '17 at 12:04

9

I've also wondered if there was a multiplicative equivalent to Total.

A ~~really~~ not so bad solution is

In[1]:= lst=RandomReal[2,5000000];
        Times@@lst//Timing
        Exp[Total[Log[lst]]]//Timing
Out[2]= {2.54,4.370467929041*10^-666614}
Out[3]= {0.47,4.370467940*10^-666614}

As long as the numbers are positive and aren't too big or small, then the rounding errors aren't too bad. A guess as to what might be happening during evaluation is that: (1) Provided the numbers are positive floats, the Log operation can be quickly applied to the packed array. (2) The numbers can then be quickly added using Total's packed array method. (3) Then it's only the final step where a non-machine sized float need arise.

See this SO answer for a solution that works for both positive and negative floats.

Let's quickly check that this solution works with floats that yield a non-machine sized answer. Compare with Andrew's (much faster) compiledListProduct:

In[10]:= compiledListProduct = 
          Compile[{{l, _Real, 1}}, 
           Module[{tot = 1.}, Do[tot *= x, {x, l}]; tot], 
           CompilationTarget -> "C"]

In[11]:= lst=RandomReal[{0.05,.10},15000000];
         Times@@lst//Timing
         Exp[Total[Log[lst]]]//Timing
         compiledListProduct[lst]//Timing
Out[12]= {7.49,2.49105025389*10^-16998863}
Out[13]= {0.5,2.4910349*10^-16998863}
Out[14]= {0.07,0.}

If you choose larger (>1) reals, then compiledListProduct will yield the warning CompiledFunction::cfne: Numerical error encountered; proceeding with uncompiled evaluation. and will take some time to give a result...

One curio is that both Sum and Product can take arbitrary lists. Sum works fine

In[4]:= lst=RandomReal[2,5000000];
         Sum[i,{i,lst}]//Timing
         Total[lst]//Timing
Out[5]= {0.58,5.00039*10^6}
Out[6]= {0.02,5.00039*10^6}

but for long PackedArrays, such as the test examples here, Product fails since the automatically compiled code (in version 8.0) does not catch underflows/overflows properly:

In[7]:= lst=RandomReal[2,5000000];
         Product[i,{i,lst}]//Timing
         Times@@lst//Timing
Out[8]= {0.,Compile`AutoVar12!}
Out[9]= {2.52,1.781498881673*10^-666005}

The work around supplied by the helpful WRI tech support is to turn off the product compilation using SetSystemOptions["CompileOptions" -> {"ProductCompileLength" -> Infinity}]. Another option is to use lst=Developer`FromPackedArray[lst].

edited May 23 '17 at 12:04

Community

1
1

answered Mar 14 '11 at 05:05

Simon

14,631
4
41
101

Your "really bad solution" may not have good precision, but it **is** much faster, and really may be the best that is possible given the nature of multiplication. If nothing better is posted, this gets my checkmark. +1 Also, on my system (Win32, Mma7) it is about 40X faster than `Times @@` on your example. – Mr.Wizard Mar 14 '11 at 05:36
3

Good idea. The `Log` lets you keep almost all the calculations in machine precision. – Andrew Moylan Mar 14 '11 at 05:44
@Andrew: If only I could claim that was the reason I used `Log`... When I supplied the answer my original motivation was just to turn the product into a sum, I hadn't even thought about machine vs arbitrary precision arithmetic. – Simon Mar 14 '11 at 06:16
2

ProdTotal[ ] is scheduled to appear in version 8.π – Dr. belisarius Mar 14 '11 at 06:34
@belisarius 8.π ? If that is mathematician's humor, it's over my head. – Mr.Wizard Mar 14 '11 at 07:15
1

@Mr. Ohh well, my sense of humor is asymptotically approaching idiocy as years go by. – Dr. belisarius Mar 14 '11 at 07:20
@belisarius, no, _that_ was funny. :-D – Mr.Wizard Mar 14 '11 at 07:24
+1, turning multiplication into addition. simple and effective. – rcollyer Mar 14 '11 at 13:10
By the way, Product[i, {i,lst}] does not fail for me, it is merely slow. – Mr.Wizard Mar 15 '11 at 07:00
@Mr.Wizard: What values did you use? My Mma-V8 only fails for large lists that yield a non-machine valued result... My Mma-V7 install works fine. Eg compare `Product[i, {i, RandomReal[1, 50000]}]` with `Product[i, {i, RandomReal[{.99, 1.01}, 50000]}]` and `Product[i, {i, RandomReal[10, 50000]}]` - only the middle one works for me. – Simon Mar 15 '11 at 07:32
@Mr.Wizard: OK, I've submitted a bug report. WRI can sort out what's going on! – Simon Mar 15 '11 at 07:54

score 4 · Answer 2 · answered Mar 14 '11 at 05:30

First, to avoid confusion, take a look at an example whose results are all representable as hardware machine precision numbers, which must all be less than

In[1]:= $MaxMachineNumber

Out[1]= 1.79769*10^308

Your Total example already had this nice (and fast) property. Here is a variant on your Times example using machine numbers:

In[2]:= lst = RandomReal[{0.99, 1.01}, 5000000];
Times @@ lst // Timing

Out[3]= {1.435, 1.38851*10^-38}

Now we can use Compile to make a compiled function to perform this operation efficiently:

In[4]:= listproduct = 
 Compile[{{l, _Real, 1}}, 
  Module[{tot = 1.}, Do[tot *= x, {x, l}]; tot]]

Out[4]= CompiledFunction[{l},Module[{tot=1.},Do[tot*=x,{x,l}];tot],-CompiledCode-]

It's much faster:

In[5]:= listproduct[lst] // Timing

Out[5]= {0.141, 1.38851*10^-38}

Assuming you have a C compiler and Mathematica 8, you can also automatically compile all the way to C code. A temporary DLL is created and linked back into Mathematica at run-time.

In[6]:= compiledlistproduct = 
 Compile[{{l, _Real, 1}}, 
  Module[{tot = 1.}, Do[tot *= x, {x, l}]; tot], 
  CompilationTarget -> "C"]

Out[6]= CompiledFunction[{l},Module[{tot=1.},Do[tot*=x,{x,l}];tot],-CompiledCode-]

This gives performance not much different to that which a built-in Mathematica function would have:

In[7]:= compiledlistproduct[lst] // Timing

Out[7]= {0.015, 1.38851*10^-38}

Note that if your product really will go beyond $MaxMachineNumber (or $MinMachineNumber), then you are better off sticking with Apply[Times, list]. The same comment applies to Total, if your results can get that big:

In[11]:= lst = RandomReal[10^305, 5000000];
Plus @@ lst // Timing

Out[12]= {1.435, 2.499873364498981*10^311}

In[13]:= lst = RandomReal[10^305, 5000000];
Total[lst] // Timing

Out[14]= {1.576, 2.500061580905602*10^311}

Andrew, not to sound ungrateful, but I was already aware of Compile, which is why I stated: "the function will need to be able to return non-machine floats to be useful." Is a faster function than `Times @@` in my original example possible, or is that just the nature of the numbers? — Mr.Wizard, Mar 14 '11 at 05:32
Ah my mistake, I missed that detail. Nothing simple jumps to mind so far that will be better than `Times @@`, and the same problem applies to Total (although you may rarely get the numbers so large there). — Andrew Moylan, Mar 14 '11 at 05:36
This is still a fine answer for anyone reading this question. — Mr.Wizard, Mar 14 '11 at 06:03
+1, for all you guys at Wolfram for the new feature of compiling to machine code and linking back in via a temp dll. Just need to get v. 8 now ... As an aside, can the DLL be saved, i.e. could I compile a package, save the dll, and reuse it later? — rcollyer, Mar 14 '11 at 13:09
Yes, you would create a DLL with LibraryGenerate (http://reference.wolfram.com/mathematica/CCodeGenerator/ref/LibraryGenerate.html), and load it again with http://reference.wolfram.com/mathematica/ref/LibraryFunctionLoad.html. Note that, just as with Compile, it is generally only the "C-like" pieces of Mathematica that will compile to fast code. — Andrew Moylan, Mar 15 '11 at 01:41

score 3 · Answer 3 · edited May 23 '17 at 12:19

Simon's method is fast, but it fails on negative values. Combining it with his answer to my other question, here is a fast solution that handles negatives. Thanks, Simon.

Function

f = (-1)^(-1 /. Rule @@@ Tally@Sign@# /. -1 -> 0) * Exp@Total@Log@Abs@# &;

Testing

lst = RandomReal[{-50, 50}, 5000000];

Times @@ lst // Timing
f@lst // Timing

lst = Developer`FromPackedArray[lst];

Times @@ lst // Timing
f@lst // Timing

{0.844, -4.42943661963*10^6323240}

{0.062, -4.4294366*10^6323240}

{0.64, -4.42943661963*10^6323240}

{0.203, -4.4294366*10^6323240}

Is there a fast product operation for PackedArrays?

3 Answers3

Function

Testing

Linked