7

Often I have written: {Min@#, Max@#} &

Yet this seems inefficient, as the expression must be scanned twice, once to find the minimum value, and once to find the maximum value. Is there a faster way to do this? The expression is often a tensor or array.

Mr.Wizard
  • 24,179
  • 5
  • 44
  • 125
  • @Mr. I wonder if you care to add an edit to your questions with a plot of the timings for each answer in Mma 7. Post the code and then I'll run it in Mma 8 to add that results too. Agree? – Dr. belisarius May 04 '11 at 14:41
  • @belisarius I am rather sleepy at the moment, but I don't understand. – Mr.Wizard May 04 '11 at 14:43
  • @Mr. Don't worry, me too. Perhaps just a silly thing .. – Dr. belisarius May 04 '11 at 14:56
  • @belisarius, I added timings to my answer below. – rcollyer May 04 '11 at 17:35
  • 1
    @Mr. You probably already know about [this](http://bmia.bmt.tue.nl/Software/Mathematica/Tutorials/TipsTricks/FastMaxMinPosition.html). I haven't checked it out, just stumbled across the site. – 681234 May 17 '11 at 08:58
  • @TomD I am, but I haven't seen that site before, and it is useful information for anyone finding this question with a search. Thanks. – Mr.Wizard May 17 '11 at 09:14

4 Answers4

5

This beats it by a bit.

minMax = Compile[{{list, _Integer, 1}},
   Module[{currentMin, currentMax},
    currentMin = currentMax = First[list];
    Do[
     Which[
      x < currentMin, currentMin = x,
      x > currentMax, currentMax = x],
     {x, list}];
    {currentMin, currentMax}],
   CompilationTarget -> "C",
   RuntimeOptions -> "Speed"];
v = RandomInteger[{0, 1000000000}, {10000000}];
minMax[v] // Timing

I think it's a little faster than Leonid's version because Do is a bit faster than For, even in compiled code.

Ultimately, though, this is an example of the kind of performance hit you take when using a high level, functional programming language.

Addition in response to Leonid:

I don't think that the algorithm can account for all the time difference. Much more often than not, I think, both tests will be applied anyway. The difference between Do and For is measurable, however.

cf1 = Compile[{{list, _Real, 1}},
   Module[{sum},
    sum = 0.0;
    Do[sum = sum + list[[i]]^2,
     {i, Length[list]}];
    sum]];
cf2 = Compile[{{list, _Real, 1}},
   Module[{sum, i},
    sum = 0.0;
    For[i = 1, i <= Length[list],
     i = i + 1, sum = sum + list[[i]]^2];
    sum]];
v = RandomReal[{0, 1}, {10000000}];
First /@{Timing[cf1[v]], Timing[cf2[v]]}

{0.685562, 0.898232}
Mark McClure
  • 4,862
  • 21
  • 34
  • Pretty cool! It is about twice faster than my fastest solution. The main reason though is that you were able to reduce the number or tests about twice, since you actually use better algorithm - you use explicitly the fact that at any given element, only one of the two tests can be effective - +1. – Leonid Shifrin May 04 '11 at 13:30
  • I may have to start specifying "For Mathematica 7" in all my posts. I will leave it to others to vote for this, because I cannot test it. That doesn't mean I am not grateful for your efforts. – Mr.Wizard May 04 '11 at 13:32
  • Mark, congratulations on 1k rep. Can you direct me to information on *why* this kind of thing is not or can not be optimized at run time? – Mr.Wizard May 04 '11 at 13:37
  • @Mr.Wizard I think that for JIT to be generally effective, Mathematica would have to be more strongly typed. Certain JIT-like features like auto-compilation are present in mma though. – Leonid Shifrin May 04 '11 at 14:09
  • @Mr.Wizard Even in Mathematica 7 you can still use `Compile` with Mark's code, just drop all the options. – Sasha May 04 '11 at 14:31
  • 1
    @Sasha If you do that in M7, Mark's code is 10 times slower than the Min-Max code, so this is hardly an option. – Leonid Shifrin May 04 '11 at 14:37
  • @Mark, the whole fact that you use `Which` means that you skip the second case entirely every time when the first case condition is satisfied. My code always tests the second condition. It indeed can not account for all the difference, but I still think that your algorithm is substantially better. Of course, the difference between `Do` and `For` is significant, no question about it. – Leonid Shifrin May 04 '11 at 16:24
  • 1
    @Mr. Wizard You certainly can call `Max` and `Min` in compiled functions and this is nice, if you have procedural code that uses `Max` or `Min` that needs to be optimized. There's generally no reason to `Compile` a call to a single built in function, though, since that function is likely already optimized. Also, optimization will not fundamentally alter your algorithm so your original code will still take twice as long as a single call to `Max` or `Min`. – Mark McClure May 04 '11 at 16:27
  • I am giving this the check mark, even though I cannot use it, because I originally forgot to specify "for Mathematica 7" in my question. – Mr.Wizard May 10 '11 at 11:13
3

I think this is as fast as it gets, within Mathematica programming practices. The only way I see to attempt to make it faster within mma is to use Compile with the C compilation target, as follows:

getMinMax = 
 Compile[{{lst, _Real, 1}},
   Module[{i = 1, min = 0., max = 0.},
       For[i = 1, i <= Length[lst], i++,
        If[min > lst[[i]], min = lst[[i]]];
        If[max < lst[[i]], max = lst[[i]]];];
        {min, max}], CompilationTarget -> "C", RuntimeOptions -> "Speed"]

However, even this seems to be somewhat slower than your code:

In[61]:= tst = RandomReal[{-10^7,10^7},10^6];

In[62]:= Do[getMinMax[tst],{100}]//Timing
Out[62]= {0.547,Null}

In[63]:= Do[{Min@#,Max@#}&[tst],{100}]//Timing
Out[63]= {0.484,Null}

You probably can write the function entirely in C, and then compile and load as dll - you may eliminate some overhead this way, but I doubt that you will win more than a few percents - not worth the effort IMO.

EDIT

It is interesting that you may significantly increase the speed of the compiled solution with manual common subexpression elimination (lst[[i]] here):

getMinMax = 
 Compile[{{lst, _Real, 1}},
   Module[{i = 1, min = 0., max = 0., temp},
     While[i <= Length[lst],
       temp = lst[[i++]];
       If[min > temp, min = temp];
       If[max < temp, max = temp];];
       {min, max}], CompilationTarget -> "C", RuntimeOptions -> "Speed"]

is a little faster than {Min@#,Max@#}.

Leonid Shifrin
  • 22,449
  • 4
  • 68
  • 100
  • Especially not worth the effort if you use Mma7! ;-p Leonid, you surely understand the mechanics of Mathematica better than I do. Can you (in a few words...) tell me why something like `{Min@#, Max@#}&` cannot be internally "JIT" optimized? That is, why Mathematica is not or cannot be built this way? – Mr.Wizard May 04 '11 at 13:06
  • 1
    I think such optimizations are hard to make generally reliable, for more complex functions, symbolic arguments, etc. Also, it is somewhat orthogonal to the idiomatic mma programming, since by doing so, you break the high-level thinking process (declarative programming style) by specifying details of how this should be computed. I think that, unless you gain an order of magnitude speed-up or so, such optimizations do not pay off since you lose the advantage of high-level thinking (abstractions, readability, screen space), which is often more important, especially for larger projects. – Leonid Shifrin May 04 '11 at 13:14
3

For an array, you could do the simplest functional thing and use

Fold[{Min[##],Max[##]}&, First@#, Rest@#]& @ data

Unfortunately, it is no speed demon. Even for short lists, 5 elements, all of Leonid's answers and Mark's answer are at least 7 times faster, uncompiled in v.7. For long lists, 25000 elements, this gets worse with Mark's being 19.6 times faster, yet even at this length this solution took only about 0.05 secs to run.

However, I'd not count out {Min[#], Max[#]}& as an option. Uncompiled it was 1.7 times faster than Mark's for short list and nearly 15 times faster for long lists (8 times and nearly 300 times faster, respectively, than the Fold solution).

Unfortunately, I could not get good numbers for the compiled versions of either {Min[#], Max[#]}&, Leonid's, or Mark's answers, instead I got numerous incomprehensible error messages. In fact, {Min[#], Max[#]}& increased in execution time. The Fold solution improved dramatically, though, and took twice as long as Leonid's answers' uncompiled times.

Edit: for the curious, here's some timing measurements of the uncompiled functions -

timing measurements on a log-log plot

Each function was used on 100 lists of the length specified on the horizontal axis and the average time, in seconds, is the vertical axis. In ascending order of time, the curves are {Min[#], Max[#]}&, Mark's answer, Leonid's second answer, Leonid's first answer, and the Fold method from above.

Community
  • 1
  • 1
rcollyer
  • 10,475
  • 4
  • 48
  • 75
2

For all of you who are doing timings I'd like to warn you that order of execution is extremely important. For instance, look at the following two subtly different timings tests:

(1)

res =
 Table[
  a = RandomReal[{0, 100}, 10^8];
  {
   Min[a] // AbsoluteTiming // First, Max[a] // AbsoluteTiming // First,
   Max[a] // AbsoluteTiming // First, Min[a] // AbsoluteTiming // First
   }
  , {100}
  ]

enter image description here
The odd man out here is the last Min

(2)

res =
 Table[
  a = RandomReal[{0, 100}, 10^8];
  {
   Max[a] // AbsoluteTiming // First, Min[a] // AbsoluteTiming // First,
   Min[a] // AbsoluteTiming // First, Max[a] // AbsoluteTiming // First
   }
  , {100}
  ]

enter image description here

Here, the highest timing is found for the first Max, the second-highest for the second max and the two Mins are about the same and lowest. Actually, I'd expect Max and Min to take about the same time,but they don't. The former seems to take 50% more time than the latter. Having the pole position also seems to come with a 50% handicap.

Now a comparison with the algorithms given by Mark and Leonid:

res =
 Table[
  a = RandomReal[{0, 100}, 10^8];
  {
   {Max[a], Min[a]} // AbsoluteTiming // First,
   {Min@#, Max@#} &@a // AbsoluteTiming // First,
   getMinMax[a] // AbsoluteTiming // First,
   minMax[a] // AbsoluteTiming // First,
   {Min[a], Max[a]} // AbsoluteTiming // First
   }
  , {100}
  ]

enter image description here

Here we find about .3 s for the {Max[a], Min[a]} (which includes the pole position handicap), the .1 level is for Mark's method; the others are all about the same.

Sjoerd C. de Vries
  • 16,122
  • 3
  • 42
  • 94