
Maybe let me state my situation in pseudo-C++-code first:

std::vector<double> sample(someFunctor f, double lower, double upper) {
    double t = (lower + upper)/2;
    double newval = f(t);
    std::vector<double> subsample1, subsample2;

    if (f(upper) - newval > epsilon)
        subsample1 = sample(f, t, upper);
    if (newval - f(lower) > epsilon)
        subsample2 = sample(f, lower, t);

    return concat(subsample2, newval, subsample1);
}

where concat just, well, concatenates the returned vectors. Basically, I am sampling a function so that the difference between any two consecutive saved function values is small.
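For reference, a minimal concat along the lines the pseudo-code assumes (name and signature are my guess) would itself allocate and copy on every call, which is exactly the cost I am worried about:

```cpp
#include <cassert>
#include <vector>

// A minimal concat matching the pseudo-code's usage (signature assumed):
// it allocates a fresh vector and copies both sub-samples plus the
// middle value on every call.
std::vector<double> concat(const std::vector<double>& left, double mid,
                           const std::vector<double>& right) {
    std::vector<double> out;
    out.reserve(left.size() + 1 + right.size());
    out.insert(out.end(), left.begin(), left.end());
    out.push_back(mid);
    out.insert(out.end(), right.begin(), right.end());
    return out;
}
```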

I am not happy with the way stated above as there seems to be quite a bit of memory allocation (allocate two subvectors, then concat those and another element) in every recursive step. That piece of code will have to run in a part of my algorithm that is critical with respect to performance. Once upper - lower is fairly small, evaluating f will not take a great amount of time.

So my questions:

  • Do you see a clever way to use the same data structure in all recursive calls and just fill the current parts of that vector? (Keeping in mind that the number of function evaluations needed is not known upfront). Thoughts on this:

    • Using a list instead of a vector. But I feel the memory overhead is not justified for just storing doubles.
    • Keep holes in the vector and maintain a second vector saying which entries are filled. At the end of a recursive call, shift the entries so that there are no holes between the subsamples and newval. But now I trade copying for shifting plus the bookkeeping for the second vector - probably a bad idea.
  • Do you see a way to get rid of the recursion totally? However, for correctness it is important that I use the divide-and-conquer pattern stated above. The function f makes heavy use of the upper and lower bounds and gains quite a big deal of performance by this.
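One way to remove the recursion while keeping the exact divide-and-conquer evaluation order is an explicit stack of tasks, sketched below. The f and epsilon here are hypothetical stand-ins, not the real function from my algorithm:

```cpp
#include <cassert>
#include <stack>
#include <vector>

// Hypothetical stand-ins for the question's f and epsilon.
static double f(double x) { return x * x / 250.0; }
static const double epsilon = 1.0;

// Each recursive call becomes a task on an explicit stack. An "emit"
// task sits between its two half-interval tasks, so the samples still
// come out from left to right, exactly as in the recursive version.
struct Task {
    double low, high;     // interval to refine (emit == false)
    double param, value;  // midpoint sample to record (emit == true)
    bool emit;
};

void sample(std::vector<double>& params, std::vector<double>& values,
            double lower, double upper) {
    std::stack<Task> tasks;
    tasks.push({lower, upper, 0.0, 0.0, false});
    while (!tasks.empty()) {
        Task t = tasks.top();
        tasks.pop();
        if (t.emit) {
            params.push_back(t.param);
            values.push_back(t.value);
            continue;
        }
        double mid = (t.low + t.high) / 2;
        double fmid = f(mid);
        // Push in reverse order (right, emit, left) so the LIFO stack
        // pops them left, emit, right.
        if (f(t.high) - fmid > epsilon)
            tasks.push({mid, t.high, 0.0, 0.0, false});
        tasks.push({0.0, 0.0, mid, fmid, true});
        if (fmid - f(t.low) > epsilon)
            tasks.push({t.low, mid, 0.0, 0.0, false});
    }
}
```

The tasks land on the stack in the same order the recursive calls would run, so the output vectors stay sorted by parameter.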

Thanks for your thoughts.


As per Space_C0wb0y's request let me try to rephrase my problem. Maybe the first explanation was not very clear.

I have some function (in a mathematical sense) that I want to sample (i.e. evaluate at certain points) in a given interval.

Suppose that interval is [0,100]. I know the function's values at 0 and at 100, say f(0) = 0 and f(100) = 40. Now I evaluate the function at the interval's midpoint, which is 50. Say my function returns f(50) = 10. As f(50) - f(0) <= 10, I do not need further samples in the interval [0,50]. However, I need further computations for the interval [50,100]. Thus, in the next (recursive) step I evaluate f(75). Now repeat this logic recursively.

At the end I would like to have two vectors giving me the function values together with the corresponding parameters, like this:

parameter  = vector(0, 50, 56.25, 62.5, 75, 100)
value      = vector(0, 10, 17.21, 25,   34, 40)

I am looking for the best (as in: most performant) approach to build these vectors recursively.

Hope this clarifies things.

Thilo
  • I do not really understand your problem. Can you give an example with input and expected output? – Björn Pollex Jun 15 '11 at 08:25
  • here "performance critical" really means time critical or space critical? Most times they're a contradiction. – Eric Z Jun 15 '11 at 08:54
  • @Space_C0wb0y: Added another approach to the question. Hopefully this helps. – Thilo Jun 15 '11 at 08:57
  • @Eric: Space is a minor concern as far as the space needed during the algorithm is concerned. The result should be as compact as possible, however. Time optimality is my major focus. – Thilo Jun 15 '11 at 08:59

3 Answers


Since space is not your major concern, I'll stick with the recursion.

1. Use copy by reference instead of copy by (return) value.

2. No need to pass in functor as it's constant.

3. It could be faster if low and high were integers. That depends on your requirements, though.

    // Thanks to Space_C0wb0y, here we avoid using a global vector
    // by passing the result vector by reference. This is efficient
    // as there is no copy overhead either.
    void sample(vector<double>& samples, double low, double high)
    {
        // You could use a shift here if low and high were integers.
        double mid = (low + high)/2;
        double fmid = f(mid);  // evaluate f only once per call

        // Since they're doubles, you need to prevent low and high
        // from getting too close. Otherwise, you'll probably see a
        // stack overflow. Consider this case:
        // f(x): 1 for 0<x<8;  x*x for x<=0 or x>=8
        // low = 1, high = 10, epsilon = 10
        if (high - low < 0.5)
        {
            samples.push_back(fmid);
            return;
        }

        // The order of the recursive calls guarantees that the
        // samples are produced from left to right.
        if (fmid - f(low) > epsilon)
        {
            sample(samples, low, mid);
        }

        samples.push_back(fmid);

        if (f(high) - fmid > epsilon)
        {
            sample(samples, mid, high);
        }
    }
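The question also asks for the parameters alongside the values. Under the same assumptions (f and epsilon below are hypothetical stand-ins), a self-contained variant of this scheme with a second vector might look like:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the f and epsilon the answer assumes.
static double f(double x) { return x * x / 250.0; }
static const double epsilon = 1.0;

// Same left-to-right scheme as above, extended to record the sampled
// parameter next to each value.
void sample(std::vector<double>& params, std::vector<double>& values,
            double low, double high) {
    double mid = (low + high) / 2;
    double fmid = f(mid);  // evaluate f only once per call

    // Stop when the interval collapses below the resolution limit.
    if (high - low < 0.5) {
        params.push_back(mid);
        values.push_back(fmid);
        return;
    }

    if (fmid - f(low) > epsilon)
        sample(params, values, low, mid);

    params.push_back(mid);
    values.push_back(fmid);

    if (f(high) - fmid > epsilon)
        sample(params, values, mid, high);
}
```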
Eric Z
  • Thanks, Eric! One of my missing links was the simple fact that the order is kept this way. That is basically the technique I have been looking for. Just out of interest: How would you have removed the recursion? – Thilo Jun 15 '11 at 09:42
  • The basic idea is to detect the largest gradient of f(x). This can be done using divide-and-conquer with a while loop. Based on that you can calculate what's the possible "step" of sampling. Afterward, you can iterate all the samples "step by step". – Eric Z Jun 15 '11 at 09:50
  • @Eric: I like the idea how you keep the order, but I do not like the idea of using a global vector. [Globals should usually be avoided](http://stackoverflow.com/questions/484635/). – Björn Pollex Jun 15 '11 at 09:52
  • @Eric: Ok, that would be an option in the case I described. However, the maths behind f force me to use divide-and-conquer for function evaluation. Using step-by-step I would lose the monotonicity of f, which would kill the whole algorithm :) – Thilo Jun 15 '11 at 10:06
  • @Space_C0wb0y: Yes, usually it is, especially on windows where writable globals will be copied to paging file. But if your software is not large-scale and well designed, e.g., thread safe, it's not that bad. It's more of a design issue than of a performance issue. – Eric Z Jun 15 '11 at 10:10
  • @Thilo: Pay attention to the floating point arithmetic. If you're not careful, you can easily get a "stack overflow". I've updated my algorithm. – Eric Z Jun 15 '11 at 10:12
  • @Space_C0wb0y: In this case, I can make a wrapper to avoid the use of a global vector anyway:) – Eric Z Jun 15 '11 at 10:18
  • @Eric: Thanks for the floating point input. Due to the nature of f I saw no problem there - from theory I have a few guarantees helping me out on that one. – Thilo Jun 16 '11 at 07:41

I would recommend the following approach:

  1. Don't use two vectors, but rather one vector with pairs or a custom struct to represent parameters and values:

    struct eval_point {
        double parameter;
        double value;
    };
    
    std::vector<eval_point> evaluated_points;
    
  2. Change your algorithm to write the results of the evaluations to an output iterator:

    template<class F, class output_iterator_type>
    void sample(F f, double lower, double upper,
                output_iterator_type out) {
        double t = (lower + upper)/2;
        eval_point point = { t, f(t) };

        // Emit the midpoint exactly once, then refine both halves.
        *out = point;
        ++out;

        if (f(upper) - point.value > epsilon) {
            sample(f, t, upper, out);
        }
        if (point.value - f(lower) > epsilon) {
            sample(f, lower, t, out);
        }
    }
    

    The above is a modification of your pseudo-code showing what it might look like when using an output iterator. It is not tested, so I am not sure if it is correct. In principle, you would call it like this:

    std::vector<eval_point> results;
    sample(someFunction, 0, 100, std::back_inserter(results));
    

    This way you will not have to create new vectors for each recursive call. If you can guess a reasonable lower bound for the number of samples, you can reserve capacity up front so that no reallocations are required:

    std::vector<eval_point> results;
    results.reserve(lower_bound_for_samples);
    sample(someFunction, 0, 100, std::back_inserter(results));


    (Writing through results.begin() instead would not work as-is, because each recursive call receives its own copy of the iterator and would overwrite the same positions.)
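Putting the pieces together, a self-contained sketch of this approach could look as follows. The f and epsilon are hypothetical stand-ins, and the point is emitted exactly once per call:

```cpp
#include <cassert>
#include <iterator>
#include <vector>

struct eval_point {
    double parameter;
    double value;
};

// Hypothetical stand-ins for the function and the tolerance.
static double f(double x) { return x * x / 250.0; }
static const double epsilon = 1.0;

// The output-iterator scheme from this answer. Note the emission
// order is midpoint-first, not left-to-right; sort afterwards if
// an ordered result is required.
template <class F, class OutputIterator>
void sample(F func, double lower, double upper, OutputIterator out) {
    double t = (lower + upper) / 2;
    eval_point point = { t, func(t) };
    *out = point;
    ++out;
    if (func(upper) - point.value > epsilon)
        sample(func, t, upper, out);
    if (point.value - func(lower) > epsilon)
        sample(func, lower, t, out);
}
```

Called as `sample(f, 0.0, 100.0, std::back_inserter(results))` on a `std::vector<eval_point> results`, every copy of the back-inserter appends to the same vector, so the recursion causes no allocations beyond the vector's own growth.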

Björn Pollex
  • Thanks for your input. Certainly an intriguing idea. This would mean giving up the ordering I assumed for the result vector(s). This, of course, could be solved by simply sorting the result afterwards. – Thilo Jun 15 '11 at 09:27
  • Ok, the order is no problem using Eric's idea, which could be applied here as well. In the end it will probably be something between this and Eric's solution. – Thilo Jun 15 '11 at 09:44
  • @Thilo: Eric's approach about keeping the ordering is good, but please do not follow his advice to use a global vector to store the results (see my comment on his answer). – Björn Pollex Jun 15 '11 at 09:54
  • @Space_C0wb0y: I wasn't going to use globals. Currently I am thinking about making a "sampling" class that encapsulates all that logic and a bit more. This would fit nicely in my design and allow me to use a member variable for that vector and also the information needed for functor. Any objections? – Thilo Jun 15 '11 at 10:02
  • @Thilo: That sound fine. Just making sure :) – Björn Pollex Jun 15 '11 at 10:10
  • Tough decision which answer to accept. I chose Eric's, as his point about the sorting really helped me out, however, I also greatly appreciated your help! – Thilo Jun 16 '11 at 07:42

I don't see why you rule out the list solution. In the worst case your list takes about three times the size of the raw data, which I think is still far less than what you get by creating a new vector on every call. You should try it out, as it does not require much change - the interfaces of the two containers are nearly the same.

  • I do not totally rule out the list. However, both ideas I had require quite a few memory allocations during the algorithm - which means quite a lot of work to do. The solution by Space_C0wb0y does not require additional mallocs. – Thilo Jun 15 '11 at 09:30