0

The C++ DoWork1 routine below calculates the total of every number between 0 and 1000. Is there any way we can dynamically create 4 versions of this with each version counting a different quarter range i.e. 0-(250-1), 250-(500-1), 500-(750-1) and 750-1000 ?

void DoWork1() {
    int n1 = 0;
    int p1 = 0;
    for (int p1 = 0; p1 <= 1000; p1++)
        n1 = n1 + p1;
}

This is in preparation for an introduction to thread work with the ultimate aim of dynamically splitting the above routine into the max number of available threads to achieve the fastest execution time. I'm trying to keep things as simple as possible hence this basic routine as a starter. I understand there is potential for concurrency issues if using the same variables - I can look more into this once I see if there's a way to accomplish the above (unless the concurrency issue can be relatively easily overcome in this example).

Coral Kashri
  • 3,436
  • 2
  • 10
  • 22
Jim
  • 423
  • 1
  • 4
  • 13
  • Side note: The fastest execution time for the stated task would be to forget threads and looping and apply Gauss's summation trick. – user4581301 Jul 10 '20 at 23:14
  • 1
    To get what you want, have the function take a range and sum the into a variable that is not shared with other threads, then sum the sums once the threads are done. If you try to share the summation variable you'll either get garbage as the threads stomp over one another's results or find you have worse than serial results because the threads are constantly blocking to get access to the shared variable. – user4581301 Jul 10 '20 at 23:18
  • Many thanks for the reply. I'm more looking into seeing if this routine could be dynamically split - I don't know if C++ can do this...maybe fit into a dynamic lambda? but I'm lacking in experience here. I'm very interested in thread work and was hoping this would be a nice simple minimal suitable for working on. – Jim Jul 10 '20 at 23:18
  • You can do it dynamically. You'll have to provide the code that does the task division, though. – user4581301 Jul 10 '20 at 23:20
  • 1
    Note: don't try and tack on concurrency protection after the fact. If you don't bake it in from the beginning you'll never get the code off the ground. – user4581301 Jul 10 '20 at 23:21
  • Ah yes....DoWork1(range) !! - thank you, that'll give me something to work with. Cheers. – Jim Jul 10 '20 at 23:22

1 Answers1

4

Here's a simple solution based on std::thread. No doubt you can adapt this to your needs:

#include <iostream>
#include <thread>
#include <functional>

void sum (int from, int to, int &total_out)
{
    int total = 0;
    for (int i = from; i < to; ++i)
        total += i;
    total_out = total;
}

int main() {
    int sum1, sum2, sum3, sum4;
    std::thread t1 (sum, 0, 250, std::ref (sum1));
    std::thread t2 (sum, 250, 500, std::ref (sum2));
    std::thread t3 (sum, 500, 750, std::ref (sum3));
    std::thread t4 (sum, 750, 1000, std::ref (sum4));
    t1.join ();
    t2.join ();
    t3.join ();
    t4.join ();
    std::cout << sum1 << ", " << sum2 << ", " << sum3 << ", " << sum4;
}

An alternative mechanism to return the result of a thread is to use std::future, as discussed here.

Paul Sanders
  • 24,133
  • 4
  • 26
  • 48
  • Huge thanks for this Paul and the valuable link. I really like the way external references to sum1 - sum4 are used to avoid concurrency problems with the running total. This will give the old grey matter something to work on as I look to do the same with the thread creation as you did with the sum routine in order to dynamically expand the number of threads created (and ranges) according to the number of threads available. – Jim Jul 11 '20 at 00:02
  • 1
    @jim If you don't know how many threads you'll have `std::vector` and `std::vector` for the sums. HUGE CAVEAT: allocate storage in the `vector`s first. If you `push_back` and the `vector`s resize, the references to the sums will be invalidated and the threads will be writing sums to invalid memory. Small caveat: Usually there isn't much point to having more threads than processor cores. Instead of a variable number of threads, the number of threads is usually fixed to the number of cores--a thread pool--and jobs are sent to the thread pool to organize. – user4581301 Jul 11 '20 at 15:53