0

Can iterating over unsorted data structure like array, tree with multiple thread make it faster?

For example I have big array with unsorted data.

int array[1000];

I'm searching array[i] == 8 Can running:

Thread 1:

for(auto i = 0; i < 500; i++)
{
  if(array[i] == 8)
      std::cout << "found" << std::endl;
}

Thread 2:

for(auto i = 500; i < 1000; i++)
{
  if(array[i] == 8)
      std::cout << "found" << std::endl;
}

be faster than normal iteration?

@update

I've written simple test witch describe problem better: For searching int* array = new int[100000000]; and repeating it 1000 times I got the result:

a
Number of threads = 2
End of multithread iteration
End of normal iteration
Time with 2 threads 73581
Time with 1 thread 154070
Bool values:0
0
0

Process returned 0 (0x0)   execution time : 256.216 s
Press any key to continue.

What's more when program was running with 2 threads cpu usage of the process was around ~90% and when iterating with 1 thread it was never more than 50%.

So Smeeheey and erip are right that it can make iteration faster. Of course it can be more tricky for not such trivial problems.

And as I've learned from this test is that compiler can optimize main thread (when i was not showing boolean storing results of search loop in main thread was ignored) but it will not do that for other threads.

This is code I have used:

#include<cstdlib>
#include<thread>
#include<ctime>
#include<iostream>
#define SIZE_OF_ARRAY 100000000
#define REPEAT 1000
inline bool threadSearch(int* array){
    for(auto i = 0; i < SIZE_OF_ARRAY/2; i++)
        if(array[i] == 101) // there is no array[i]==101
            return true;
        return false;
}
int main(){
    int i;
    std::cin >> i; // stops program enabling to set real time priority of the process
    clock_t with_multi_thread;
    clock_t normal;
    srand(time(NULL));
    std::cout << "Number of threads = "
              <<  std::thread::hardware_concurrency() << std::endl;
    int* array = new int[SIZE_OF_ARRAY];
    bool true_if_found_t1 =false;
    bool true_if_found_t2 =false;
    bool true_if_found_normal =false;
    for(auto i = 0; i < SIZE_OF_ARRAY; i++)
        array[i] = rand()%100;
    with_multi_thread=clock();
    for(auto j=0; j<REPEAT; j++){
        std::thread t([&](){
            if(threadSearch(array))
                true_if_found_t1=true;
        });
        std::thread u([&](){
            if(threadSearch(array+SIZE_OF_ARRAY/2))
                true_if_found_t2=true;
        });
        if(t.joinable())
            t.join();
        if(u.joinable())
            u.join();

    }
    with_multi_thread=(clock()-with_multi_thread);
    std::cout << "End of multithread iteration" << std::endl;
    for(auto i = 0; i < SIZE_OF_ARRAY; i++)
        array[i] = rand()%100;
    normal=clock();
    for(auto j=0; j<REPEAT; j++)
        for(auto i = 0; i < SIZE_OF_ARRAY; i++)
            if(array[i] == 101) // there is no array[i]==101
                true_if_found_normal=true;
    normal=(clock()-normal);
    std::cout << "End of normal iteration" << std::endl;
    std::cout << "Time with 2 threads " << with_multi_thread<<std::endl;
    std::cout << "Time with 1 thread " << normal<<std::endl;
    std::cout << "Bool values:" << true_if_found_t1<<std::endl
               << true_if_found_t2<<std::endl
               <<true_if_found_normal<<std::endl;// showing bool values to prevent compiler from optimization
    return 0;
}
Foto Blysk
  • 344
  • 2
  • 11

1 Answers1

3

The answer is yes, it can make it faster - but not necessarily. In your case, when you're iterating over pretty small arrays, it is likely that the overhead of launching a new thread will be much higher than the benefit gained. If you array was much bigger then this would be reduced as a proportion of the overall runtime and eventually become worth it. Note you will only get speed up if your system has more than 1 physical core available to it.

Additionally, you should note that whilst that the code that reads the array in your case is perfectly thread-safe, writing to std::cout is not (you will get very strange looking output if your try this). Instead perhaps your thread should do something like return an integer type indicating the number of instances found.

Smeeheey
  • 9,906
  • 23
  • 39
  • `printf` is thread safe, which is why you'll tend to see it more often when debugging parallel C++ code. – erip May 04 '16 at 11:34
  • 1
    You could also use a lock/mutex to lock up `std::cout` while it's being used by a thread. And what could get weird with multi-thread iteration is if you decide to modify the array. You need to _make sure_ that you handle which thread has access to a certain part of an array. – Charles May 05 '16 at 00:22