
I have a main.cpp file with the following code:

paral(start, end, [&](int i){
    C1[i] = A1[i] + B[i];
}, numThreads);

The definition of paral is in otherfile.cpp, with the following code:

void paral(int start, int end, T &&lambda, int nT){
    pthread_t thread1;
    int status = pthread_create(&thread1, NULL,lambda ,1);//Should execute lambda(1)
    lambda(2);//Executed by main thread
    //Code for join and so on
}

It says:

can't convert lambda to void* (*)(void*).

I tried to cast and pass the lambda function to pthread but it wasn't helpful. I want to invoke the lambda function from the created thread as well which I'm unable to do.

Shawn
srccode
  • Why not use C++11 thread support library? – P0W Jan 07 '19 at 06:37
  • As the man-page for pthread_create states, the function that you give it (lambda) should take one void-pointer as parameter (not an int as in your case) and the value of this parameter is given as the fourth parameter to pthread_create (your second NULL). – Gerriet Jan 07 '19 at 06:37
  • yeah edited the fourth param to 1 as I need to calculate lambda(1) – srccode Jan 07 '19 at 06:40
  • @P0W Should be done using pthread library only – srccode Jan 07 '19 at 06:44
  • Possible duplicate of [Passing capturing lambda as function pointer](https://stackoverflow.com/questions/28746744/passing-capturing-lambda-as-function-pointer) – druckermanly Jan 07 '19 at 06:47
  • A capturing lambda can't be converted to a function pointer. If you must use pthreads, you also must restructure your code to pass the necessary data to the thread function. – molbdnilo Jan 07 '19 at 08:17

1 Answer


First of all, you must understand that a lambda is an object that implements an operator()(args...) member function. In your specific case it is operator()(int i).

To execute this lambda, two arguments must be passed to operator()(int):

  • Lambda pointer (this)
  • integer

The address of the lambda is the address of an object (i.e. data), not the address of code.

A thread start function, in contrast, is a function that accepts void* and returns void*. The address of a function is the address of machine code.
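
To make the object-versus-code distinction concrete, here is a minimal sketch. The names AddOffset, call_through_pointer and call_capturing are hypothetical and exist only for illustration. It shows a hand-written struct that behaves like a capturing lambda, a captureless lambda converting to a plain function pointer, and a compile-time check that a capturing lambda does not convert.

```cpp
#include <type_traits>

// Hand-written equivalent of a lambda that captures `offset` by value.
// (AddOffset is a hypothetical name used only for this illustration.)
struct AddOffset {
    int offset;  // the captured state lives inside the object
    int operator()(int i) const { return i + offset; }
};

// A captureless lambda has no state, so it converts to a function pointer.
inline int call_through_pointer(int x) {
    int (*fp)(int) = [](int i) { return i * 2; };
    return fp(x);
}

// A capturing lambda needs its object (`this`) as well as the int argument,
// so no conversion to a plain function pointer exists.
inline int call_capturing(int x, int offset) {
    auto lam = [offset](int i) { return i + offset; };
    static_assert(!std::is_convertible<decltype(lam), int (*)(int)>::value,
                  "capturing lambdas do not convert to function pointers");
    return lam(x);
}
```

This is why pthread_create rejects the lambda from the question: it captures by reference, so it is an object with state rather than a bare code address.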

Therefore, to execute your lambda, you should define a void* (void*) function and pass its address as the start_routine parameter. The address of the lambda (here wrapped in a small struct together with the range) is passed as the arg parameter:

#include <pthread.h>
#include <system_error>

template<typename Lambda>
void paral(int start, int end, Lambda&& lambda, int nT){

    struct Args
    {
        int Start;
        int End;
        Lambda& Func;
    };

    // create a captureless lambda; unary + converts it to a plain function pointer
    auto threadStart = +[](void* voidArgs) -> void* 
    {
        auto& args = *static_cast<Args*>(voidArgs);

        for(int i = args.Start; i < args.End; ++i)
            args.Func(i);
              
        return nullptr;
    };

    // I create one thread here. You will create more.    
    auto args = Args{start, end, lambda};
    pthread_t handle;
    int rc = pthread_create(&handle, NULL, threadStart, &args);

    if(rc)
        throw std::system_error(
            std::error_code(rc, std::generic_category()), 
            "pthread_create");

    pthread_join(handle, nullptr);
}
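
The code above starts a single thread. One way it might be extended to nT threads is sketched below, assuming the range is split into equal contiguous chunks. The name paral_chunked and the chunking scheme are assumptions for illustration, not part of the original code.

```cpp
#include <pthread.h>
#include <system_error>
#include <vector>

// Hypothetical multi-thread variant: splits [start, end) into nT contiguous
// chunks and runs each chunk on its own pthread.
template <typename Lambda>
void paral_chunked(int start, int end, Lambda&& lambda, int nT) {
    struct Args {
        int Start;
        int End;
        Lambda& Func;
    };

    // Captureless lambda, converted to void* (*)(void*) by unary +.
    auto threadStart = +[](void* voidArgs) -> void* {
        auto& args = *static_cast<Args*>(voidArgs);
        for (int i = args.Start; i < args.End; ++i)
            args.Func(i);
        return nullptr;
    };

    if (nT < 1) nT = 1;
    const int total = end > start ? end - start : 0;
    const int chunk = (total + nT - 1) / nT;  // ceiling division

    // Build all argument blocks first so their addresses stay stable.
    std::vector<Args> args;
    args.reserve(nT);
    for (int t = 0; t < nT; ++t) {
        int lo = start + t * chunk;
        if (lo > end) lo = end;
        int hi = lo + chunk;
        if (hi > end) hi = end;
        args.push_back(Args{lo, hi, lambda});
    }

    std::vector<pthread_t> handles;
    handles.reserve(nT);
    int rc = 0;
    for (int t = 0; t < nT; ++t) {
        pthread_t h;
        rc = pthread_create(&h, nullptr, threadStart, &args[t]);
        if (rc != 0) break;  // stop creating threads on the first failure
        handles.push_back(h);
    }

    // Join every thread that was started, even if a later create failed.
    for (pthread_t h : handles)
        pthread_join(h, nullptr);

    if (rc != 0)
        throw std::system_error(
            std::error_code(rc, std::generic_category()), "pthread_create");
}
```

The argument blocks live in a vector that outlives the threads because every thread is joined before the function returns. A call such as paral_chunked(0, n, [&](int i){ C[i] = A[i] + B[i]; }, 4) would then process the four chunks concurrently.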

However, in this specific case you are better off using std::thread instead of the pthread library. In that case your code may look like the following:

#include <iostream>

#include <atomic>
#include <thread>
#include <vector>

template<typename Func>
void paral(int start, 
           int end, 
           Func &&func, 
           int threads_count = std::thread::hardware_concurrency())
{
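    // hardware_concurrency() may return 0 when the value cannot be determined.
    if (threads_count < 1)
        threads_count = 1;

    std::atomic_int counter {start};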
    std::vector<std::thread> workers;
    workers.reserve(threads_count);

    for(int i = 0; i < threads_count; ++i) {
        workers.emplace_back([end, &counter, &func] {
            for(int val = counter++; val < end; val = counter++)
                func(val);
        });
    }

    for(int i = 0; i < threads_count; ++i)
        workers[i].join();
}    

int main() {
    int C1[10];
    int A[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    int B[10] = {11, 12, 13, 14, 15, 16, 17, 18, 19, 110};
    paral(0, 10, [&](int i){ C1[i] = A[i] + B[i]; });

    for(auto&& v: C1)
        std::cout << v << "\n";
    std::cout << "Done. Bye!" << std::endl;
}

There is an important note, though. Your code may not run as fast as you expect. It will suffer from false sharing: several threads modify memory in the same cache line, which forces the CPU cores to refresh their L1 caches every time another core writes to that line.
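
One common way to reduce false sharing is to give each thread a contiguous block of indices instead of interleaving them through a shared counter, so that neighbouring writes mostly come from the same thread. The following is a minimal sketch under that assumption. The name paral_blocks is hypothetical, and cache lines that straddle a block boundary can still be shared.

```cpp
#include <algorithm>
#include <thread>
#include <vector>

// Hypothetical variant: each worker handles one contiguous block of
// [start, end), so adjacent elements are usually written by the same thread.
template <typename Func>
void paral_blocks(int start, int end, Func&& func, int threads_count) {
    if (threads_count < 1) threads_count = 1;
    const int total = std::max(0, end - start);
    const int chunk = (total + threads_count - 1) / threads_count;  // ceiling

    std::vector<std::thread> workers;
    workers.reserve(threads_count);
    for (int t = 0; t < threads_count; ++t) {
        const int lo = std::min(end, start + t * chunk);
        const int hi = std::min(end, lo + chunk);
        if (lo >= hi) break;  // no work left for the remaining threads
        workers.emplace_back([lo, hi, &func] {
            for (int i = lo; i < hi; ++i)
                func(i);
        });
    }

    for (auto& w : workers)
        w.join();
}
```

The trade-off is load balancing: fixed blocks work well when every index costs about the same, while the shared-counter version adapts better to uneven work per index.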

See also: