In Java, multiple threads can wait for one another at a certain point, so that none of them starts the next block of code before all the others have finished the first block:

CyclicBarrier barrier = new CyclicBarrier(2);

// thread 1
readA();
writeB();
barrier.await();
readB();
writeA();

// thread 2
readA();
writeB();
barrier.await();
readB();
writeA();

Is there an exact or easy conversion to C++?

Also with OpenCL, there is a similar instruction:

readA();
writeB();
barrier(CLK_GLOBAL_MEM_FENCE);
readB();
writeA();

so all neighboring threads wait for each other, but that is only a constrained C implementation.

huseyin tugrul buyukisik
  • Drawing parallels between two unrelated languages such as Java and C++ can prove to be counterproductive. In fact, with C++ you should forget everything you learned in Java, start with a clean sheet, and make your way up. – Ron Jan 13 '18 at 15:38
  • @Ron do you mean something like having a separate thread watch the atomic counter and inform all the other threads when the counter reaches N? Then there is no simple command for this in C++. – huseyin tugrul buyukisik Jan 13 '18 at 15:40
  • There are various strategies when dealing with [threads in C++](http://en.cppreference.com/w/cpp/thread). – Ron Jan 13 '18 at 15:42
  • Such a thing doesn't exist in the C++ standard library. You can implement it from available synchronization primitives (looks like a mutex, a counter, and a condition variable would do the trick), or perhaps look for an existing third-party implementation. – Igor Tandetnik Jan 13 '18 at 15:42
  • @IgorTandetnik N threads looking at the same counter could be slow, couldn't it? Should I add a tree-like mechanism that allows at most 2-3 threads per partial counter, plus a separate thread that checks all partial counters against the total count? – huseyin tugrul buyukisik Jan 13 '18 at 15:46
  • @VTT no, they use the same cyclic barrier instance. I don't know what's inside. – huseyin tugrul buyukisik Jan 13 '18 at 15:49
  • The expense of locking a mutex would dwarf any counter manipulation by orders of magnitude. – Igor Tandetnik Jan 13 '18 at 15:51
  • @IgorTandetnik I don't have a 256-thread CPU, but what if I had? Would it matter if 256 threads watched a single counter in order to continue? (such as a Xeon Phi x200, with its somewhat low single-thread performance) – huseyin tugrul buyukisik Jan 13 '18 at 15:58
  • Barrier: https://pastebin.com/PCWHXZxR – Brandon Jan 13 '18 at 16:04
  • @Brandon thank you. The conditional wait looks simpler. – huseyin tugrul buyukisik Jan 13 '18 at 16:08

3 Answers

The C++ standard library doesn't have a cyclic barrier (as of C++17). You may propose one to the standards committee :)

A company like Oracle or Microsoft can quickly decide what to add to their language's library. For C++, people have to come to an agreement, and it can take a while.

256 threads is a lot. As with all performance-related questions, you need to measure the code to make an informed decision. With 256 threads I would be tempted to use 10 barriers that are synchronized by an 11th barrier. You need to measure to know if that's actually better.
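
The "several barriers synchronized by one more barrier" idea could be sketched roughly as below. This is only an illustration under assumptions: `SimpleBarrier`, `TwoLevelBarrier`, and `run_two_level_demo` are hypothetical names made up here, the group sizes are arbitrary, and whether this beats a single barrier still has to be measured.

```cpp
#include <atomic>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical minimal reusable barrier (mutex + counter + condition variable).
class SimpleBarrier {
public:
    explicit SimpleBarrier(unsigned n) : m_n(n) {}
    void Await() {
        std::unique_lock<std::mutex> lock(m_mutex);
        const unsigned long gen = m_generation;
        if (++m_count == m_n) {        // last arrival opens the next generation
            m_count = 0;
            ++m_generation;
            m_cv.notify_all();
        } else {                       // predicate guards against spurious wakeups
            m_cv.wait(lock, [&] { return gen != m_generation; });
        }
    }
private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    const unsigned m_n;
    unsigned m_count = 0;
    unsigned long m_generation = 0;
};

// Hypothetical two-level barrier: one barrier per group, plus a top-level
// barrier shared by one representative ("leader") of each group.
class TwoLevelBarrier {
public:
    TwoLevelBarrier(unsigned groups, unsigned perGroup)
        : m_perGroup(perGroup), m_top(groups) {
        for (unsigned g = 0; g < groups; ++g)
            m_groups.emplace_back(new SimpleBarrier(perGroup));
    }
    // id is the calling thread's index in [0, groups * perGroup).
    void Await(unsigned id) {
        SimpleBarrier& group = *m_groups[id / m_perGroup];
        group.Await();                 // 1. the whole group has arrived
        if (id % m_perGroup == 0)
            m_top.Await();             // 2. leaders wait until all groups arrived
        group.Await();                 // 3. the leader releases its group
    }
private:
    const unsigned m_perGroup;
    SimpleBarrier m_top;
    std::vector<std::unique_ptr<SimpleBarrier>> m_groups;
};

// Returns true if, after each barrier, every thread saw all writes made
// before it. The counter stands in for the real per-phase work.
bool run_two_level_demo(unsigned groups, unsigned perGroup, unsigned rounds) {
    const unsigned total = groups * perGroup;
    TwoLevelBarrier barrier(groups, perGroup);
    std::atomic<unsigned> arrived(0);
    std::atomic<bool> ok(true);
    std::vector<std::thread> threads;
    for (unsigned id = 0; id < total; ++id)
        threads.emplace_back([&, id] {
            for (unsigned r = 0; r < rounds; ++r) {
                ++arrived;                       // phase 1: "write"
                barrier.Await(id);
                if (arrived != total * (r + 1))  // phase 2: "read"
                    ok = false;
                barrier.Await(id);               // keep the two phases apart
            }
        });
    for (auto& t : threads) t.join();
    return ok;
}
```

For the 256-thread case this might be instantiated as, say, `TwoLevelBarrier(16, 16)`; the extra group-level wait in step 3 is the price paid for reducing contention on any single mutex.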

Check out my C++ implementation of a cyclic barrier, inspired by Java. I wrote it a couple of years ago, based on someone else's (buggy) code that I found at http://studenti.ing.unipi.it/~s470694/a-cyclic-thread-barrier/ (the link doesn't work anymore...). The code is really simple (no need to credit me). Of course, it's provided as is, with no warranties.

// Modeled after the java cyclic barrier.
// Allows n threads to synchronize.
// Call Break() and join your threads before this object goes out of scope
#pragma once

#include <mutex>
#include <condition_variable>


class CyclicBarrier
{
public:
    explicit CyclicBarrier(unsigned numThreads)
        : m_numThreads(numThreads)
        , m_counts{ 0, 0 }
        , m_index(0)
        , m_disabled(false)
    { }

    CyclicBarrier(const CyclicBarrier&) = delete;
    CyclicBarrier(CyclicBarrier &&) = delete;
    CyclicBarrier & operator=(const CyclicBarrier&) = delete;
    CyclicBarrier & operator=(CyclicBarrier &&) = delete;

    // sync point
    void Await()
    {
        std::unique_lock<std::mutex> lock(m_requestsLock);
        if (m_disabled)
            return;

        unsigned currentIndex = m_index;
        ++m_counts[currentIndex];

        // "spurious wakeup" means this thread could wake up even if no one called m_condition.notify!
        if (m_counts[currentIndex] < m_numThreads)
        {
            while (m_counts[currentIndex] < m_numThreads)
                m_condition.wait(lock);
        }
        else
        {
            m_index ^= 1; // flip index
            m_counts[m_index] = 0;
            m_condition.notify_all();
        }
    }

    // Call this to free current sleeping threads and prevent any further awaits.
    // After calling this, the object is no longer usable.
    void Break()
    {
        std::unique_lock<std::mutex> lock(m_requestsLock);
        m_disabled = true;
        m_counts[0] = m_numThreads;
        m_counts[1] = m_numThreads;
        m_condition.notify_all();
    }

private:
    std::mutex     m_requestsLock;
    std::condition_variable m_condition;
    const unsigned m_numThreads;
    unsigned       m_counts[2];
    unsigned       m_index;
    bool           m_disabled;
};
Humphrey Winnebago
  • C++20 has std::barrier now.

  • POSIX has "pthread_barrier_t" with the following interface:

     int pthread_barrier_init(pthread_barrier_t *restrict barrier, 
           const pthread_barrierattr_t *restrict attr, unsigned count); 
    
     int pthread_barrier_wait(pthread_barrier_t *barrier);
     int pthread_barrier_destroy(pthread_barrier_t *barrier);
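
As a rough sketch of how the POSIX interface above maps onto the Java example from the question, assuming a platform that provides pthread barriers (they are not available everywhere): the vectors `a` and `b` and the function `run_pthread_barrier_demo` are made up for illustration.

```cpp
#include <pthread.h>
#include <thread>
#include <vector>

// Hypothetical demo: each of n threads writes its own slot of b, waits at the
// barrier, then reads a neighbour's slot of b and stores it in its slot of a.
// Returns true if every thread saw its neighbour's write.
bool run_pthread_barrier_demo(unsigned n) {
    pthread_barrier_t barrier;
    if (pthread_barrier_init(&barrier, nullptr, n) != 0)
        return false;                            // e.g. n == 0 is rejected

    std::vector<int> a(n, 0), b(n, 0);
    std::vector<std::thread> threads;
    for (unsigned id = 0; id < n; ++id)
        threads.emplace_back([&, id] {
            b[id] = static_cast<int>(id) + 1;    // writeB()
            // Blocks until n threads have called it; returns 0, or
            // PTHREAD_BARRIER_SERIAL_THREAD in exactly one of the threads.
            pthread_barrier_wait(&barrier);
            a[id] = b[(id + 1) % n];             // readB(); writeA()
        });
    for (auto& t : threads) t.join();
    pthread_barrier_destroy(&barrier);

    for (unsigned id = 0; id < n; ++id)
        if (a[id] != static_cast<int>((id + 1) % n) + 1)
            return false;
    return true;
}
```

With C++20, `std::barrier` supports the same pattern through its `arrive_and_wait()` member function.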
    
jannarc

You can find one in the Boost library (Boost.Thread): it's simply called barrier, and it lacks a wait-timeout option.

Sasha Yakobchuk