
I am trying a very basic test with atomic variables in C++.

This is a basic header:

#pragma once
#include<atomic>


class spinlock_mutex {
  std::atomic_flag flag;

public:
  spinlock_mutex() : flag(ATOMIC_FLAG_INIT) {}
  void lock() {
    while (flag.test_and_set())
      ;
  }
  void unlock() { flag.clear(); }
};

Here is a very basic implementation of a queue which is on purpose oversized to contain all the samples:

#include <climits>  // UCHAR_MAX
#include <cstdint>  // uint16_t, uint32_t

template <class T, uint32_t MAX_SIZE = UCHAR_MAX>
class SingleQueue {
protected:
  T _q[MAX_SIZE]{};
  uint16_t _back {};
  int16_t _front = 0;
  uint16_t _count = 0;
  spinlock_mutex mtx{};

public:
  SingleQueue() = default;

  void push(T &&val) {
    mtx.lock();
    _q[_count] = std::move(val);
    _count++;
    
    mtx.unlock();
  }
  void push(T &val) {
    mtx.lock();
   
    _q[_count] = val;
   
    _count++;
   
    mtx.unlock();
  }

  void pop(T &val) {
    mtx.lock();
    val = _q[_count];
    _back++;
   
    mtx.unlock();
  }
};

And this is my source code:

#include <iostream>
#include <thread>
#include <vector>
#include <gtest/gtest.h>

class DummyComm {

public:
  int l_val1{};
  int l_val2{};
};

cu::SingleQueue<DummyComm,200000> q{};
DummyComm val{};
spinlock_mutex spin {};
void producer_m() {

  uint16_t count = 0;
  
  for (int32_t idx = 0; idx < 100000; idx++) {
    std::cout<<"Tx: "<<idx<<"\n";

    q.push(DummyComm{idx,idx});

    
  }
}
void consumer_m() {
  std::vector<DummyComm> vec{};
  DummyComm res {};

  while(true)
  {
     q.pop(res);
     std::cout<<"Rx: "<<res.l_val1<<"\n";
     vec.push_back(res);
  
     if(vec.back().l_val1==99999)
     {
        break;
     }
  }
  std::cout<<vec.back().l_val1<<std::endl;

}

TEST(testOne, basic) {
  std::thread prod{(producer_m)};
  std::thread cons{(consumer_m)};
  prod.join(), cons.join();
}

I can't manage to store all the values I need in the array; it seems the reader is slower than the writer.

If I use a single element to store the data instead of an array, everything seems to work fine.

My understanding is that the atomic flag should work, at least on x86 (which I am currently using, with g++ 9.4 on Linux), so _count, which sits in the critical section between the spinlock's lock/unlock, should be updated correctly.

Any word of advice? Thanks!

  • Normally you want buffer sizes to be powers of two, like 256, not 255. Then you can modulo for nearly free. Also, what is the point of this? If you're going to use hand-rolled atomics, a lock-free queue is often a good idea. Hand-rolling your own spinlock is not very efficient, and misses out on several important things like using a spin-loop hint like x86 `_mm_pause()`, or falling back to sleeping or yielding the CPU after a few spin iterations. See [Locks around memory manipulation via inline assembly](https://stackoverflow.com/a/37246263) for discussion of those. – Peter Cordes Sep 30 '22 at 00:57

1 Answer


There are multiple logical bugs in the shown code.

    _q[_count] = std::move(val);
    _count++;

No provisions are made here to check whether the queue count has reached the queue buffer's maximum size. At that point: memory corruption ahoy! Crash city.

uint32_t MAX_SIZE = UCHAR_MAX

The size of the queue appears to be 255.

  for (int32_t idx = 0; idx < 100000; idx++) {
    std::cout<<"Tx: "<<idx<<"\n";

    q.push(DummyComm{idx,idx});

The code will attempt to push 100000 values into the queue.

A very close inspection of the above code finds no evidence that _count ever gets decremented.

After the first 255 values get pushed, the next value gets written into _q[255], past the end of the buffer. Goodbye.

  void pop(T &val) {
    mtx.lock();
    val = _q[_count];
    _back++;
   
    mtx.unlock();
  }

This is the code that attempts to pop something from the queue. It is completely unclear whether the intent here is to implement an actual queue, or a stack.

If this should be a queue:

  1. A very careful search of the entire shown code finds no evidence of _back being used anywhere else. The intent, it seems, is to use _back to remove values from the trailing edge of the queue. But instead, _count makes a mysterious appearance here. That is the leading edge of the queue: according to how the rest of the code uses it, _q[_count] is the first unused slot in the queue buffer, where the next value pushed into the queue will go.

  2. And no provisions are made, here, to check whether the queue is empty.

If this entire thing is intended to be a stack: as explained above, _q[_count] is not the value at the top of the stack anyway.

In conclusion: there are multiple fundamental problems here:

  • No provisions exist for checking for an empty or a full queue. Even if all the other problems are fixed, unless the reader and the writer run at exactly the same speed the queue will, at some point, be either empty or full, and the shown code has no logic at all to deal with that.

  • If the _q buffer is intended to be a queue of fixed size, no provisions exist for confining the queue's values to the actual _q buffer. Instead, values get pushed into the queue while the index counter gets incremented all the way to infinity. The intent appears to be a circular queue, but the "circular" part of the logic is completely missing.

All of these logical bugs must be addressed in order for the queueing logic to work correctly.

Sam Varshavchik
  • Hi, thanks for the reply. I know it is bad code; I just wanted to prove that I can increment an index and put elements in an array. At the moment what you see is not really a queue, as you say, but more like a stack. Regarding the size, I set it as cu::SingleQueue<DummyComm,200000> q{} in the example, so it is very big and holds more than the 100000 samples; this was for the sake of testing – HDenied Sep 30 '22 at 06:57