This is a follow-up to this question. In summary: I had a server calling close() to finish off connections, but the shutdown sequence never seemed to occur. The client kept waiting for more data, even though close() returned 0 on the server. Replacing the condition-variable wait in my thread-safe queue with semaphores solved the issue, even though the condition-based queue looks correctly implemented to me. I'm posting my code in case anyone can shed some light on this.
condition-based queue:
    TASK *head;
    pthread_mutex_t mutex;
    pthread_cond_t cond;

    void init( ) {
        head = NULL;
        pthread_mutex_init(&mutex, NULL);
        pthread_cond_init(&cond, NULL);
    }

    TASK *get( ) {
        pthread_mutex_lock( &mutex );
        while(!head)
            pthread_cond_wait(&cond, &mutex);
        TASK *res = head;
        head = head->next;
        pthread_mutex_unlock( &mutex );
        return res;
    }

    void add( TASK *t ) {
        pthread_mutex_lock( &mutex );
        t->next = head;
        head = t;
        pthread_cond_broadcast( &cond );
        pthread_mutex_unlock( &mutex );
    }
I do realize this one is a LIFO queue and the next one is a FIFO, but I've included only the interesting bits so it's quick and easy to read.
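To exercise the condition-based queue in isolation from the socket code, something like the following harness could be used. It is only a sketch: the `TASK` layout (an `id` plus the `next` link), the `worker` function, `run_demo`, and the thread and task counts are all assumptions made for the test, not part of the real server.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical TASK layout; the real one is not shown in this post. */
typedef struct TASK {
    int id;
    struct TASK *next;
} TASK;

static TASK *head;
static pthread_mutex_t mutex;
static pthread_cond_t cond;

static void init(void) {
    head = NULL;
    pthread_mutex_init(&mutex, NULL);
    pthread_cond_init(&cond, NULL);
}

/* Blocks until a task is available, then pops it (LIFO). */
static TASK *get(void) {
    pthread_mutex_lock(&mutex);
    while (!head)                       /* loop guards against spurious wakeups */
        pthread_cond_wait(&cond, &mutex);
    TASK *res = head;
    head = head->next;
    pthread_mutex_unlock(&mutex);
    return res;
}

static void add(TASK *t) {
    pthread_mutex_lock(&mutex);
    t->next = head;
    head = t;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mutex);
}

enum { WORKERS = 4, PER_WORKER = 250, TOTAL = WORKERS * PER_WORKER };

/* Each worker pops exactly PER_WORKER tasks and sums their ids. */
static void *worker(void *arg) {
    long *sum = arg;
    for (int i = 0; i < PER_WORKER; i++)
        *sum += get()->id;
    return NULL;
}

/* Call once. Returns the sum of every id the workers popped. */
long run_demo(void) {
    static TASK tasks[TOTAL];
    pthread_t threads[WORKERS];
    long sums[WORKERS] = {0};
    long total = 0;

    init();
    for (int i = 0; i < WORKERS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, &sums[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < TOTAL; i++) {
        tasks[i].id = i + 1;            /* ids 1..TOTAL */
        add(&tasks[i]);
    }
    for (int i = 0; i < WORKERS; i++) {
        pthread_join(threads[i], NULL);
        total += sums[i];
    }
    return total;
}
```

If every task is handed out exactly once, `run_demo()` returns 1 + 2 + ... + 1000 = 500500 (build with `-pthread`). A lost or duplicated task would change the sum, and a lost wakeup would make the call hang.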
semaphore-based queue:
    TASK *buf;
    TASK *next_reader;
    TASK *next_writer;
    TASK *endp;
    pthread_mutex_t writer_mutex;
    pthread_mutex_t reader_mutex;
    sem_t writer_sem;   /* counts free slots */
    sem_t reader_sem;   /* counts filled slots */

    void init( int num_tasks ) {
        buf = calloc(num_tasks, sizeof(TASK));
        next_reader = buf;
        next_writer = buf;
        endp = buf + num_tasks;
        sem_init(&writer_sem, 0, num_tasks);
        sem_init(&reader_sem, 0, 0);
        pthread_mutex_init(&writer_mutex, NULL);
        pthread_mutex_init(&reader_mutex, NULL);
    }

    TASK *get( ) {
        TASK *res = NULL;
        sem_wait(&reader_sem);
        pthread_mutex_lock(&reader_mutex);
        res = next_reader;
        next_reader++;
        if(next_reader == endp)
            next_reader = buf;
        pthread_mutex_unlock(&reader_mutex);
        /* Note: res points into buf, so once the slot is released
           below, a writer may overwrite it while the caller still uses it. */
        sem_post(&writer_sem);
        return res;
    }

    void add( TASK *t ) {
        sem_wait(&writer_sem);
        pthread_mutex_lock(&writer_mutex);
        *next_writer = *t;   /* copy the task into the ring buffer */
        next_writer++;
        if(next_writer == endp)
            next_writer = buf;
        pthread_mutex_unlock(&writer_mutex);
        sem_post(&reader_sem);
    }
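One difference between the two queues is that the ring buffer stores tasks by value, so whatever get() returns is only valid until a writer reuses that slot. A possible variant, sketched below, copies the task out while the reader lock is still held. The `TASK` layout, the changed `get(TASK *out)` signature, the error return from `init`, and `run_ring_demo` are assumptions made for illustration; it also assumes POSIX unnamed semaphores as found on Linux.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>

/* Hypothetical TASK layout; the real one is not shown in this post. */
typedef struct TASK { int id; } TASK;

static TASK *buf, *next_reader, *next_writer, *endp;
static pthread_mutex_t writer_mutex, reader_mutex;
static sem_t writer_sem;   /* counts free slots */
static sem_t reader_sem;   /* counts filled slots */

static int init(int num_tasks) {
    buf = calloc(num_tasks, sizeof(TASK));
    if (!buf)
        return -1;
    next_reader = next_writer = buf;
    endp = buf + num_tasks;
    sem_init(&writer_sem, 0, num_tasks);
    sem_init(&reader_sem, 0, 0);
    pthread_mutex_init(&writer_mutex, NULL);
    pthread_mutex_init(&reader_mutex, NULL);
    return 0;
}

/* Copies the oldest task into *out before the slot is handed back. */
static void get(TASK *out) {
    sem_wait(&reader_sem);
    pthread_mutex_lock(&reader_mutex);
    *out = *next_reader;
    if (++next_reader == endp)
        next_reader = buf;
    pthread_mutex_unlock(&reader_mutex);
    sem_post(&writer_sem);              /* slot may be reused from here on */
}

static void add(const TASK *t) {
    sem_wait(&writer_sem);
    pthread_mutex_lock(&writer_mutex);
    *next_writer = *t;
    if (++next_writer == endp)
        next_writer = buf;
    pthread_mutex_unlock(&writer_mutex);
    sem_post(&reader_sem);
}

/* Pushes ids 1, 2, 3 through a 2-slot ring (forcing a wrap-around)
   and returns the ids in the order read back, packed as digits. */
int run_ring_demo(void) {
    TASK in, out;
    int seen = 0;
    int rc = init(2);
    assert(rc == 0);
    (void)rc;

    in.id = 1; add(&in);
    in.id = 2; add(&in);
    get(&out); seen = seen * 10 + out.id;
    in.id = 3; add(&in);                /* reuses the first slot */
    get(&out); seen = seen * 10 + out.id;
    get(&out); seen = seen * 10 + out.id;

    free(buf);
    return seen;
}
```

Here `run_ring_demo()` returns 123, showing FIFO order across the wrap-around. The cost of this design is one extra copy of the task per get().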
I can't for the life of me see how changing from the condition queue to the semaphore queue would resolve the problem from my previous question, unless something funky happens when one thread is closing a socket and pthread_cond_broadcast is called during the close. I'm assuming an OS bug because I can't find any documentation that forbids what I'm doing. None of the queue operations are called from signal handlers. Here's my distro:
Linux version: 2.6.21.7-2.fc8xen
CentOS version: 5.4 (Final)
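To check whether end-of-stream reaches the peer at all, independent of either queue, an explicit shutdown(fd, SHUT_WR) before close() can be tried. close() only releases one descriptor, and the connection is torn down once the last reference to the socket is gone, whereas shutdown() acts on the socket itself. The sketch below is a stand-in, not the server code: it uses a socketpair() instead of a real TCP connection, and `peer_sees_eof` is a hypothetical name.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the peer reads the payload and then end-of-stream
   after the sender calls shutdown(SHUT_WR), 0 otherwise. */
int peer_sees_eof(void) {
    int sv[2];
    char buf[8];

    /* Assumption: a local stream socket pair stands in for the
       server/client TCP connection. */
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);
    (void)rc;

    ssize_t sent = write(sv[0], "bye", 3);
    shutdown(sv[0], SHUT_WR);           /* signal: no more data from this side */

    ssize_t first = read(sv[1], buf, sizeof buf);    /* the payload */
    ssize_t second = read(sv[1], buf, sizeof buf);   /* 0 means end-of-stream */

    close(sv[0]);
    close(sv[1]);
    return sent == 3 && first == 3 && second == 0;
}
```

If the real client sees end-of-stream with an explicit shutdown() but not with close() alone, that would suggest something else still holds a reference to the socket.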
Thanks
EDIT: I just added the initialization code I'm using. In the actual code, these are implemented in a templated class; I've included only the relevant portions.