
Context:

  • I am trying to have two processes writing to the same array in the shared memory space.

  • Each process will write half of the array with a for loop.

  • The first element of the array will always store the index of the next element to be written.

  • IPC is done through a semaphore in the shared memory.

Preconditions:

  1. Both the array and semaphore are properly set up in the shared memory.
  2. The programme works fine if I call sem_wait and sem_post outside the for loop, which makes the whole loop atomic. (This is also why I believe the semaphore and array have been set up properly.)

Problem

However, when I try to reduce the critical region by moving sem_wait and sem_post inside the for loop, the two processes are no longer synchronised: parts of the array are never written, even though both processes finish their loops and the total number of iterations equals the array length.
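For reference, the fine-grained version described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the original code: `struct shared`, `SLOTS`, `write_half` and `run_fine_grained` are illustrative names, and the sketch assumes a platform where `sem_init` supports process-shared semaphores (for example Linux).

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <semaphore.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SLOTS 1000               /* hypothetical number of data elements */

/* Hypothetical layout: the semaphore and the array live in one shared mapping. */
struct shared {
    sem_t lock;
    int   array[SLOTS + 1];      /* array[0] = index of the next element to write */
};

/* Writes SLOTS / 2 elements, taking the lock once per element. */
static int write_half(struct shared *s, int tag)
{
    for (int i = 0; i < SLOTS / 2; i++) {
        if (sem_wait(&s->lock) == -1)
            return -1;
        int next = s->array[0];  /* read the index inside the critical section */
        s->array[next] = tag;
        s->array[0] = next + 1;  /* publish the new index before unlocking */
        if (sem_post(&s->lock) == -1)
            return -1;
    }
    return 0;
}

/* Returns the number of elements left unwritten, or -1 on error. */
int run_fine_grained(void)
{
    struct shared *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED)
        return -1;
    if (sem_init(&s->lock, 1, 1) == -1) {    /* pshared = 1, initial value 1 */
        munmap(s, sizeof *s);
        return -1;
    }
    s->array[0] = 1;                         /* first free element */

    pid_t pid = fork();
    if (pid == -1) {
        sem_destroy(&s->lock);
        munmap(s, sizeof *s);
        return -1;
    }
    if (pid == 0)
        _exit(write_half(s, 2) == 0 ? 0 : 1);   /* child tags its writes with 2 */

    int rc = write_half(s, 1);                  /* parent tags its writes with 1 */
    int status = 0;
    if (waitpid(pid, &status, 0) == -1 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        rc = -1;

    int missing = 0;                         /* the mapping starts zero-filled */
    for (int i = 1; i <= SLOTS; i++)
        if (s->array[i] == 0)
            missing++;

    sem_destroy(&s->lock);
    munmap(s, sizeof *s);
    return rc == 0 ? missing : -1;
}
```

When the semaphore really is shared and every call succeeds, `run_fine_grained()` should report no missing elements. A non-zero or negative result points at the set-up rather than at the loop itself.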

Many thanks for any suggestions as to why this happens.


UPDATE

On OS X, sem_init() does not work as expected. I used sem_open() instead, which solved the problem. Reference: http://lists.apple.com/archives/darwin-dev//2008/Oct/msg00044.html
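A minimal sketch of the sem_open() approach, for illustration only: the semaphore name and the function `run_named_semaphore` are hypothetical, and the child simply posts once so that the parent can confirm the semaphore is shared across fork().

```c
#include <fcntl.h>       /* O_CREAT, O_EXCL */
#include <semaphore.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the parent observed the child's sem_post, -1 on any error. */
int run_named_semaphore(void)
{
    char name[64];
    /* Hypothetical name; the PID keeps concurrent runs from colliding. */
    snprintf(name, sizeof name, "/demo_sem_%ld", (long)getpid());

    sem_t *sem = sem_open(name, O_CREAT | O_EXCL, 0600, 0);  /* initial value 0 */
    if (sem == SEM_FAILED) {
        perror("sem_open");
        return -1;
    }
    sem_unlink(name);    /* remove the name; the open handle stays valid */

    pid_t pid = fork();
    if (pid == -1) {
        sem_close(sem);
        return -1;
    }
    if (pid == 0)
        _exit(sem_post(sem) == 0 ? 0 : 1);   /* child signals the parent */

    int rc = sem_wait(sem);                  /* blocks until the child posts */
    int status = 0;
    if (waitpid(pid, &status, 0) == -1 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        rc = -1;
    sem_close(sem);
    return rc;
}
```

Unlike sem_init(), sem_open() returns a pointer to a semaphore managed by the system, so the semaphore itself does not need to be placed in the shared array's memory region.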

Xu Chen
  • It's not clear to me what problem you are encountering; "it is not sync" isn't very informative. You really should provide a minimal working code example, since working out what is wrong from pseudo code alone isn't possible in most cases. Creating working code that reproduces the problem is also often the fastest way to solve it yourself! – Jackson Sep 28 '16 at 09:06
  • @Jackson thanks for your feedback. In fact, I would say the working code is 90% similar to the pseudo code, and without including all the set-up it is not possible to give working code. That's why I did not just copy and paste my code. I have elaborated more on the problem. – Xu Chen Sep 28 '16 at 09:13
  • Without a working code example we can't see whether you're handling errors correctly, creating the semaphore correctly, or doing something else that is causing the problem. You're unlikely to get a sensible answer and your question will end up closed. If you haven't already done so, read the help sections on asking questions and creating minimal working examples. – Jackson Sep 28 '16 at 09:20
  • The problem with pseudo code is that it only shows the expected processing, which seems correct here. So the problem is probably hidden in details that are not shown, such as whether `array` is volatile, or how the semaphore is initialized. – Serge Ballesta Sep 28 '16 at 09:21
  • I'd also suggest looking at http://stackoverflow.com/questions/16400820/c-how-to-use-posix-semaphores-on-forked-processes – Jackson Sep 28 '16 at 09:26

2 Answers

0

In the pseudo code for the problem case, the parent process reacquires the semaphore as soon as it has released it. Since the semaphore is available at that moment, the parent goes on executing. When its time slice expires, the kernel may switch to the child process, but the child is still waiting for the semaphore, which is held by the parent. So the parent continues to execute again.

In this scenario, parent executed twice and child did not execute.

This might have resulted in out of sync execution.

0

Looking at your actual code I think the problem is with how you are creating your semaphore. If you read the man page on sem_init:

If pshared is nonzero, then the semaphore is shared between processes, and should be located in a region of shared memory (see shm_open(3), mmap(2), and shmget(2)). (Since a child created by fork(2) inherits its parent's memory mappings, it can also access the semaphore.) Any process that can access the shared memory region can operate on the semaphore using sem_post(3), sem_wait(3), and so on.

The key here is the requirement that the semaphore be located in a region of shared memory. Your semaphore isn't in shared memory, so it isn't really shared between the two processes, hence the contention you are seeing.
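To illustrate the difference, here is a minimal sketch, assuming Linux (where `sem_init` with a nonzero `pshared` and `sem_getvalue` are both implemented). The function names are hypothetical. A child process posts the semaphore once, and the parent then reads the value it sees.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <semaphore.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Initialises `sem` to 0, forks a child that posts it once, and returns
 * the value the parent sees afterwards, or -1 on error. */
static int value_after_child_post(sem_t *sem)
{
    if (sem_init(sem, 1, 0) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(sem_post(sem) == 0 ? 0 : 1);

    int status = 0;
    int value = -1;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (sem_getvalue(sem, &value) == -1)
        return -1;
    sem_destroy(sem);
    return value;
}

/* Semaphore in ordinary stack memory: the child posts its own copy. */
int private_semaphore_value(void)
{
    sem_t local;
    return value_after_child_post(&local);
}

/* Semaphore in a MAP_SHARED mapping: both processes use the same object. */
int shared_semaphore_value(void)
{
    sem_t *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                         MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    int value = value_after_child_post(shared);
    munmap(shared, sizeof *shared);
    return value;
}
```

With the private semaphore the parent still sees 0 after the child's post, because fork() gave the child its own copy of that memory. With the shared mapping the parent sees 1.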

Jackson
  • The semaphore is indeed created in shared memory inside the newSemaphore function, as I indicated in the comment. Also, if the semaphore were not initialised properly, the previous greedy version should not work either. – Xu Chen Sep 29 '16 at 00:45
  • In that case what is being returned from the calls to sem_wait() and sem_post()? If the calls are failing in the child you aren't detecting this and the program will proceed as if it has the lock when it does not. That would be the next thing to check. – Jackson Sep 29 '16 at 08:29
  • I found the problem. sem_init, which I used originally, does not work properly on OS X, so I changed to sem_open and the problem is solved. Thank you for pointing out the error checking! – Xu Chen Sep 29 '16 at 15:19
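The return-value checking suggested in the comments could look something like the sketch below. The wrapper names (`checked_init`, `checked_wait`, `checked_post`, `checked_roundtrip`) are hypothetical, and retrying on `EINTR` is an assumption about the desired behaviour rather than something taken from the original code.

```c
#include <errno.h>
#include <semaphore.h>
#include <stdio.h>

/* Initialises a process-shared semaphore and reports failure, for example
 * on a platform where sem_init is not supported. */
int checked_init(sem_t *sem, unsigned int value)
{
    if (sem_init(sem, 1, value) == -1) {
        perror("sem_init");
        return -1;
    }
    return 0;
}

/* Waits on the semaphore, retrying if a signal interrupts the call.
 * Returns 0 on success, or -1 after reporting the error. */
int checked_wait(sem_t *sem)
{
    int rc;
    do {
        rc = sem_wait(sem);
    } while (rc == -1 && errno == EINTR);
    if (rc == -1)
        perror("sem_wait");
    return rc;
}

/* Posts the semaphore. Returns 0 on success, or -1 after reporting the error. */
int checked_post(sem_t *sem)
{
    if (sem_post(sem) == -1) {
        perror("sem_post");
        return -1;
    }
    return 0;
}

/* Small usage example: init, lock, unlock, destroy. Returns 0 if every
 * call succeeded, -1 otherwise. */
int checked_roundtrip(void)
{
    sem_t sem;
    if (checked_init(&sem, 1) == -1)
        return -1;
    int rc = (checked_wait(&sem) == 0 && checked_post(&sem) == 0) ? 0 : -1;
    sem_destroy(&sem);
    return rc;
}
```

With wrappers like these, a failing sem_init would be reported by perror at start-up instead of the program silently running without a working lock.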