6

I have a bounded semaphore object that ensures my program doesn't download more than a certain number of files at a time. Each worker thread acquires the semaphore when it starts downloading and releases it when done.

I have another thread that would like to run code when nothing is being downloaded. I would like a method for locking until the semaphore is completely available. How can I do this in Python?

xtofl
Jacob

3 Answers

3

A pool of workers sounds like a good solution for this, so you might consider this question and its answers. You can create the pool of workers, submit all the jobs, close the pool, and then join it to wait for things to finish before handing off to the code you want to run when the files are done downloading.
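As a rough sketch of that idea, the snippet below uses `multiprocessing.pool.ThreadPool` from the standard library. The `download_file` function and the `download_all` helper are hypothetical stand-ins, not anything from the question; the pool size plays the role of the semaphore's limit.

```python
from multiprocessing.pool import ThreadPool


def download_file(url):
    # Hypothetical placeholder for the real download logic.
    return url


def download_all(urls, max_concurrent=3):
    # The pool size caps how many downloads run at the same time.
    pool = ThreadPool(processes=max_concurrent)
    pending = pool.map_async(download_file, urls)
    pool.close()  # no further jobs will be submitted
    pool.join()   # block until every worker has finished
    # At this point nothing is downloading any more.
    return pending.get()
```

Whatever should run while nothing is downloading can simply follow the call to `download_all`, since `join()` only returns after all workers are done.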

Community
Hank Gay
  • Thank you for the link. A pool object sounds practical and relevant. I'm still academically interested in the primary question. How do I tell if all of the semaphores are available? – Jacob Jul 11 '12 at 19:33
  • @user831850 A little inspection on the REPL indicates that, at least in CPython 2.6, there is a "private" attribute, `_Semaphore__value`, that is the counter the semaphore uses internally. You could compare its current value to the original value (which I presume you have hanging around somewhere), though you probably want to do some work to make sure everything is atomic where it should be. That sounds like a big hack, though, so maybe I'm missing something more obvious. – Hank Gay Jul 12 '12 at 13:00
  • Presumably I would want to acquire the "private" lock object as well. That's the hack I had been thinking about originally. I figured I must be missing something and so I asked the question. – Jacob Jul 12 '12 at 15:50
1

You should definitely not try to peek into the counter value of the semaphore. Doing so breaks the abstraction of the semaphore. Moreover, the value you read may not even be correct, for two reasons. First, another thread might change the count before you can actually make use of the value you read (this depends on the scheduling policy). Second, your read is not atomic. So hacking around might work, but it is not guaranteed to work all the time.

A better approach is to keep your own counter variable in the program, taking care to synchronize access to it.
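One way this could look is sketched below, assuming a `threading.Condition` guards the counter. The `DownloadTracker` class and its method names are made up for illustration; the semaphore still limits concurrency, while the counter records how many downloads are active.

```python
import threading


class DownloadTracker(object):
    """Hypothetical helper: limits concurrent downloads and tracks how many are active."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._cond = threading.Condition()
        self._active = 0  # only read or written while holding self._cond

    def __enter__(self):
        self._slots.acquire()
        with self._cond:
            self._active += 1
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        with self._cond:
            self._active -= 1
            if self._active == 0:
                self._cond.notify_all()  # wake threads waiting for idle
        self._slots.release()
        return False  # do not swallow exceptions

    def run_when_idle(self, func):
        # Wait until no download is active, then call func while still
        # holding the condition so that no new download can register.
        with self._cond:
            while self._active > 0:
                self._cond.wait()
            return func()
```

Each worker would wrap its download in `with tracker:`, and the other thread would call `tracker.run_when_idle(some_function)`. Because `func` runs while the condition is held, it should be short; a worker that starts in the meantime blocks in `__enter__` until `func` returns.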

CPS
0

I am not sure I understood the question completely, but do you think the following code would work?

from threading import Semaphore
p = Semaphore(5)  # allow up to 5 acquisitions before blocking
p.acquire()
download_file()
...
...

In this code, p.acquire() would return True five times, after which the next call would block. Wouldn't that blocking call suffice for your purpose?

Sudhir Krishnan
  • I have separate threads downloading at the same time. Each one acquires the semaphore before downloading and releases it afterwards. I want to block until no other thread currently has the semaphore acquired. – Jacob Jul 11 '12 at 19:29
  • The only way I can think of is to have a global variable count, plus a separate lock that each thread uses to synchronize access to this variable. When a thread acquires the semaphore, it increments count, and when it releases the semaphore, it decrements count. In the third thread, you would check the value of count. You would want to acquire only when count is 0, right? Hope it helps. – Sudhir Krishnan Jul 11 '12 at 19:47