260

I need a queue which multiple threads can put stuff into, and multiple threads may read from.

Python has at least two queue classes, Queue.Queue and collections.deque, with the former seemingly using the latter internally. Both claim to be thread-safe in the documentation.

However, the Queue docs also state:

collections.deque is an alternative implementation of unbounded queues with fast atomic append() and popleft() operations that do not require locking.

Which I guess I don't quite unterstand: Does this mean deque isn't fully thread-safe after all?

If it is, I may not fully understand the difference between the two classes. I can see that Queue adds blocking functionality. On the other hand, it loses some deque features like support for the in-operator.

Accessing the internal deque object directly, is

x in Queue().deque

thread-safe?

Also, why does Queue employ a mutex for it's operations when deque is thread-safe already?

Federico Baù
  • 6,013
  • 5
  • 30
  • 38
miracle2k
  • 29,597
  • 21
  • 65
  • 64
  • `RuntimeError: deque mutated during iteration` is what you could be getting is using a shared `deque` between several thread and no locking... – toine Nov 07 '15 at 16:28
  • 8
    @toine that doesn't have anything to do with threads though. You can get this error whenever you add/delete to a `deque` while iterating even in the same thread. The only reason you can't get this error from `Queue` is that `Queue` doesn't support iteration. – max Feb 22 '17 at 00:16
  • If you have the book "Effective Python", there's a really nice comparison of Queue vs deque in a multi-threaded use case in Item 55 ("Use Queue to Coordinate Work Between Threads"). – kennysong Aug 13 '21 at 04:17

8 Answers8

401

Queue.Queue and collections.deque serve different purposes. Queue.Queue is intended for allowing different threads to communicate using queued messages/data, whereas collections.deque is simply intended as a datastructure. That's why Queue.Queue has methods like put_nowait(), get_nowait(), and join(), whereas collections.deque doesn't. Queue.Queue isn't intended to be used as a collection, which is why it lacks the likes of the in operator.

It boils down to this: if you have multiple threads and you want them to be able to communicate without the need for locks, you're looking for Queue.Queue; if you just want a queue or a double-ended queue as a datastructure, use collections.deque.

Finally, accessing and manipulating the internal deque of a Queue.Queue is playing with fire - you really don't want to be doing that.

Keith Gaughan
  • 21,367
  • 3
  • 32
  • 30
  • If you need a thread-safe queue and performance is important, `deque` seems a better option than `Queue`: see Jonathan's answer. Plus with `deque` you get list-like flexibility and visibility as a bonus, although that is not all thread-safe. – fantabolous Dec 07 '15 at 07:30
  • 22
    No, that's not a good idea at all. If you look at the source of `Queue.Queue`, it uses `deque` under the hood. `collections.deque` is a collection, while `Queue.Queue` is a communications mechanism. The overhead in `Queue.Queue` is to make it threadsafe. Using `deque` for communicating between threads will only lead to painful races. Whenever `deque` happens to be threadsafe, that's a happy accident of how the interpreter is implemented, and *not* something to be relied upon. That's why `Queue.Queue` exists in the first place. – Keith Gaughan Dec 07 '15 at 14:02
  • The docs say `deque` is thread-safe for pops and appends, which are all you need for a simple queue, perhaps wrapping `pop`s with a `try: except IndexError` for when it's empty. Some interpreters disobey this? It's true I stick with cpython. I use `deque` myself because `Queue` is way too slow for my purposes. https://docs.python.org/2/library/collections.html#deque-objects – fantabolous Dec 08 '15 at 01:22
  • 3
    Just keep in mind that if you're communicating across threads, you're playing with fire by using deque. deque is threadsafe *by accident* due to the existence of the GIL. An GIL-less implementation will have completely different performance characteristics, so discounting other implementations isn't wise. Besides, have you timed Queue vs deque for use across threads as opposed to a naive benchmark of its use in a single thread? If your code is *that* sensitive to the speed of Queue vs deque, Python might not be the language you're looking for. – Keith Gaughan Dec 09 '15 at 17:24
  • Ok, thanks for the info. Yes, my benchmarks are across threads. True python isn't the fastest language but I use it because of `pandas`. – fantabolous Dec 10 '15 at 05:38
  • 15
    @KeithGaughan `deque is threadsafe by accident due to the existence of GIL`; it is true that `deque` relies on GIL to ensure thread-safety - but it's not `by accident`. The official python documentation clearly states that `deque` `pop*`/`append*` methods are thread-safe. So any valid python implementation must provide the same guarantee (GIL-less implementations will have to figure out how to do that without GIL). You can safely rely on those guarantees. – max Feb 22 '17 at 00:20
  • 4
    @fantabolous my previous comment notwithstanding, I don't quite understand how you would use `deque` for communication. If you wrap `pop` into a `try/except`, you will end up with a busy loop eating up enormous amount of CPU just waiting for new data. This seems like a horribly inefficient approach compared to the blocking calls offered by `Queue`, which ensure that the thread waiting for data will go to sleep and not waste CPU time. – max Feb 22 '17 at 00:22
  • 4
    You might want to take a read of the source code for `Queue.Queue` then, because it's written using `collections.deque`: https://hg.python.org/cpython/file/2.7/Lib/Queue.py - it uses condition variables to efficiently allow the `deque` it wraps to be accessed over thread boundaries safely and efficiently. The explanation of how you'd use a `deque` for communication is right there in the source. – Keith Gaughan Mar 07 '17 at 20:56
  • brian-brazil and @KeithGaughan both of your answers contradict each other as well as the python documentation. The doc I'm reading says both are thread safe. So who is right in this case? – Peter Moore Oct 07 '21 at 17:25
  • `collections.deque` is _technically_ threadsafe, but only on CPython due to how the GIL works, and even then it's not _guaranteed_ to be so. `queue.Queue` (because Python 3 exists now) is _guaranteed_ to be threadsafe. If you want a threadsafe queue, you use the latter. To quote the docs: "It is especially useful in threaded programming when information must be exchanged safely between multiple threads." There's no debate here. – Keith Gaughan Oct 08 '21 at 20:25
63

If all you're looking for is a thread-safe way to transfer objects between threads, then both would work (both for FIFO and LIFO). For FIFO:

Note:

  • Other operations on deque might not be thread safe, I'm not sure.
  • deque does not block on pop() or popleft() so you can't base your consumer thread flow on blocking till a new item arrives.

However, it seems that deque has a significant efficiency advantage. Here are some benchmark results in seconds using CPython 2.7.3 for inserting and removing 100k items

deque 0.0747888759791
Queue 1.60079066852

Here's the benchmark code:

import time
import Queue
import collections

q = collections.deque()
t0 = time.clock()
for i in xrange(100000):
    q.append(1)
for i in xrange(100000):
    q.popleft()
print 'deque', time.clock() - t0

q = Queue.Queue(200000)
t0 = time.clock()
for i in xrange(100000):
    q.put(1)
for i in xrange(100000):
    q.get()
print 'Queue', time.clock() - t0
Jonathan Livni
  • 101,334
  • 104
  • 266
  • 359
  • 1
    You claim that "Other operations on `deque` may not be thread safe". Where do you get that from? – Matt Apr 23 '14 at 18:28
  • @Matt - rephrased to better convey my meaning – Jonathan Livni Apr 28 '14 at 21:36
  • 3
    Ok, thanks. That was stopping me from using deque because I thought you knew something I didn't. I guess I'll just assume it's thread safe until I discover otherwise. – Matt Apr 29 '14 at 13:17
  • @Matt "The deque's append(), appendleft(), pop(), popleft(), and len(d) operations are thread-safe in CPython." source: https://bugs.python.org/issue15329 – Filippo Vitale Apr 20 '20 at 19:25
  • 2
    The docs for new versions of both Python 2 and 3 state that "Deques support thread-safe, memory efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction." – Elias Hasle May 06 '21 at 10:34
  • # python 3 # deque 0.0154882 # Queue 0.4530072 # deque 0.0149755 # LifoQueue 0.350331 import time import queue import collections ##### Queue (FIFO) Test ###### q = collections.deque() t0 = time.perf_counter() for i in range(100000): q.append(1) for i in range(100000): q.popleft() print ('deque', time.perf_counter() - t0) q = queue.Queue(200000) t0 = time.perf_counter() for i in range(100000): q.put(1) for i in range(100000): q.get() print( 'Queue', time.perf_counter() - t0) – Jichao Dec 02 '21 at 03:23
13

For information there is a Python ticket referenced for deque thread-safety (https://bugs.python.org/issue15329). Title "clarify which deque methods are thread-safe"

Bottom line here: https://bugs.python.org/issue15329#msg199368

The deque's append(), appendleft(), pop(), popleft(), and len(d) operations are thread-safe in CPython. The append methods have a DECREF at the end (for cases where maxlen has been set), but this happens after all of the structure updates have been made and the invariants have been restored, so it is okay to treat these operations as atomic.

Anyway, if you are not 100% sure and you prefer reliability over performance, just put a like Lock ;)

BadWolf
  • 355
  • 3
  • 5
11

All single-element methods on deque are atomic and thread-safe. All other methods are thread-safe too. Things like len(dq), dq[4] yield momentary correct values. But think e.g. about dq.extend(mylist): you don't get a guarantee that all elements in mylist are filed in a row when other threads also append elements on the same side - but thats usually not a requirement in inter-thread communication and for the questioned task.

So a deque is ~20x faster than Queue (which uses a deque under the hood) and unless you don't need the "comfortable" synchronization API (blocking / timeout), the strict maxsize obeyance or the "Override these methods (_put, _get, ..) to implement other queue organizations" sub-classing behavior, or when you take care of such things yourself, then a bare deque is a good and efficient deal for high-speed inter-thread communication.

In fact the heavy usage of an extra mutex and extra method ._get() etc. method calls in Queue.py is due to backwards compatibility constraints, past over-design and lack of care for providing an efficient solution for this important speed bottleneck issue in inter-thread communication. A list was used in older Python versions - but even list.append()/.pop(0) was & is atomic and threadsafe ...

kxr
  • 4,841
  • 1
  • 49
  • 32
6

Adding notify_all() to each deque append and popleft results in far worse results for deque than the 20x improvement achieved by default deque behavior:

deque + notify_all: 0.469802
Queue:              0.667279

@Jonathan modify his code a little and I get the benchmark using cPython 3.6.2 and add condition in deque loop to simulate the behaviour Queue do.

import time
from queue import Queue
import threading
import collections

mutex = threading.Lock()
condition = threading.Condition(mutex)
q = collections.deque()
t0 = time.clock()
for i in range(100000):
    with condition:
        q.append(1)
        condition.notify_all()
for _ in range(100000):
    with condition:
        q.popleft()
        condition.notify_all()
print('deque', time.clock() - t0)

q = Queue(200000)
t0 = time.clock()
for _ in range(100000):
    q.put(1)
for _ in range(100000):
    q.get()
print('Queue', time.clock() - t0)

And it seems the performance limited by this function condition.notify_all()

collections.deque is an alternative implementation of unbounded queues with fast atomic append() and popleft() operations that do not require locking. docs Queue

fantabolous
  • 21,470
  • 7
  • 54
  • 51
nikan1996
  • 61
  • 1
  • 2
2

deque is thread-safe. "operations that do not require locking" means that you don't have to do the locking yourself, the deque takes care of it.

Taking a look at the Queue source, the internal deque is called self.queue and uses a mutex for accessors and mutations, so Queue().queue is not thread-safe to use.

If you're looking for an "in" operator, then a deque or queue is possibly not the most appropriate data structure for your problem.

n611x007
  • 8,952
  • 8
  • 59
  • 102
brian-brazil
  • 31,678
  • 6
  • 93
  • 86
  • 1
    Well, what I want to do is make sure that no duplicates are added to the queue. Is this not something a queue could potentially support? – miracle2k Apr 04 '09 at 15:06
  • 1
    It'd probably be best to have a separate set, and update that when you add/remove something from the queue. That'll be O(log n) rather than O(n), but you'll have to be careful to keep the set and queue in sync (i.e. locking). – brian-brazil Apr 04 '09 at 15:41
  • A Python set is a hash table, so it would be O(1). But yes, you would still have to do locking. – AFoglia Jan 18 '12 at 15:27
  • @brian-brazil and KeithGaughan both of your answers contradict each other as well as the python documentation. The doc I'm reading says both are thread safe. So who is right in this case? – Peter Moore Oct 07 '21 at 17:25
1

(seems I don't have reputation to comment...) You need to be careful which methods of the deque you use from different threads.

deque.get() appears to be threadsafe, but I have found that doing

for item in a_deque:
   process(item)

can fail if another thread is adding items at the same time. I got an RuntimeException that complained "deque mutated during iteration".

Check collectionsmodule.c to see which operations are affected by this

  • this kind of error is not special for threads and principal thread-safety. It happens e.g. by just doing `>>> di = {1:None} >>> for x in di: del di[x]` – kxr Apr 19 '17 at 15:23
  • 3
    Basically you should never loop over something that might be modified by another thread (although in some cases you could do it by adding your own protection). Like a Queue, you're intended to pop/get an item off the queue before processing it, and you would normally do that with a `while` loop. – fantabolous Oct 25 '19 at 09:20
0

I realize the opening question is asking about differences between queue and deque. Others have provided lots of good feedback on these, I wanted to give a different perspective.

I was using threading to listen to ports, and using a queue to get the received traffic back to the main loop. Originally I was using queue and observed performance problems with it. I replaced queue with deque and saw a modest performance increase.

In the end, the biggest performance increase was from getting rid of threading and queues entirely. socket.socket has a settimeout argument that I used to achieve the same behavior I was depending on from queue and deque, which is to raise an exception if it times out or there's nothing to get.

Wayne Workman
  • 449
  • 7
  • 14