QSerialPort::waitForBytesWritten "subtle errors" with an event loop?

Question

In an answer on another question, it's claimed that when using a main Qt event-loop with QSerialPort, the QSerialPort::waitForBytesWritten can cause "subtle errors" because it opens another event loop.

I'm using the waitForBytesWritten method to ensure that my main event-loop is not re-entered until all bytes have been written. Is this not a valid way to do this? What "subtle errors" can occur? Should I use flush() instead, and will this ensure that my event-loop is not entered again until all pending bytes have been written?

I am seeing some strange behavior in my application related to its serial-port communications, but I'm not sure what "layer" of communications is causing the issue, so it may not even be related to my use of QSerialPort. At the moment, the only hint I have regarding its origin is that the QSerialPort emits a ResourceError shortly before I see the poor behavior start. Can this ResourceError be related to my use of waitForBytesWritten?

I'm using Qt 5.5.1 on a Debian 7 system.

It's hard to tell why the `ResourceError` happens without seeing some code. — Kuba hasn't forgotten Monica, May 27 '16 at 18:15
@KubaOber I'm not sure how much code I can provide in this case, because it's a fairly complex app and it's part of a proprietary product. Even if I trim it down to some smaller example, I'm not sure which parts are relevant, and I can't create a minimal reproducible case because the communications layer, the process on the other side of the serial communications channel, etc are all proprietary. — Kyle Strand, May 27 '16 at 18:20
.......but even some idea of what, exactly, can even trigger a `ResourceError` on a hardware serial port would help. — Kyle Strand, May 27 '16 at 18:21
As an aside: Please explain why you think you need to `waitForBytesWritten`: what do you think is the meaning of that operation in terms of the state prior to and after invoking it. What are you trying to achieve by calling it? — Kuba hasn't forgotten Monica, May 27 '16 at 21:05
@KubaOber Primarily to send acknowledgments of valid packets as soon as possible, once I've read them. For this protocol an acknowledgment is only 4 or 5 bytes, so the idea was to write them to the underlying device driver *immediately*. — Kyle Strand, May 27 '16 at 22:42
@KubaOber Ah, there's another reason I'd forgotten--my transport layer is using the `QSerialPort` via a pointer to `QIODevice`. So I don't have access to the `flush` method where I'm making the `waitForBytesWritten` call. — Kyle Strand, May 27 '16 at 22:46
You can do this, then: `if (qobject_cast(dev)) qobject_cast(dev)->flush() else ...;` — Kuba hasn't forgotten Monica, May 30 '16 at 16:18
`waitForBytesWritten(0)` does the immediate writing and doesn't block :) — Kuba hasn't forgotten Monica, May 30 '16 at 16:19

score 1 · Accepted Answer · edited Jun 20 '20 at 09:12

On Unix, the read notification maps EAGAIN, EIO, and EBADF to ResourceError and stops processing of further read notifications until you invoke a waitFor function or until you reopen the port. This makes the QSerialPort useless for reads until you reopen or waitFor. It makes sense to map EIO and EBADF to the same error perhaps. But, since EAGAIN is benign, it looks like a Qt bug to treat it that way.

To see if that's the problem, you could clear the error and reinvoke waitForReadyRead(0) or waitForBytesWritten(0) and see if the error reappears:

bool retryWaitForBytesWritten(int n, int retries = 10) {
  if (!dev.bytesToWrite()) return false;
  while (retries-- && !dev.waitForBytesWritten(n)) {
    if (dev.error() != QSerialPort::ResourceError)
      return false;
    dev.clearError(); // retry if it was a resource error
  }
  return true;
}

You mainly should not use waitForBytesWritten because:

It doesn't do what you perhaps think it does. All that it ensures is that the internal write buffer in the QSerialPort has been emptied by feeding the data to the operating system using the write syscall. This doesn't mean that the data has been physically sent.
It potentially blocks. It's not guaranteed to block.
Getting informed of when a write is finished is usually unnecessary. If you have a half-duplex (essentially) command-response protocol, you don't care when you finished sending, but when the response has arrived or timed out.

If you have a streaming protocol where you send poll commands and don't want to generate them faster than the device can consume them, you should issue the commands (and mark them as "issued" in a data structure) until the bytesToWrite is past a high watermark. You then suspend adding new commands until you're informed by the bytesWritten signal that the send buffer is at a low watermark. The readyRead slot matches up the responses to the issued commands and indicates them appropriately.

So, ideally, your code should contain zero waitXxx calls. The world around us is asynchronous. You don't wait for things, you react to them. Don't stop the event loop until something happens - that's backwards. Other things may happen before that thing you wait on, and you're ignoring them till then. Even if you push the I/O to a separate thread, you're still wasting a whole thread just blocking. Handling events that redirect flow of control is also hard in sequential pseudo-synchronous code - everything ends up littered with (necessary!) error checks. Sometimes people resort to using exceptions for that, which might be OK, but gets very hard to do correctly in large code bases, and anytime you wish to redirect control to code that wasn't on the call stack. It's much easier to reason about the correctness of your code when faced with the asynchronous events, such as incoming data or errors, when you explicitly express it as a hierarchical state machine.

Design your system so that it progresses based on the events that happen asynchronously. If some aspect of your UI must indicate the state of the data being sent, just make it so: don't wait, don't block the thread.

This answer demonstrates how to implement asynchronous I/O for a device based on QIODevice, using the state machine framework. Once you've got your states set up, it's a relatively easy matter to tie them to UI behaviors. See these answers: one, two, three, four, five, six, seven, eight.

Also, I was wrong and I fixed the other answer. QSerialPort does not re-enter the event loop, neither do the sockets in the Network module.

I don't understand why you're saying that it's *always* wrong to use a function "because it blocks". That's *why* I'm using it; I *want* to block until the bytes are written. I only this to write about 8 or 10 bytes at a time (usually only 4), I want the bytes to be written literally ASAP, and if there's an error I want to know immediately. Trust me when I say that there is nothing else in my application that could possibly be more important than writing those bytes at the time I make the `waitForBytesWritten` call, including timers, etc. — Kyle Strand, May 27 '16 at 19:18
@KyleStrand That's OK, it's not the source of your issue anyway. I think you're hitting a genuine Qt bug. You could argue that writing the code in asynchronous is an issue of style, my experience is that it leads to horrible spaghetti as soon as your communications are less than trivial. Even for simple protocols, asynchronous way can lead to more declarative style. But that's beside the point here. — Kuba hasn't forgotten Monica, May 27 '16 at 19:22
Thanks, this is very helpful so far--and I think you're right about `EAGAIN`; this just means that there's nothing available to read at the time an asynchronous read-operation completes, correct? (Or...something like that?) There's still one part you haven't answered--assuming I *do* want to block on a small "write" operation, is `flush` preferable to `waitForBytesWritten`? — Kyle Strand, May 27 '16 at 20:28
@KyleStrand `flush()` is preferable. There's little point in you waiting for it if you're not streaming data - since it won't ensure that the port has actually transmitted any data. On an old Athlon XP Linux box, and on a recent OS X, both with PCI hardware ports I've just tried it on, `flush()` returns **before** any externally visible change has been done to the port pin states. If you are streaming data, then you can enforce buffer watermarks by reacting to `bytesWritten` signal anyway. Those are not device buffers. They are Qt buffers! — Kuba hasn't forgotten Monica, May 27 '16 at 21:01
@KyleStrand Note that when you block, the CPU core is generally reassigned to do *other threads' work* if they are runnable. On that core, your threads' data goes out from the caches etc. If you were reacting to asynchronous signals, you could ensure that if your thread is not dealing with the ports, it can do some other work for *your process* instead of for some other process - if such work is available to be done, of course. This may not be important at all in your case, of course, it's just an observation. Blocking does relinquish wall time *irrevocably* on fully loaded cores. — Kuba hasn't forgotten Monica, May 27 '16 at 21:08
All the better, actually, in my case. Thanks for the thoughts and the information! — Kyle Strand, May 28 '16 at 22:55
And it looks like closing/re-opening the port when I see a ResourceError fixes my issue. — Kyle Strand, May 28 '16 at 22:59
@KyleStrand You don't have to close/reopen. Reissuing a `waitFor` should fix it too. It'd be nice if you could check that. — Kuba hasn't forgotten Monica, May 30 '16 at 16:16
Well, since I was already using `waitForBytesWritten`, that definitely didn't resolve my issue. But would you expect `waitForReadyRead` to work? — Kyle Strand, May 30 '16 at 18:42
@KyleStrand Both close/reopen and `waitForBytesXxx` restart the listener. The key is in reacting to an error: you must do the zero time wait when the error has been signaled. — Kuba hasn't forgotten Monica, May 30 '16 at 18:48

QSerialPort::waitForBytesWritten "subtle errors" with an event loop?

1 Answers1