I've implemented a non-blocking reader in Python, and I need to make it more efficient.
The background: I have massive amounts of output that I need to read from one subprocess (started with Popen()) and pass to another thread. Reading the output from that subprocess must not block for more than a few ms (preferably for as little time as is necessary to read available bytes).
Currently, I have a utility class which takes a file descriptor (stdout) and a timeout. I select() and readline(1) until one of three things happens:
- I read a newline
- my timeout (a few ms) expires
- select tells me there's nothing to read on that file descriptor.
Then I return the buffered text to the calling method, which does stuff with it.
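To make the current approach concrete, here is a minimal sketch of it. The names (NonBlockingReader, read_chunk) are placeholders, the 5 ms default is just an example value, and it uses os.read(fd, 1) on the raw integer descriptor in place of readline(1) so that a buffered file object cannot hide data from select(); the real class may differ in those details.

```python
import os
import select
import time


class NonBlockingReader:
    """Hypothetical sketch of the utility class described above."""

    def __init__(self, fd, timeout=0.005):
        self.fd = fd            # integer file descriptor, e.g. proc.stdout.fileno()
        self.timeout = timeout  # overall time budget in seconds (assumed value)

    def read_chunk(self):
        deadline = time.monotonic() + self.timeout
        buf = bytearray()
        while True:
            # Stop condition 2: the timeout has expired.
            if time.monotonic() >= deadline:
                break
            # Poll with a zero timeout: is anything readable right now?
            ready, _, _ = select.select([self.fd], [], [], 0)
            # Stop condition 3: select says there is nothing to read.
            if not ready:
                break
            byte = os.read(self.fd, 1)
            if not byte:
                break  # EOF: the writer closed its end
            buf += byte
            # Stop condition 1: a newline was read.
            if byte == b"\n":
                break
        return bytes(buf)
```

The cost is one select() call and one read() call per byte, which is where the inefficiency comes from.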
Now, for the real question: because I'm reading so much output, I need to make this more efficient. I'd like to do that by asking the file descriptor how many bytes are pending and then readline([that many bytes]). It's supposed to just pass stuff through, so I don't actually care where the newlines are, or even if there are any. Can I ask the file descriptor how many bytes it has available for reading, and if so, how?
I've done some searching, but I'm having a really hard time figuring out what to search for, let alone whether this is possible.
Even just a point in the right direction would be helpful.
Note: I'm developing on Linux, but that shouldn't matter for a "Pythonic" solution.