I want to write a Python program on Linux that reads a log file in real time as it is being written, for the purpose of sending an alarm if it detects certain things in the log. I want this to use asyncio for several reasons - I'm trying to build a framework that does many things at the same time based on asyncio, and I need the practice.

Since I'm using asyncio, I obviously don't want to use a blocking read to wait at the end of the input file for more lines to be written to it. I suspect I'll have to end up using select, but I'm not sure.

I suspect that this is pretty simple, but I have a hard time finding an example of how to do this, or coming up with one of my own even though I've dabbled a little bit in asyncio before. I can read and mostly understand other asyncio examples I find, but for some reason I find it difficult to write asyncio code of my own.

Therefore, I'd be very grateful if someone could point me to an example. Bonus points if the same technique also works for reading from stdin rather than a file.

Enfors
  • `with open()` supports this. Every new `read()` will continue from the last cursor position within the file. If there's new data, a `read()` will fetch it. `f.tell()` will tell you where you are in the file. – Torxed Feb 02 '19 at 11:26
  • `select` is probably your best bet, or checking the `os.stat(f).st_mtime` of the file. Or, use one of the methods mentioned [here](https://stackoverflow.com/questions/182197/how-do-i-watch-a-file-for-changes) to determine when you should do the read. I'm not a big fan of asyncio, but maybe it's worth a look. – Torxed Feb 02 '19 at 11:34
  • Won't `read()` just immediately return `EOF`, or something? – Enfors Feb 02 '19 at 11:35
  • Have ye faith, little one: https://i.imgur.com/IHmrCyu.png (notice the 6 at the end, and that `read()` isn't blocking). It will continue by returning `f.data[f.curpos:]`, which will be empty at times. – Torxed Feb 02 '19 at 11:41
  • I see, thanks. Yeah, that is workable I suppose. I was expecting that I'd be using something like `await fh.async_read()` or some such. – Enfors Feb 02 '19 at 11:45
  • `tail -f` uses a sleep between retrying reads, it doesn't block on a read. – cdarke Feb 02 '19 at 11:50

1 Answer

> I suspect I'll have to end up using select, but I'm not sure. I suspect that this is pretty simple, but I have a hard time finding an example of how to do this

With asyncio, the idea is that you don't need to select() yourself because asyncio selects for you - after all, a select() or equivalent is at the heart of every event loop. Asyncio provides abstractions like streams that implement a coroutine facade over the async programming model. There are also the lower-level methods that allow you to hook into select() yourself, but normally you should work with streams.
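To illustrate the stream abstraction, here is a minimal sketch that wraps a pipe (such as stdin) in an `asyncio.StreamReader`. The helper name `pipe_lines` is made up for this example. It assumes a Unix event loop and that the object passed in is a pipe, socket, or terminal; `connect_read_pipe()` rejects regular files, so this is not a way to follow a log file.

```python
import asyncio


async def pipe_lines(pipe):
    """Yield decoded lines from a pipe-like file object until EOF.

    `pipe_lines` is a hypothetical helper, not part of asyncio.
    """
    loop = asyncio.get_running_loop()
    reader = asyncio.StreamReader()
    protocol = asyncio.StreamReaderProtocol(reader)
    # Register the pipe with the event loop; the loop does the select() for us.
    await loop.connect_read_pipe(lambda: protocol, pipe)
    while True:
        line = await reader.readline()   # suspends this coroutine, not the loop
        if not line:                     # empty bytes object means EOF
            break
        yield line.decode().rstrip("\n")
```

For stdin, something like `async for line in pipe_lines(sys.stdin): ...` inside a coroutine should work when stdin is a terminal or a pipe (for example `tail -f app.log | python prog.py`, where both names are placeholders).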

In the case of `tail -f`, you can't use `select()` because regular files are always reported as readable. When there is no new data, a read returns EOF and you are expected to try again later. This is why `tail -f` historically used reads with pauses, with the option to use notification APIs like inotify where available.
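The read-with-pauses approach could be sketched in asyncio roughly as follows. The names `follow` and `watch` and the `poll_interval` parameter are invented for this example. The sketch assumes a local text file that is only appended to, and it does not handle log rotation or truncation. The `readline()` call is an ordinary synchronous read; it returns promptly on a local regular file, and `asyncio.sleep()` is what lets other tasks run between attempts.

```python
import asyncio
import os


async def follow(path, poll_interval=0.5):
    """Yield complete lines appended to `path` after we start watching."""
    with open(path, "r") as f:
        f.seek(0, os.SEEK_END)   # skip what is already in the file
        pending = ""             # holds a line that is only partly written
        while True:
            chunk = f.readline()
            if not chunk:
                # EOF for now: let other tasks run, then try again.
                await asyncio.sleep(poll_interval)
                continue
            pending += chunk
            if pending.endswith("\n"):
                yield pending.rstrip("\n")
                pending = ""


async def watch(path, needle):
    """Print an alarm for every new line that contains `needle`."""
    async for line in follow(path):
        if needle in line:
            print(f"ALARM: {line}")
```

A call such as `asyncio.run(watch("/var/log/syslog", "ERROR"))` would run the watcher on its own (the path and search string are placeholders); inside a larger framework, `watch(...)` can be scheduled as one task among many.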

user4815162342