I'm currently working on a parser for plaso. For this I need to read journald's binary log files and convert those to a plaso timeline object.

My question now is: how do I read a binary file in Python, given that the file may contain both strings and integers? Is a byte array sufficient for this? If so, how can I find the correct delimiters for the message fields?

Since I'm new to Python, I can't provide useful code just yet; I'm still trying to wrap my head around this.

  • Is this 2.7 or 3.x? Python made huge changes to the way binary is read. – yxre Mar 20 '15 at 14:59
  • oh, sorry, I'll edit this in. It's python 2.7 – David Mändlen Mar 20 '15 at 15:01
  • Maybe the [Journal File Format](http://www.freedesktop.org/wiki/Software/systemd/journal-files/) doc and [this answer](http://stackoverflow.com/questions/1035340/reading-binary-file-in-python) could help you – Mauro Baraldi Mar 20 '15 at 15:02
  • Thank you for your input, Mauro. I already read the doc; the problem I had is more along the lines of "How do I get my file into something I can use?" And I wasn't sure whether looping over every single byte was the Pythonic way to do this. – David Mändlen Mar 20 '15 at 15:08
  • You could always read the journal in JSON format. – Michael Hampton Jun 09 '15 at 01:38

1 Answer

You can handle binary data with the `struct` package.

If I were you, I would look up the layout of the journal file (in the journald docs or its source code) and parse the binary data into fields accordingly.
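
As a rough illustration of the `struct` approach, here is a minimal sketch that unpacks the first fields of a journal file header. The field layout in `HEADER_FORMAT` (an 8-byte signature `LPKSHHRH`, two 32-bit flag words, a state byte with 7 reserved bytes, four 128-bit IDs, then two 64-bit sizes, all little-endian) is my reading of the Journal File Format doc and should be checked against it; the names `JournalHeader`, `parse_header` and `read_header` are made up for this example.

```python
import struct
from collections import namedtuple

# Assumed layout of the leading header fields (little-endian, no padding).
# Verify this against the Journal File Format documentation before relying on it.
HEADER_FORMAT = "<8sIIB7x16s16s16s16sQQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

# Hypothetical container for the unpacked fields.
JournalHeader = namedtuple(
    "JournalHeader",
    "signature compatible_flags incompatible_flags state "
    "file_id machine_id boot_id seqnum_id header_size arena_size",
)


def parse_header(data):
    """Unpack the leading header fields from a bytes object."""
    if len(data) < HEADER_SIZE:
        raise ValueError("not enough data for a journal header")
    header = JournalHeader(*struct.unpack_from(HEADER_FORMAT, data, 0))
    if header.signature != b"LPKSHHRH":
        raise ValueError("unexpected signature: %r" % (header.signature,))
    return header


def read_header(path):
    """Open the file in binary mode and parse only the bytes we need."""
    with open(path, "rb") as journal_file:
        return parse_header(journal_file.read(HEADER_SIZE))
```

Opening the file with `"rb"` and handing fixed-size slices to `struct` avoids looping over single bytes; the same code should run on Python 2.7 and 3.x.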

Deck
  • This is what I tried to do. I just wasn't sure how to handle the parsing. Maybe the struct package is the key to this. Thank you. – David Mändlen Mar 20 '15 at 15:07
  • @DavidMändlen Yes, reading the binary file itself is simple. You just need to use the proper `format` string according to the journald docs. – Deck Mar 20 '15 at 15:09
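
On the question of delimiters: as far as I understand the format doc, the journal has none. Each object starts with a small fixed header that carries its own size, so you step from object to object by length. The sketch below assumes a 16-byte object header (type byte, flags byte, 6 reserved bytes, 64-bit little-endian size that includes the header) and 8-byte alignment between objects; both are assumptions to verify against the docs, and `iter_objects` is a hypothetical helper name.

```python
import struct

# Assumed object header: type (1 byte), flags (1 byte), 6 reserved bytes,
# then a little-endian 64-bit size covering header plus body.
OBJECT_HEADER = struct.Struct("<BB6xQ")


def iter_objects(data, start=0):
    """Yield (type, flags, body) for each length-prefixed object in data."""
    offset = start
    while offset + OBJECT_HEADER.size <= len(data):
        obj_type, flags, size = OBJECT_HEADER.unpack_from(data, offset)
        # Stop on a corrupt or truncated object instead of looping forever.
        if size < OBJECT_HEADER.size or offset + size > len(data):
            break
        body = data[offset + OBJECT_HEADER.size:offset + size]
        yield obj_type, flags, body
        # Assumption: the next object starts at the next multiple of 8.
        offset += (size + 7) & ~7
```

The body of each object is type-specific and needs its own `format` string, and `start` would normally be the header size read from the file header.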