I have a collection of event logs that I want to read, reformat, and print in a cleaner, more readable form. An example of such a log:

[
  {
    "_source": {
      "message": "<message>",
      "tags": [
        "winlog",
        "2.4.0",
        "agents",
        "agents_input_codec_plain_applied"
      ],
      "@timestamp": "2023-08-22T15:11:14.146Z",
      "observer": {
        "ip": "<ip>"
      },
      "winlog": {
        "task": "Account Lockout",
        "computer_name": "<name>",
        "event_data": {
          "IpAddress": "<ip>",
          "TargetUserName": "<username>",
          "LogonType": "3",
          "SubjectUserName": "-",
          "TargetDomainName": "<domain>",
          "LogonTypeName": "network"
        },
        "keywords": [
          "Audit Failure"
        ],
        "event_id": <id>
      },
      "log": {
        "level": "information"
      },
      "event": {
        "action": "Account Lockout",
        "created": "2023-08-22T15:11:15.378Z",
        "code": 4625,
        "kind": "event"
      },
    "fields": {
      "@timestamp": [
        "2023-08-22T15:11:14.146Z"
      ],
      "event.created": [
        "2023-08-22T15:11:15.378Z"
      ]
    },
    "sort": [
      1692717074146
    ]
  },
   <next event log>
]

I tried opening the log file, loading the JSON, and iterating over it to access and print the log information, but I receive an error and wonder if there is a workaround.

with open('logs.json', 'r') as json_file:
    logs = json.load(json_file)

for i in logs['event']:
    print(i['action'])

Error: list indices must be integers or slices, not str.

for i in logs:
    print(i)

This code works, but just prints all logs out verbatim.

for i in logs:
    print(logs[i]["event"])

This throws the same error as the first example.

Given this log format and file, how can I access individual fields? In particular, I am trying to access the data in winlog and event.

Karl Knechtel
grayhydra
  • Your JSON data is a list, so you have to write Python code that expects `logs` to be a list – Anentropic Aug 23 '23 at 17:47
  • The last example is closest: `for i in logs:`. It fails because when you iterate over a list in Python, the loop variable takes the value of each item in the list (i.e. not an index number), so `logs[i]` fails. Instead you could do `for log in logs:` and `print(log['event'])` – Anentropic Aug 23 '23 at 17:49
  • Except it would have to be `print(log["_source"]["event"])`, because that's how your JSON data is structured – Anentropic Aug 23 '23 at 17:50
  • This has **nothing to do with** JSON. Once you have successfully parsed the JSON data, you **just** have a **perfectly ordinary** nested data structure of dicts and lists, which you work with **the exact same way** as if you had gotten **the same data, by any other means**. Think carefully about **what result you get** for `i` when doing `for i in logs:`, and then think about **how to get what you want, from** that `i` result. – Karl Knechtel Aug 23 '23 at 18:01

1 Answer

The parsing part is easy: you already called json.load successfully, so the data is now ordinary Python data structures. We can open the Python shell and experiment:

>>> type(logs)
<class 'list'>

logs is a list, and iterating over a list yields its members, not their indexes. We might as well call each member "log" and grab the first one. The members are dictionaries, so we can look at their keys to see how to navigate the structure.

>>> log = logs[0]
>>> log.keys()
dict_keys(['_source'])
>>> source = log["_source"]
>>> source.keys()
dict_keys(['message', 'tags', '@timestamp', 'observer', 'winlog', 'log', 'event', 'fields', 'sort'])
>>> event = source["event"]
>>> event.keys()
dict_keys(['action', 'created', 'code', 'kind'])

So, to look at the event of each log, iterate over the list and index into each entry:

>>> for log in logs:
...     event = log["_source"]["event"]
...     print(event)
... 
{'action': 'Account Lockout', 'created': '2023-08-22T15:11:15.378Z', 'code': 4625, 'kind': 'event'}
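
Since the question asks about both winlog and event, the same indexing can be extended into a small formatter. This is only a sketch: the sample data is inlined so it runs on its own, the values HOST-01 and alice are made-up placeholders, summarize is a hypothetical helper name, and using .get with a "?" default is an assumption for logs that may lack some keys.

```python
import json

# Inline sample mirroring the question's structure.
# The host and user values are made-up placeholders.
SAMPLE = """
[
  {
    "_source": {
      "@timestamp": "2023-08-22T15:11:14.146Z",
      "winlog": {
        "task": "Account Lockout",
        "computer_name": "HOST-01",
        "event_data": {"TargetUserName": "alice", "LogonTypeName": "network"}
      },
      "event": {"action": "Account Lockout", "code": 4625, "kind": "event"}
    }
  }
]
"""


def summarize(log):
    """Return a one-line summary of a single log entry (hypothetical helper)."""
    source = log.get("_source", {})
    winlog = source.get("winlog", {})
    event = source.get("event", {})
    event_data = winlog.get("event_data", {})
    # .get() with a default avoids a KeyError when an entry lacks a field.
    return "{ts} | {code} {action} | host={host} user={user} logon={logon}".format(
        ts=source.get("@timestamp", "?"),
        code=event.get("code", "?"),
        action=event.get("action", "?"),
        host=winlog.get("computer_name", "?"),
        user=event_data.get("TargetUserName", "?"),
        logon=event_data.get("LogonTypeName", "?"),
    )


logs = json.loads(SAMPLE)
for log in logs:  # each `log` is a dict, not an index
    print(summarize(log))
```

With a real file, json.loads(SAMPLE) would be replaced by the json.load call from the question. If every log is guaranteed to have these keys, plain indexing such as log["_source"]["winlog"]["task"] works just as well and fails loudly when the structure is not what you expect.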
tdelaney