I'll try to explain the problem as succinctly as possible. I'm trying to filter some values from a log file coming from Elastic. The log outputs exactly this payload (printed from Python, which is why it shows up as a dict rather than strict JSON):
{'took': 2, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 2, 'relation': 'eq'}, 'max_score': None, 'hits': [{'_index': 'winlogbeat-dc-2022.10.17-000014', '_type': '_doc', '_id': 'vOCnfoQBeS2JF7giMG9q', '_score': None, '_source': {'agent': {'hostname': 'SRVDC1'}, '@timestamp': '2022-11-16T04:19:13.622Z'}, 'sort': [-9223372036854775808]}, {'_index': 'winlogbeat-dc-2022.10.17-000014', '_type': '_doc', '_id': 'veCnfoQBeS2JF7giMG9q', '_score': None, '_source': {'agent': {'hostname': 'SRVDC1'}, '@timestamp': '2022-11-16T04:19:13.630Z'}, 'sort': [-9223372036854775808]}]}}
Now, I want to extract only the _index and @timestamp keys. If I assign this payload to a variable, I can pull both values out without any problem by running:
index = (data['hits']['hits'][0]['_index'])
timestamp = (data['hits']['hits'][0]['_source']['@timestamp'])
Output:
winlogbeat-dc-2022.10.17-000014
2022-11-16T04:19:13.622Z
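For reference, here is a trimmed-down sketch of that working case, with the payload assigned by hand as a Python dict (keeping only the keys I care about) and looping over every hit instead of just the first one:

# trimmed-down copy of the payload above, assigned directly as a Python dict
data = {'hits': {'hits': [
    {'_index': 'winlogbeat-dc-2022.10.17-000014',
     '_source': {'@timestamp': '2022-11-16T04:19:13.622Z'}},
    {'_index': 'winlogbeat-dc-2022.10.17-000014',
     '_source': {'@timestamp': '2022-11-16T04:19:13.630Z'}},
]}}

# pull _index and @timestamp out of every hit
for hit in data['hits']['hits']:
    print(hit['_index'], hit['_source']['@timestamp'])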
However, if I try to do the same directly from the server call, I get:
Traceback (most recent call last):
File "c:\Users\user\Desktop\PYTHON\tiny2.py", line 96, in <module>
query()
File "c:\Users\user\Desktop\PYTHON\tiny2.py", line 77, in query
index = (final_data['hits']['hits'][0]['_index'])
TypeError: string indices must be integers
Now, I understand that it's asking for integer indices instead of the string keys I'm using, but if I use integers, then I get individual characters rather than a key/value pair.
What am I missing?
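As far as I can tell, this is the error you get whenever the object being indexed is a str rather than a dict; a minimal reproduction (not my real code):

# a str that merely looks like the response, not a parsed dict
final_data = "{'hits': {'hits': []}}"
final_data['hits']   # TypeError: string indices must be integers

So the data coming back from the server call seems to be treated as one long string somewhere, but I can't see where.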
UPDATE: Below is the entire code, but it won't help much. It contains Elastic's DSL query and a call to the server, which you obviously won't be able to connect to. I tried your suggestions, but I either get the same error or a new one:
raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not ObjectApiResponse
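If I read that traceback correctly, json.loads only accepts a string (or bytes), so handing it the response object directly is rejected; a minimal reproduction:

import json

json.loads('{"took": 2}')   # fine: the input is a str
json.loads(object())        # TypeError: the JSON object must be str, bytes or bytearray, not object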
Entire code as follows:
import os
import ast
import csv
import json
from elasticsearch import Elasticsearch
from datetime import datetime, timedelta
import datetime

ELASTIC_USERNAME = 'elastic'
ELASTIC_PASSWORD = "abc123"
PORT = str('9200')
HOST = str('10.20.20.131')
CERT = os.path.join(os.path.dirname(__file__), "cert.crt")

initial_time = datetime.datetime.now()
past_time = datetime.datetime.now() - (timedelta(minutes=15))

def query():
    try:  # connection to the Elastic server
        es = Elasticsearch(
            "https://10.20.20.131:9200",
            ca_certs=CERT,
            verify_certs=False,
            basic_auth=(ELASTIC_USERNAME, ELASTIC_PASSWORD)
        )
    except ConnectionRefusedError as error:
        print("[-] Connection error")
    else:  # DSL Elastic query of Domain Controller logs
        query_res = es.search(
            index="winlogbeat-dc*",
            body={
                "size": 3,
                "sort": [
                    {
                        "timestamp": {
                            "order": "desc",
                            "unmapped_type": "boolean"
                        }
                    }
                ],
                "_source": [
                    "agent.hostname",
                    "@timestamp"
                ],
                "query": {
                    "bool": {
                        "must": [],
                        "filter": [
                            {
                                "range": {
                                    "@timestamp": {
                                        "format": "strict_date_optional_time",
                                        "gte": f'{initial_time}',
                                        "lte": f'{past_time}'
                                    }
                                }
                            }
                        ],
                        "should": [],
                        "must_not": []
                    }
                }
            }
        )
        if query_res:
            # this is the part that raises the TypeErrors shown above
            parse_to_json = json.loads(query_res)
            final_data = json.dumps(str(parse_to_json))
            index = ast.literal_eval(final_data)['hits']['hits'][0]['_index']
            timestamp = ast.literal_eval(final_data)['hits']['hits'][0]['_source']['@timestamp']
            columns = ['Index', 'Last Updated']
            rows = [[f'{index}', f'{timestamp}']]
            with open("final_data.csv", 'w') as csv_file:
                write_to_csv = csv.writer(csv_file)
                write_to_csv.writerow(columns)
                write_to_csv.writerows(rows)
            print("CSV file created!")
        else:
            print("Log not found")

query()