I have an upstream application sending following JSON message to an SNS topic. We use AWS lambda function to save this JSON object in S3:
{
"processResult": {
"processName": "XYZ",
"stageResults": {
"Read Files": {
"status": "PROCESSED",
"error": "",
"timeTaken": 26064469473
},
"Convert Files": {
"status": "PROCESSED",
"error": "",
"timeTaken": 97968896
}
},
"processMetrics": {
"filesProcessed": 1157,
"filesWithExceptionCount": 1,
"timeTaken": "367.460031s",
"metricsCalcTime": "6.061847s",
"totalTimeTaken": "373.521878s"
},
"succeeded": true
}
}
If you observe within the "stageResults" STRUCT, I have the attributes with spaces, like "Read Files" and "Convert Files" and when i am trying to read this JSON from S3 using AWS Athena (AWS Glue Crawler), I am getting following error:
HIVE_METASTORE_ERROR: com.facebook.presto.spi.PrestoException: Error: : expected at the position 51 of 'struct<....
......but ' ' is found. (Service: null; Status Code: 0; Error Code: null; Request ID: null)
But when I manually edit the JSON message by modifying "stageResults" STRUCT like "Read_Files" and "Convert_Files", I was able to read and query the JSON using AWS Athena tables successfully.
Following is the piece of code from AWS Lambda which triggers on SNS topic, reads JSON message and saves it in S3:
import json
import boto3
import random
import string
file_name = ''.join([random.choice(string.ascii_lowercase) for i in range(16)])
def lambda_handler(event, context):
target_bucket = 'bucket-name'
target_key = 'input=clientdata/' + file_name + '.json'
s3 = boto3.resource('s3')
for record in event['Records']:
payload = record["body"]
data = json.loads(payload)
print(data)
print("copying JSON message...")
s3.Object('target_bucket', 'target_key').put(
Body=(bytes(json.dumps(data).encode('UTF-8')))
)
Now, I want to know how to modify this JSON message before saving it to S3 using AWS Lambda. Any help is appreciated.