0

I have a lambda function that converts my logs to this format:

{
    "events": [
        {
            "field1": "value",
            "field2": "value",
            "field3": "value"
        }, (...)
    ]
}

When I query it on S3, I get in this format:

[
    {
        "events": [
            { (...) }
        ]
    }
]

And I'm trying to run a custom classifier for it because the data I want is inside the objects kept by 'events' and not events itself.

So I started with the simplest path I could think that worked in my tests (https://jsonpath.curiousconcept.com/)

$.events[*]

And, sure, worked in the tests but when I run a crawler against the file, the table created includes only an events field with a struct inside it.

So I tried a bunch of other paths:

$[*].events
$[*].['events']
$[*].['events'].[*]
$.[*].events[*]
$.events[*].[*]

Some of these does not even make sense and absolutely every one of those got me an schema with an events field marked as array.

Can anyone point me to a better direction to handle this issue?

FBidu
  • 972
  • 9
  • 21
  • 1
    Below you have the answers on the similar problems. – jbgorski Oct 03 '18 at 19:11
  • 1
    https://stackoverflow.com/questions/52031322/create-athena-table-from-nested-json-source/52081347#52081347 – jbgorski Oct 03 '18 at 19:12
  • 1
    https://stackoverflow.com/questions/51678474/issue-with-regexp-extract-function/52103045#52103045 – jbgorski Oct 03 '18 at 19:13
  • I've found an similar answer before but didn't quite work... For SerDe to work properly I need to send all my data in one-record-per-line or it can query 'standard' JSON as well? – FBidu Oct 03 '18 at 19:14
  • I have also considered leaving the data as it is, with an array of struct and maybe unpack it using a view but I find this a little cumbersome and I believe there must be a way to make jsonpath work properly... – FBidu Oct 03 '18 at 19:15
  • 1
    Try this one: https://stackoverflow.com/questions/52067065/aws-glue-custom-classifiers-json-path/52074078#52074078 – jbgorski Oct 03 '18 at 19:40

0 Answers0