It's certainly possible to specify which queryable fields you expect the payload
to contain and what those fields' mappings should be.
Let's say each doc will include the fields payload.livemode
and payload.created_at
. If these are the only two fields you'll want to perform queries on, and you'd like to disable dynamic, index-time mappings autogenerated by Elasticsearch for the rest of the fields, you can use dynamic templates like so:
PUT my-payload-index
{
"mappings": {
"dynamic_templates": [
{
"variable_payload": {
"path_match": "payload",
"mapping": {
"type": "object",
"dynamic": false,
"properties": {
"created_at": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
},
"livemode": {
"type": "boolean"
}
}
}
}
}
],
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"origin": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
}
Then, as you ingest your docs:
POST my-payload-index/_doc
{
"name": "abc",
"origin": "web.dev",
"payload": {
"created_at": "2021-04-05 08:00:00",
"livemode": false,
"abc":"def"
}
}
POST my-payload-index/_doc
{
"name": "abc",
"origin": "web.dev",
"payload": {
"created_at": "2021-04-05 08:00:00",
"livemode": true,
"modified_at": "2021-04-05 09:00:00"
}
}
and verify with
GET my-payload-index/_mapping
no new mappings will be generated for the fields payload.abc
nor payload.modified_at
.
Not only that — the new fields will also be ignored, as per the documentation:
These fields will not be indexed or searchable, but will still appear in the _source
field of returned hits.
Side note: if fields are neither stored nor searchable, they're effectively the opposite of enabled
.
The Big Picture
Working with variable contents of a single, top-level object is quite standard. Take for instance the stripe event
object — each event has an id
, an api_version
and a few other shared params. Then there's the data
object that's analogous to your payload
field.
Now, all is fine, until you need to aggregate on the contents of your payload
. See, since the content is variable, so are the data paths / accessors. But wildcards in aggregation paths don't work in Elasticsearch. Scripts do but are onerous to maintain.
Back to stripe. They partially solved it through what they call polymorphic, typed hashes — as discussed in their blog on API design:

A pretty neat approach that's worth emulating.
P.S. I discuss dynamic templates in more detail in the chapter "Mapping Automation" of my ES Handbook.