I'm pretty new to the Elasticsearch world, so I might be missing some concept.
Here is the scenario I don't understand:
I want to find documents that match the following criteria:
- category.level = A
- category.name = "John G." OR "Chris T."
- approved = yes (optional)
Mappings:
PUT data
{
  "mappings": {
    "properties": {
      "createdAt": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss.SSSZ"
      },
      "category": {
        "type": "nested",
        "properties": {
          "name": {
            "type": "text",
            "analyzer": "keyword"
          }
        }
      },
      "approved": {
        "type": "text",
        "analyzer": "keyword"
      }
    }
  }
}
Data:
POST data/_create/1
{
  "category": [
    {
      "name": "John G.",
      "level": "A"
    },
    {
      "name": "Mary F.",
      "level": "A"
    }
  ],
  "createdBy": "John",
  "createdAt": "2022-04-18 19:09:27.527+0200",
  "approved": "yes"
}
POST data/_create/2
{
  "category": [
    {
      "name": "John G.",
      "level": "A"
    },
    {
      "name": "Chris T.",
      "level": "A"
    }
  ],
  "createdBy": "John",
  "createdAt": "2022-04-18 19:09:27.527+0200",
  "approved": "no"
}
POST data/_create/3
{
  "category": [
    {
      "name": "John G.",
      "level": "C"
    },
    {
      "name": "Phil C.",
      "level": "C"
    }
  ],
  "createdBy": "John",
  "createdAt": "2022-04-18 19:09:27.527+0200",
  "approved": "no"
}
POST data/_create/4
{
  "category": [
    {
      "name": "John G.",
      "level": "A"
    },
    {
      "name": "Chris T.",
      "level": "A"
    }
  ],
  "createdBy": "John",
  "createdAt": "2020-04-18 19:09:27.527+0200",
  "approved": "yes"
}
POST data/_create/5
{
  "category": [
    {
      "name": "Unknown A.",
      "level": "A"
    },
    {
      "name": "Unknown B.",
      "level": "A"
    }
  ],
  "createdBy": "Unknown",
  "createdAt": "2020-08-18 19:09:27.527+0200",
  "approved": "yes"
}
Query:
GET data/_search
{
  "query": {
    "nested": {
      "path": "category",
      "query": {
        "bool": {
          "must": [
            {"match": {"category.level": "A"}}
          ],
          "should": [
            {"term": {"category.name": "John G."}},
            {"term": {"category.name": "Chris T."}},
            {"term": {"approved": "yes"}}
          ],
          "minimum_should_match": 1
        }
      }
    }
  }
}
Response:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.4455402,
    "hits" : [
      {
        "_index" : "data",
        "_id" : "2",
        "_score" : 1.4455402,
        "_source" : {
          "category" : [
            {
              "name" : "John G.",
              "level" : "A"
            },
            {
              "name" : "Chris T.",
              "level" : "A"
            }
          ],
          "createdBy" : "John",
          "createdAt" : "2022-04-18 19:09:27.527+0200",
          "approved" : "no"
        }
      },
      {
        "_index" : "data",
        "_id" : "4",
        "_score" : 1.4455402,
        "_source" : {
          "category" : [
            {
              "name" : "John G.",
              "level" : "A"
            },
            {
              "name" : "Chris T.",
              "level" : "A"
            }
          ],
          "createdBy" : "John",
          "createdAt" : "2020-04-18 19:09:27.527+0200",
          "approved" : "yes"
        }
      },
      {
        "_index" : "data",
        "_id" : "1",
        "_score" : 1.151647,
        "_source" : {
          "category" : [
            {
              "name" : "John G.",
              "level" : "A"
            },
            {
              "name" : "Mary F.",
              "level" : "A"
            }
          ],
          "createdBy" : "John",
          "createdAt" : "2022-04-18 19:09:27.527+0200",
          "approved" : "yes"
        }
      }
    ]
  }
}
Questions:
- Why is the first document returned one with approved = no? I was expecting docs with approved = yes to be scored higher.
- Why is the doc with _id = 5 not returned? It doesn't meet the category.name criterion, but it does meet approved = yes.
- The optionality of approved = yes is not expressed in the query above. How could I add a kind of separate should clause with minimum_should_match: 0? Something that would increase the score but would not filter the results.