Suppose I have 3 documents:

doc_1 = {
    "citedIn": [
        "Bar Councils Act, 1926 - Section 15",
        "Contract Act, 1872 - Section 23"
    ]
}

doc_2 = {
    "citedIn":[
        "15 C. B 400", 
        "Contract Act, 1872 - Section 55"
    ]
}

doc_3 = {
    "citedIn":[
        "15 C. B 400", 
        "Contract Act, 1872 - Section 15"
    ]
}

Here the citedIn field is an array. Now I want to run a standard match query:

{
    "query": {
        "match": {"citedIn": {"query": "Contract act 15", "operator": "and"}}
    }
}

The above query returns all 3 documents, but it is supposed to return only doc_3, since only doc_3 contains Contract, act and 15 together in a single array element.

How can I achieve this?

Any suggestion or solution would be appreciated.

Nested data type update:

I did try a nested field. This is my mapping:

{
    "mappings": {
        "properties": {
            "citedIn": {
                "type": "nested",
                "include_in_parent": true,
                "properties": {
                    "someFiled": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    }
                }
            }
        }
    }
}

This is my data

doc_1 = {
    "citedIn": [
        {"someFiled" : "Bar Councils Act, 1926 - Section 15"},
        {"someFiled" : "Contract Act, 1872 - Section 23"}
    ]
}

doc_2 = {
    "citedIn":[
        {"someFiled" : "15 C. B 400"},
        {"someFiled" : "Contract Act, 1872 - Section 55"}
    ]
}

doc_3 = {
    "citedIn":[
        {"someFiled" : "15 C. B 400"},
        {"someFiled" : "Contract Act, 1872 - Section 15"}
    ]
}

This is my query

{
    "query": {
        "match": {"citedIn.someFiled": {"query": "Contract act 15", "operator": "and"}}
    }
}

But I am still getting the same result.

Amit
Aninda
2 Answers

Here is a working example with index data, mapping, search query, and search result.

You need to use a nested query to search on nested fields.

Index Mapping:

{
    "mappings": {
        "properties": {
            "citedIn": {
                "type": "nested"
            }
        }
    }
}

Index Data:

{
    "citedIn": [
        {
            "someFiled": "Bar Councils Act, 1926 - Section 15"
        },
        {
            "someFiled": "Contract Act, 1872 - Section 23"
        }
    ]
}
{
    "citedIn": [
        {
            "someFiled": "15 C. B 400"
        },
        {
            "someFiled": "Contract Act, 1872 - Section 55"
        }
    ]
}
{
    "citedIn": [
        {
            "someFiled": "15 C. B 400"
        },
        {
            "someFiled": "Contract Act, 1872 - Section 15"
        }
    ]
}

Search Query:

{
    "query": {
        "nested": {
            "path": "citedIn",
            "query": {
                "bool": {
                    "must": [
                        {
                            "match": {
                                "citedIn.someFiled": "contract"
                            }
                        },
                        {
                            "match": {
                                "citedIn.someFiled": "act"
                            }
                        },
                        {
                            "match": {
                                "citedIn.someFiled": 15
                            }
                        }
                    ]
                }
            },
            "inner_hits": {}
        }
    }
}

Search Result:

"inner_hits": {
          "citedIn": {
            "hits": {
              "total": {
                "value": 1,
                "relation": "eq"
              },
              "max_score": 1.620718,
              "hits": [
                {
                  "_index": "stof_64170705",
                  "_type": "_doc",
                  "_id": "3",
                  "_nested": {
                    "field": "citedIn",
                    "offset": 1
                  },
                  "_score": 1.620718,
                  "_source": {
                    "someFiled": "Contract Act, 1872 - Section 15"
                  }
                }
              ]
            }
          }
        }
      }
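
The three `match` clauses under `must` can also be collapsed into a single `match` with `"operator": "and"`, which the comments on this answer report as returning the same result. As a rough sketch, the request body could be built in Python like this. The helper name `nested_and_match` is hypothetical and not part of any Elasticsearch client; the code only builds the JSON body and does not send it anywhere.

```python
import json


def nested_and_match(path, field, text):
    """Build a nested query body that requires every word of `text`
    to appear in the same nested object under `path`.

    Hypothetical helper for illustration only.
    """
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {
                    # A single match with operator "and" instead of one
                    # match clause per word inside bool/must.
                    "match": {field: {"query": text, "operator": "and"}}
                },
                "inner_hits": {},
            }
        }
    }


body = nested_and_match("citedIn", "citedIn.someFiled", "contract act 15")
print(json.dumps(body, indent=4))
```

The printed JSON can be used as the search request body in place of the longer `bool`/`must` version above.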
ESCoder
  • @Aninda please go through my answer, and let me know if this was your issue? – ESCoder Oct 03 '20 at 01:53
  • Thank you very much, it solved the issue. The problem was with my nested search query: I thought that if I used `include_in_parent == True` I wouldn't have to specify the path. By the way, do I really need to use `N` match queries for `N` words? I get the same result with this query: `{ "query": { "nested": { "path": "citedIn", "query": { "match": {"citedIn.someFiled": {"query": "contract act 15", "operator":"and"} } }, "inner_hits": {} } } }` – Aninda Oct 03 '20 at 06:43
  • @Aninda glad I could help you :) You can use the query you mentioned in the comment above; it will work perfectly (as `must` is equivalent to the `AND` operator) :) – ESCoder Oct 03 '20 at 06:56
  • @Aninda This is really bad :| May I know the reason for unaccepting my answer? You wanted a nested search query, which I have provided, and it works exactly for your use case. Above all, you accepted my answer yesterday because it solved your issue (and now you have just unaccepted it, which is weird) – ESCoder Oct 04 '20 at 08:38
  • I'm extremely sorry. I didn't know that I can't accept multiple answers. I'm really, really sorry for that. – Aninda Oct 04 '20 at 10:38
  • @Aninda it's okay, thank you for accepting my answer – ESCoder Oct 04 '20 at 11:20

There is no way to achieve this with your original mapping, because what you are indexing in your citedIn field is an array of strings. All Elasticsearch fields are multi-valued by default: Lucene was designed that way, and Elasticsearch is built on top of the Lucene search library.

Please read the Elasticsearch documentation on arrays for more info, especially the important note at the end, shown in the image below:

[image: important note from the Elasticsearch arrays documentation]

As explained in the above image, all the strings in your array are actually part of the same field, so there is no way for ES to tell whether the terms of your search string came from the same string in the array or not. That is why you were getting all the docs in the search result.

Unless you index these strings under separate fields, for example as nested fields, you won't be able to achieve your use case. For that you need to give the fields a name: each nested object is like a map where the key is your field name and the value is the field value, and you then query on the field name.
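
To make the difference concrete, here is a small toy model in Python. It is only a sketch under stated assumptions, not how Elasticsearch or Lucene is implemented: `tokenize` is a crude stand-in for the standard analyzer, and `matches_flat` / `matches_per_element` are hypothetical names for "all array values pooled into one field" versus "each element checked on its own".

```python
import re


def tokenize(text):
    # Crude stand-in for the standard analyzer (assumption):
    # lowercase, then split on anything that is not a letter or digit.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches_flat(values, query):
    # AND match against one multi-valued field: the tokens of every
    # array element end up in the same pool.
    pooled = set()
    for value in values:
        pooled |= tokenize(value)
    return tokenize(query) <= pooled


def matches_per_element(values, query):
    # AND match evaluated separately for each array element, which is
    # the behaviour a nested field with a nested query provides.
    wanted = tokenize(query)
    return any(wanted <= tokenize(value) for value in values)


docs = {
    "doc_1": ["Bar Councils Act, 1926 - Section 15",
              "Contract Act, 1872 - Section 23"],
    "doc_2": ["15 C. B 400", "Contract Act, 1872 - Section 55"],
    "doc_3": ["15 C. B 400", "Contract Act, 1872 - Section 15"],
}

query = "Contract act 15"
flat_hits = [name for name, values in docs.items() if matches_flat(values, query)]
nested_hits = [name for name, values in docs.items() if matches_per_element(values, query)]

print(flat_hits)    # ['doc_1', 'doc_2', 'doc_3']
print(nested_hits)  # ['doc_3']
```

In the pooled version every document contains contract, act and 15 somewhere in the array, so all three match; only the per-element check narrows the result to doc_3.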

Amit
  • Thanks for the comment. I did try a nested field but got the same result. I've edited my post. – Aninda Oct 02 '20 at 12:06
  • I didn't know that I can't accept multiple answers. As Bhavya gave me the exact solution, it would be unfair if I didn't accept his answer. Again, I'm really, really sorry for my noob behavior. – Aninda Oct 04 '20 at 10:44