I'm new to Python and Elasticsearch. I have created an index with some data in Elasticsearch, and I want to query it from Python using filters received from the user (keyword, category).

from elasticsearch import Elasticsearch
import json,requests

es = Elasticsearch(HOST="http://localhost", PORT=9200)
es = Elasticsearch()

def QueryMaker (keyword,category):
   response = es.search(index="main-news-test-data",body={"from":0,"size":10000,"query":{
       "bool": {
      "should": [
        {
          "multi_match" : {
            "query":      keyword,
            "fields":     [ "content", "title","lead" ]
          }
        },
        {
          "multi_match" : {
            "query":      category,
            "fields":     [ "category" ]
          }
        }
      ]
    }
   }})
   return(response)

def Mapper (category):
 fhandle = open('categories.txt','r', encoding="utf8")
 for line in fhandle:
     line = line.rstrip()
     items = line.split(';')
     if f'"{category}"' in items:
         category = items[0]
         return(category)

if __name__ == '__main__': 
    keyword = input('Enter Keyword: ')
    print(type(keyword))
    category = input('Enter Category: ')
    print(type(category))
    #startDate = input('Enter StartDate: ')
    #endDate = input('Enter EndDate: ')
    
    mapCategory = Mapper(category)
    if mapCategory is not None:
      mapCategory = mapCategory.replace("%","")
      data = QueryMaker(keyword,mapCategory)
      print(data)
    else:
      data = QueryMaker(keyword,mapCategory)
      print(data)

The problem is that this program returns matching data only when both fields are filled in, but I also want it to return data when one field, such as the category, is empty. When the keyword is empty (it comes through as ' '), the query returns nothing, and when the category is empty I receive this error:

elasticsearch.exceptions.RequestError: RequestError(400, 'x_content_parse_exception', '[multi_match] unknown token [VALUE_NULL] after [query]')   

What am I doing wrong and how can I fix my search filter?

  • **return data too if 1 field like category is empty**: regarding this, does that mean the `category` field in the document has a value like `' '`, and you want to query on that empty field value? – ESCoder Sep 27 '20 at 01:45
  • Did you get a chance to go through my answer? Looking forward to your feedback :) – ESCoder Sep 27 '20 at 05:23
  • @BhavyaGupta I mean, when the category field is empty, I only want to search by keyword. When the keyword is empty, I want to search only by category, and when both are filled in, I want to search and filter the data by both –  Sep 27 '20 at 06:18
  • @BhavyaGupta like, sometimes the user enters no keyword but enters a category and I want to query the data with the category only, and vice versa –  Sep 27 '20 at 06:20
  • Please go through my updated answer and let me know whether this was your issue. – ESCoder Sep 27 '20 at 07:04

1 Answer

Adding a working example with index data, search query, and search result

According to your comments above, if the content field does not contain the keyword but the category field contains the category, the search should still match on the category field. This can be achieved by using `minimum_should_match`.

Index Data:

{
    "content": "keyword a",
    "title": "b",
    "lead": "c",
    "category": "d"
}
{
    "content": "a",
    "title": "b",
    "lead": "c",
    "category": "category"
}
{
    "content": "keyword a",
    "title": "b",
    "lead": "c",
    "category": "category"
}

Search Query:

{
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "keyword",
            "fields": [
              "content",
              "title",
              "lead"
            ]
          }
        },
        {
          "multi_match": {
            "query": "category",
            "fields": [
              "category"
            ]
          }
        }
      ],
      "minimum_should_match":1
    }
  }
}

Search Result:

"hits": [
      {
        "_index": "stof_64081587",
        "_type": "_doc",
        "_id": "3",
        "_score": 0.9666445,
        "_source": {
          "content": "keyword a",
          "title": "b",
          "lead": "c",
          "category": "category"
        }
      },
      {
        "_index": "stof_64081587",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.60996956,
        "_source": {
          "content": "keyword a",
          "title": "b",
          "lead": "c",
          "category": "d"
        }
      },
      {
        "_index": "stof_64081587",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.35667494,
        "_source": {
          "content": "a",
          "title": "b",
          "lead": "c",
          "category": "category"
        }
      }
    ]
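
The `VALUE_NULL` error from the question most likely appears because `category` is `None` in Python, which is serialized as `null` inside the `multi_match` clause. One way to avoid it is to build the `should` list in Python and add a clause only for inputs the user actually provided. The following is a minimal sketch under stated assumptions: `build_query` and its `require_all` flag are hypothetical names introduced here (they are not part of the Elasticsearch client), and the field names are the ones from the question.

```python
from typing import Any, Dict, List, Optional


def _clean(value: Optional[Any]) -> Optional[str]:
    """Return the stripped text, or None if the input is None or blank."""
    if value is None:
        return None
    text = str(value).strip()
    return text or None


def build_query(keyword: Optional[str],
                category: Optional[str],
                require_all: bool = False) -> Dict[str, Any]:
    """Build a search body that skips clauses for empty inputs.

    Hypothetical helper: with require_all=False the clauses go into
    `should` with minimum_should_match=1 (as in the query above);
    with require_all=True they go into `must`, so every provided
    input has to match.
    """
    clauses: List[Dict[str, Any]] = []

    kw = _clean(keyword)
    if kw is not None:
        clauses.append({"multi_match": {
            "query": kw,
            "fields": ["content", "title", "lead"],
        }})

    cat = _clean(category)
    if cat is not None:
        clauses.append({"multi_match": {
            "query": cat,
            "fields": ["category"],
        }})

    # Assumption: with no usable input, fall back to matching everything.
    if not clauses:
        return {"query": {"match_all": {}}}

    if require_all:
        return {"query": {"bool": {"must": clauses}}}
    return {"query": {"bool": {"should": clauses,
                               "minimum_should_match": 1}}}
```

With the client setup from the question, the result could be passed as `es.search(index="main-news-test-data", body=build_query(keyword, mapCategory))`. Because a `None` or blank value never reaches `multi_match`, a keyword-only or category-only search still produces a valid request.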
ESCoder
  • It still doesn't work. Now, when I leave the keyword empty: –  Sep 27 '20 at 07:12
  • {'took': 0, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 0, 'relation': 'eq'}, 'max_score': None, 'hits': []}} –  Sep 27 '20 at 07:12
  • and when the category is empty: –  Sep 27 '20 at 07:13
  • elasticsearch.exceptions.RequestError: RequestError(400, 'x_content_parse_exception', '[multi_match] unknown token [VALUE_NULL] after [query]') –  Sep 27 '20 at 07:13
  • @bluxixi what do you mean by `keyword` is empty? Is the `keyword` field missing from one of the documents? – ESCoder Sep 27 '20 at 07:13
  • @bluxixi could you please tell me what value you are inserting for the `category` field? – ESCoder Sep 27 '20 at 07:14
  • Enter category: (user enters category) –  Sep 27 '20 at 07:14
  • Enter Keyword: (user enters nothing) –  Sep 27 '20 at 07:15
  • In this case, we should still search by category, but I receive nothing –  Sep 27 '20 at 07:15
  • @bluxixi so you mean to say that the inserted document is this: `{ "content": "keyword a", "title": "b", "lead": "c", "category": " " }` – ESCoder Sep 27 '20 at 07:16
  • No, all of my data have all of the fields, but sometimes the user just wants to filter by keyword and not by category. For example, this is the data in my Elasticsearch: {"content": "frfr","title":"rfrfrf","lead":"fffrfrf","category":15} –  Sep 27 '20 at 07:19
  • and the user enters keyword:frfrf but doesn't enter the category, yet I still want this data to come back –  Sep 27 '20 at 07:20
  • in this case the category entered by the user maps to None based on my mapper function but I still want to search with keyword and omit the category's influence –  Sep 27 '20 at 07:21
  • same goes for when the keyword is empty and category is full –  Sep 27 '20 at 07:22
  • Thanks @bluxixi let me go through your comments and will get back to you :) – ESCoder Sep 27 '20 at 07:24
  • @bluxixi just one last question: against what document data are you getting this error? `elasticsearch.exceptions.RequestError: RequestError(400, 'x_content_parse_exception', '[multi_match] unknown token [VALUE_NULL] after [query]')`. I need to know the document data (where either the keyword or the category is empty). As far as I know, for null values in Elasticsearch I can pass `"category": null` in the document, so that I can test my search query against your use case. Could you please provide some sample index data on which you are performing the search query? – ESCoder Sep 27 '20 at 08:36
  • Sure, please refer to this post (bulk data sample) : https://stackoverflow.com/questions/63897927/bulk-api-error-while-indexing-data-into-elasticsearch?noredirect=1#comment113006300_63897927 –  Sep 27 '20 at 10:18
  • The user enters the category as a string, and the mapper maps it to a numeric category and sends it –  Sep 27 '20 at 10:19