
I'm playing around with Elasticsearch and I've run into a weird problem: I have a more_like_this request that I build in several ways:

curl -XGET 'http://127.0.0.1:9200/train-recipe/_search' -d '{
    'query': {
        'more_like_this': {
            'fields': ['ingredients'],
            'max_query_terms': 12,
            'like': [{'_type': 'recipe', '_id': 2938, '_index': 'train-recipe'}],
            'min_term_freq': 1
        }
    }, 
    'from': 0, 
    'size': 10
}'

And I get the following response:

{"error":{"root_cause":[{"type":"json_parse_exception","reason":"json_parse_exception: Unrecognized token 'ingredients': was expecting ('true', 'false' or 'null')\n at [Source: [B@6f7a6ea4; line: 4, column: 34]"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"train-recipe","node":"kfORe_NWSE2gIeHSHGgIQw","reason":{"type":"query_parsing_exception","reason":"Failed to parse","index":"train-recipe","caused_by":{"type":"json_parse_exception","reason":"json_parse_exception: Unrecognized token 'ingredients': was expecting ('true', 'false' or 'null')\n at [Source: [B@6f7a6ea4; line: 4, column: 34]"}}}]},"status":400}

I also have this request, which to me looks identical to the first one:

curl -XGET 'http://127.0.0.1:9200/train-recipe/_search' -d '{
  "query": {
    "more_like_this": {
      "fields": ["ingredients"],
      "like": [{"_index" : "train-recipe","_type" : "recipe","_id" : 2938}],
      "min_term_freq": 1,
      "max_query_terms": 12
    }
  },
    'from' : 0,
    'size':10
}'

But this one works perfectly fine. I also tried to do it with Python requests, as follows:

import json
import requests

def build_mlt(nb, doc_id):
   mlt = {}
   mlt['from'] = 0
   mlt['size'] = nb
   mlt['query'] = {}
   mlt['query']['more_like_this'] = {}
   mlt['query']['more_like_this']['fields'] = ['ingredients']
   mlt['query']['more_like_this']['like'] = [{"_index" : "train-recipe","_type" : "recipe","_id" : doc_id}]
   mlt['query']['more_like_this']['min_term_freq'] = 1
   mlt['query']['more_like_this']['max_query_terms'] = 10
   return mlt

def get_similar(nb, doc_id):
   mlt = build_mlt(10, 2938)
   response = requests.get("http://localhost:9200/test-recipe/recipe/_search", data=json.dumps(mlt))
   print json.loads(response.text)

This time I get yet another response:

{u'hits': {u'hits': [], u'total': 0, u'max_score': None}, u'_shards': {u'successful': 5, u'failed': 0, u'total': 5}, u'took': 2, u'timed_out': False}

To me, the three requests are identical. I wrote the second one based on the dictionary generated by my build_mlt function. Can someone explain why I get three different responses?

glls
mel

2 Answers


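In the first case, the problem is the single quotes. You use single quotes inside your JSON and also around the JSON to pass the payload to the -d parameter. The shell treats each inner single quote as the end of the quoted string, so the quotes are stripped and the payload arrives as bare tokens such as [ingredients], which matches the "Unrecognized token 'ingredients'" error.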

In the second case, you're using double quotes, so you're fine.
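To make the first case concrete, here is a small sketch that uses Python's shlex module to mimic how a POSIX shell would split a command of that shape. The command is a shortened, illustrative version of the first request, not the full one.

```python
import json
import shlex

# Shortened version of the first command: single quotes inside a single-quoted payload.
command = (
    "curl -XGET 'http://127.0.0.1:9200/train-recipe/_search' "
    "-d '{'fields': ['ingredients']}'"
)

# shlex.split applies POSIX quoting rules, so the last item is what curl would receive.
payload = shlex.split(command)[-1]
print(payload)  # {fields: [ingredients]}

# The inner quotes are gone, so the payload is no longer valid JSON.
try:
    json.loads(payload)
    is_valid_json = True
except ValueError:
    is_valid_json = False
print(is_valid_json)  # False
```

With double quotes inside the single-quoted payload, as in the second command, the shell passes the JSON through unchanged.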

In the third case, you should send your request using requests.post(); otherwise the payload with the query may not get sent.
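A minimal sketch of the third case with both fixes applied: the request is sent with requests.post(), and the URL targets the same index as the like clause, which is the mismatch worked out in the comments on this answer. The names INDEX, DOC_TYPE and search_url are illustrative helpers rather than part of the original code, and the host and the Content-Type header are assumptions.

```python
import json

INDEX = "train-recipe"  # assumption: query the index that the `like` document lives in
DOC_TYPE = "recipe"


def build_mlt(nb, doc_id, index=INDEX):
    """Build a more_like_this query body for one reference document."""
    return {
        "from": 0,
        "size": nb,
        "query": {
            "more_like_this": {
                "fields": ["ingredients"],
                "like": [{"_index": index, "_type": DOC_TYPE, "_id": doc_id}],
                "min_term_freq": 1,
                "max_query_terms": 12,
            }
        },
    }


def search_url(index=INDEX, host="http://localhost:9200"):
    """Return the _search endpoint for the given index (host is an assumption)."""
    return "{}/{}/_search".format(host, index)


def get_similar(nb, doc_id, index=INDEX):
    """POST the query to the same index that the `like` clause references."""
    import requests  # third-party; imported here so the builders work without it

    body = json.dumps(build_mlt(nb, doc_id, index))
    response = requests.post(
        search_url(index),
        data=body,
        headers={"Content-Type": "application/json"},
    )
    return response.json()
```

Calling get_similar(10, 2938) would then send the same body as the working curl command to train-recipe.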

Val
  • I changed my request method to POST and I still get the same answer: an empty array. – mel May 23 '16 at 08:46
  • You should query the index called `train-recipe` not `test-recipe` – Val May 23 '16 at 08:48
  • In fact I'm working on a Kaggle competition: train-recipe holds my training set and test-recipe holds the recipes I have to predict. train-recipe and test-recipe contain exactly the same documents, except that train-recipe has one more field (the field I have to predict). It works for a lot of recipes, but not this one. – mel May 23 '16 at 08:54
  • Machine learning, gotcha. Still, in your like specification you are using `train-recipe` but then are sending your query to `test-recipe`, while in the two above cases you were sending the query to `train-recipe` – Val May 23 '16 at 08:56
  • I also tried the same request with curl and it works, but with Python I get an empty array. – mel May 23 '16 at 08:58
  • Again, in the first two cases you send the request to `train-recipe` and the like is made on `train-recipe` as well, while in the third case, the like is on `train-recipe` and you send the query to `test-recipe` so that's not at all equivalent. – Val May 23 '16 at 08:59
  • Oh OK, sorry, I get your point now. I lost myself in all the indexes. Thank you. – mel May 23 '16 at 09:02

As explained in "python: single vs double quotes in JSON", you need double quotes in JSON.

The third case was explained by Val.
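To illustrate the quoting point, here is a short sketch comparing the two ways a Python dictionary can end up as text. The dictionary is a trimmed-down, illustrative version of the query.

```python
import json

query = {"fields": ["ingredients"], "min_term_freq": 1}

as_repr = str(query)         # Python literal syntax: single-quoted strings
as_json = json.dumps(query)  # JSON text: double-quoted strings

print(as_repr)  # {'fields': ['ingredients'], 'min_term_freq': 1}
print(as_json)  # {"fields": ["ingredients"], "min_term_freq": 1}

# Only the json.dumps output parses back as JSON.
assert json.loads(as_json) == query
try:
    json.loads(as_repr)
    repr_is_json = True
except ValueError:
    repr_is_json = False
print(repr_is_json)  # False
```

Copying a printed dictionary into a curl command therefore carries the single quotes along, while json.dumps produces text that can be sent as is.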

julia-nf
  • I changed my request method to POST and replaced all the single quotes with double quotes, but I still get the same answer: an empty array. – mel May 23 '16 at 08:56