I came upon this strange behavior in Python.

I know that Python passes references to objects rather than copies (often called "pass by object reference").

Can someone describe the issue briefly?

I have a for loop that iterates over a list of strings. For each string it updates a dict and, depending on a condition, appends that dict to one of two lists that are created on the fly.

search_body = {'from': 1, 'size': 10}
multi_search = {'multi_match': {'query': '', 'fields': ['contents^3', 'summary']}}

queries_strings = ['foo', 'bar', '!troy', '!trip']

if len(queries_strings) > 1:
    bool_query = {'bool': {}}
    for string in map(str.strip, queries_strings):
        multi_search['multi_match']['query'] = string.lstrip('!').strip()
        if string.startswith('!'):
            try:
                bool_query['bool']['must_not'].append(multi_search)
            except KeyError:
                bool_query['bool']['must_not'] = [multi_search]
            continue
        try:
            bool_query['bool']['must'].append(multi_search)
        except KeyError:
            bool_query['bool']['must'] = [multi_search]
    search_body['query'] = bool_query

print(search_body)

As you can imagine, the expected output would be:

{
   "from":1,
   "size":10,
   "query":{
      "bool":{
         "must":[
            {
               "multi_match":{
                  "query":"foo",
                  "fields":[
                     "contents^3",
                     "summary"
                  ]
               }
            },
            {
               "multi_match":{
                  "query":"bar",
                  "fields":[
                     "contents^3",
                     "summary"
                  ]
               }
            }
         ],
         "must_not":[
            {
               "multi_match":{
                  "query":"troy",
                  "fields":[
                     "contents^3",
                     "summary"
                  ]
               }
            },
            {
               "multi_match":{
                  "query":"trip",
                  "fields":[
                     "contents^3",
                     "summary"
                  ]
               }
            }
         ]
      }
   }
}

But it's not: the query key holds the last iterated value in every occurrence. That follows from Python appending a reference to the same multi_search dict each time, although I expected each list entry to be its own copy. Is this kind of illusion specific to dynamic languages, compared to statically typed ones?

{
   "from":1,
   "size":10,
   "query":{
      "bool":{
         "must":[
            {
               "multi_match":{
                  "query":"trip",
                  "fields":[
                     "contents^3",
                     "summary"
                  ]
               }
            },
            {
               "multi_match":{
                  "query":"trip",
                  "fields":[
                     "contents^3",
                     "summary"
                  ]
               }
            }
         ],
         "must_not":[
            {
               "multi_match":{
                  "query":"trip",
                  "fields":[
                     "contents^3",
                     "summary"
                  ]
               }
            },
            {
               "multi_match":{
                  "query":"trip",
                  "fields":[
                     "contents^3",
                     "summary"
                  ]
               }
            }
         ]
      }
   }
}
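
The aliasing behind this output can probably be reproduced in a few lines. This is only a minimal sketch; the names shared and results are made up for illustration and are not part of the snippet above.

```python
# Hypothetical minimal reproduction; `shared` and `results` are illustrative names.
shared = {'query': ''}
results = []

for term in ['foo', 'bar']:
    shared['query'] = term   # mutates the one existing dict
    results.append(shared)   # appends another reference to that same dict

print(results)                   # every entry shows the last term assigned
print(results[0] is results[1])  # identity check: both entries are one object
```

Since append stores a reference and never copies, each later assignment to shared['query'] is visible through every list entry.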

This can be easily fixed by creating a new object on line 9 of the snippet instead of mutating the shared one:

multi_search = {'multi_match': {'query': string.lstrip('!').strip(), 'fields': ['contents^3', 'summary']}}
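
For completeness, here is a sketch of the whole loop with that fix applied. It assumes the same input list as above; the function name build_search_body and the variable clause are my own, and dict.setdefault stands in for the try/except KeyError blocks. The len(...) > 1 guard from the original snippet is left out for brevity.

```python
def build_search_body(query_strings):
    """Build the search body with one independent multi_match dict per term."""
    search_body = {'from': 1, 'size': 10}
    bool_query = {'bool': {}}
    for string in map(str.strip, query_strings):
        # A new dict (and a new 'fields' list) is created on every iteration,
        # so entries already appended are not affected by later terms.
        multi_search = {'multi_match': {
            'query': string.lstrip('!').strip(),
            'fields': ['contents^3', 'summary'],
        }}
        # Terms prefixed with '!' go to must_not, all others to must.
        clause = 'must_not' if string.startswith('!') else 'must'
        # setdefault creates the list on first use, then returns it.
        bool_query['bool'].setdefault(clause, []).append(multi_search)
    search_body['query'] = bool_query
    return search_body


body = build_search_body(['foo', 'bar', '!troy', '!trip'])
print(body)
```

If the template dict has to stay outside the loop, copy.deepcopy(multi_search) before modifying it should give the same independence, at the cost of copying the nested structure on each pass.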
Debendra