I found that the Elasticsearch query doesn't accept a lot of characters because of its regular-expression processing, and this is causing me problems.
From the documentation:
Reserved characters: If you need to use any of the characters which function as operators in your query itself (and not as operators), then you should escape them with a leading backslash. For instance, to search for (1+1)=2, you would need to write your query as (1+1)\=2.
The reserved characters are: + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /
Failing to escape these special characters correctly could lead to a syntax error which prevents your query from running.
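For reference, this is the same list written out as Python constants (just my own transcription of the list above; the backslash is doubled only because of Python string syntax):

# Reserved characters from the docs: single-character operators plus the
# two-character operators && and ||.
RESERVED_CHARS = '+-=><!(){}[]^"~*?:\\/'
RESERVED_OPERATORS = ('&&', '||')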
I already tried to avoid this error by pre-processing the text in Python, but that still didn't solve it. I tried the following:
import unicodedata

def strip_accents(s):
    return ''.join(c for c in unicodedata.normalize('NFD', s)
                   if unicodedata.category(c) != 'Mn')
# Add a leading backslash before quotes and the reserved characters
eval_file_content = eval_file_content.replace('"', '\\"')
eval_file_content = eval_file_content.replace("'", "\\'")
eval_file_content = eval_file_content.replace("-", r"\-")
eval_file_content = eval_file_content.replace("+", r"\+")
eval_file_content = eval_file_content.replace("&&", r"\&&")
eval_file_content = eval_file_content.replace("||", r"\||")
eval_file_content = eval_file_content.replace("<", r"\<")
eval_file_content = eval_file_content.replace(">", r"\>")
eval_file_content = eval_file_content.replace("!", r"\!")
eval_file_content = eval_file_content.replace("(", r"\(")
eval_file_content = eval_file_content.replace(")", r"\)")
eval_file_content = eval_file_content.replace("{", r"\{")
eval_file_content = eval_file_content.replace("}", r"\}")
eval_file_content = eval_file_content.replace("[", r"\[")
eval_file_content = eval_file_content.replace("]", r"\]")
eval_file_content = eval_file_content.replace("^", r"\^")
eval_file_content = eval_file_content.replace("~", r"\~")
eval_file_content = eval_file_content.replace("*", r"\*")
eval_file_content = eval_file_content.replace("?", r"\?")
eval_file_content = eval_file_content.replace(":", r"\:")
eval_file_content = eval_file_content.replace("\\", "\\\\")
eval_file_content = eval_file_content.replace("/", r"\/")
# Drop accents and any remaining non-ASCII characters
eval_file_content = strip_accents(eval_file_content)
eval_file_content = eval_file_content.encode("ascii", errors="ignore").decode()
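I'm also wondering whether the chained replace() calls interfere with each other (the backslash replacement near the end also hits the backslashes added by the earlier lines), so one alternative I'm considering is escaping everything in a single pass. This is only a sketch with my own names:

import re

# One-pass alternative: escape every reserved character with a leading
# backslash, so the backslash itself is only escaped once.
RESERVED_PATTERN = re.compile(r'([+\-=><!(){}\[\]^"~*?:\\/]|&&|\|\|)')

def escape_query_string(text):
    return RESERVED_PATTERN.sub(r'\\\1', text)

# e.g. escape_query_string('(1+1)=2') -> '\\(1\\+1\\)\\=2'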
How can I solve this? Here's an example of "before and after" pre-processing:
For the query, I'm using the following method:
import requests

searchQuery = "http://localhost:9200/metis/ER/_search?q='" + eval_file_content + "'"
res = requests.get(searchQuery).content
The request error:
b'{"error":{"root_cause":[{"type":"query_shard_exception","reason":"Failed to parse query [\'the pmNrOfIpTermsRej of VMGW increasing \\n1.the VMGW1_LFGS8 of LFGM3 pmNrOfIpT'
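I'm also not sure that building the URL by string concatenation is safe, so another thing I'm considering is letting requests URL-encode the query instead (just a sketch, using the same endpoint as above; the surrounding single quotes are dropped because I don't think they are part of the query_string syntax):

import requests

# Let requests build and URL-encode the q parameter
res = requests.get(
    "http://localhost:9200/metis/ER/_search",
    params={"q": eval_file_content},
).content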