
Update: I think this question has to do with Solr syntax in general, and not Chef in particular. So while I ran into this working with Chef, I presume that anyone working with Solr will run into it too.


I'm working on an application that communicates with the Chef server's search API to find particular nodes.

Based on this http://docs.opscode.com/essentials_search.html#special-characters, it seems that a number of special characters need to be escaped.

Note: I'm only concerned with exact-matching patterns, not wildcards. I realize that some of these characters are special because they are wildcards.

Here's the list at the time of this writing, as copied from the URL above:

+  -  &&  | |  !  ( )  { }  [ ]  ^  "  ~  *  ?  :  \

When I try various knife search commands with these characters, however, I see inconsistent behaviour.

For the following examples, I set up a node that is tagged with +&|!(){}[]^\"~*?:\\"

These commands were run from a Linux box, in a bash shell:

$ knife search node 'tags:+&|!(){}[]^"~*?:\'
ERROR: knife search failed: invalid search query: 'tags:+&|!(){}[]^"~*?:\'

That behaved as expected, since nothing was escaped. Now, I escape everything with a single \ as the docs suggest:

$ knife search node 'tags:\+\&\|\!\(\)\{\}\[\]\^\"\~\*\?\:\\'
ERROR: knife search failed: invalid search query: 'tags:\+\&\|\!\(\)\{\}\[\]\^\"\~\*\?\:\\'

Strange.

Can anyone shed some light on this, and maybe suggest a query that's capable of matching that tag?

It's obviously unlikely that anyone will ever have an attribute containing all those special characters, but I'd like to understand better how the special characters should be escaped.

Thanks!

hairyhenderson
  • Maybe you'll find more information by searching for the same thing for Solr instead of Chef? That's what is used for search. – StephenKing Feb 20 '14 at 19:48
  • `! ( ) { } [ ] ^ " ~ * ? : \ ` Those all work for me but `+ - && | |` all fail – Display Name is missing Feb 20 '14 at 21:47
  • @better_use_mkstemp: thanks. That partially helps. I'm also a little confused why `&&` and `||` are considered special _characters_. – hairyhenderson Feb 21 '14 at 14:01
  • After reading the URL posted by @sethvargo below, I now understand why +, -, &&, and || are interpreted specially. They're considered boolean operators. However it's still not clear how to properly escape these. – hairyhenderson Feb 21 '14 at 14:07

2 Answers


You need to use the Lucene/Solr query syntax for escaping special characters: http://lucene.apache.org/core/6_5_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Escaping_Special_Characters

miku
sethvargo
  • thanks. It looks like the Chef docs simply copied the Lucene docs at this URL: http://lucene.apache.org/core/2_9_4/queryparsersyntax.html#Escaping Special Characters , which isn't any more helpful... – hairyhenderson Feb 21 '14 at 13:57

It might be a good idea to look at http://lucene.apache.org/solr/4_2_1/solr-solrj/org/apache/solr/client/solrj/util/ClientUtils.html#escapeQueryChars(java.lang.String)
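
For illustration, here is a minimal sketch of that kind of escaping, written in Python rather than Java. It assumes the special-character list quoted in the question and puts a single backslash in front of each listed character, one character at a time, so `&&` becomes `\&\&`. The name `escape_query_chars` is hypothetical, this is not the SolrJ implementation, and it has not been checked against a Chef server.

```python
# Characters taken from the special-character list quoted in the question.
# Assumption: each character is escaped on its own with one backslash,
# so "&&" becomes "\&\&" and "||" becomes "\|\|".
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\')


def escape_query_chars(value: str) -> str:
    """Return value with a backslash in front of every special character."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in value)


if __name__ == "__main__":
    # Example term from the Lucene documentation on escaping: (1+1):2
    print(escape_query_chars("(1+1):2"))  # prints \(1\+1\)\:2
```

When the escaped term is passed to knife from bash, keeping it inside single quotes stops the shell from consuming the backslashes before the query reaches the server.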

Anatoli Radulov