
I'm on Solr 4.9.1 (I can't upgrade because it comes with a Silverstripe plugin). The issue shows up on the frontend, but everything below is straight out of the Solr query panel. I'm rather new to Solr. So far I have looked into suggestions about tokenizers and filters (but can't make sense of them in the context of this issue) and into escaping (which doesn't seem to change anything).

Here is my example with debug output:

Field Value in existing doc: Across the world - Fly/Sail

Query (frontend): Fly/Sail

Search results: 0

Debug Output:

"rawquerystring": "Fly/Sail",
"querystring": "Fly/Sail",
"parsedquery": "PhraseQuery(_text:\"fly sail fly sail\")",
"parsedquery_toString": "_text:\"fly sail fly sail\"",
"explain": {},
"QParser": "LuceneQParser"

What confuses me most is why the terms are doubled up in the parsed query. Escaping the forward slash with a backslash doesn't change anything.

If I search for "Fly Sail", the expected results appear.

Edit: My configuration:

<fields>
<field name='_documentid' type='string' indexed='true' stored='true' required='true' />
<field name='ID' type='tint' indexed='true' stored='true' required='true' />
<field name='_text' type='htmltext' indexed='true' stored='true' multiValued='true' />
<field name='VivaTour_TourName' type='text' indexed='true' stored='true' multiValued=''/>
<field name='VivaTour_TourDescription' type='htmltext' indexed='true' stored='true' multiValued=''/>

Edit 2: Screenshot of my Analysis page for this search

https://i.stack.imgur.com/XAYoo.jpg

Abhijit Bashetti
Aaryn
  • What is the field type definition (and have you reindexed after changing it)? What does the Analysis page under Solr Admin tell you about how the fields are being processed? This should give you a detailed view of how both your query and your indexed content are being parsed. – MatsLindh Jul 21 '19 at 10:58
  • Hi @MatsLindh. I have updated my question with the field part of my Schema. Field in question is "VivaTour_TourName". Also a dump of my Analysis page. I've been trying to make sense of it and reading this, but I don't follow it. https://lucene.apache.org/solr/guide/6_6/analysis-screen.html – Aaryn Jul 22 '19 at 04:42
  • The `KRF` step in your query is the one duplicating the input token. I'm not sure which filter that is from the description, but you can hover over `KRF` to see which filter it is. It'd also be helpful if you include the definition of the `htmltext` field type. The screenshot from the analysis page shows the `VivaTour_TourName` field, but you're searching against the `_text` field - so that would be more relevant. The Analysis page shows you exactly what happens with the query or indexed text for each step in your filter chain. – MatsLindh Jul 22 '19 at 08:27
  • @MatsLindh thanks. It was a mixture of removing KRF as you picked up and the answer below. – Aaryn Aug 12 '19 at 04:46

1 Answer

Try the fieldType below for your field "VivaTour_TourName".

<fieldType name="text_wd" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <!-- Splits text into tokens on whitespace characters -->
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- Splits tokens at delimiters such as "/"; keeps the original token and a catenated form -->
        <!-- Note: this class needs Solr 6.5 or later; older versions such as 4.9.1 provide solr.WordDelimiterFilterFactory instead -->
        <filter class="solr.WordDelimiterGraphFilterFactory" preserveOriginal="1" catenateWords="1"/>
        <!-- Transforms text to lower case -->
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>

    <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>

Once you modify the schema.xml, please restart the server and re-index the data.
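
For the new type to take effect, the field has to reference it. A minimal sketch, assuming the `text_wd` type is defined in the same schema.xml and that the field keeps the `indexed` and `stored` settings from the question (the empty `multiValued=''` attribute is left out here, which is an assumption about the intended setup):

```xml
<!-- Assumption: text_wd is the fieldType defined in this answer, in the same schema.xml -->
<field name='VivaTour_TourName' type='text_wd' indexed='true' stored='true' />
```

As pointed out in the comments, the debug output shows the query running against `_text`, which uses the `htmltext` type, so searches on that field only change if its type is adjusted as well.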

See the screenshots below for reference.

solr analysis screen 1

solr analysis screen 2

Abhijit Bashetti