I use the text_general field type from Solr's provided configuration to store the content of web pages, as follows:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
Field:
<field name="content" type="text_general" stored="true" indexed="true"/>
Say I have this entry in synonyms.txt:
ABC=>Apple Ball Company
If I search the content field with q=content:ABC on data that does not contain the phrase "Apple Ball Company" anywhere, I still get highlighting snippets for each of the words Apple, Ball and Company wherever they occur in my content, even when they are not in that sequence or not present together at all.
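For reference, a request of the kind described here could be built roughly as in the sketch below. The host, port and core name (localhost:8983, mycore) are placeholders, and hl=true with hl.fl=content is only an assumed highlighting setup, since the exact request parameters are not part of the question.

```python
from urllib.parse import urlencode

# Placeholder base URL: host, port and core name are assumptions.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def build_highlight_query(term, field="content"):
    """Return a select URL that searches `field` for `term` with highlighting on."""
    params = {
        "q": f"{field}:{term}",  # e.g. content:ABC
        "hl": "true",            # enable highlighting
        "hl.fl": field,          # highlight the same field that is searched
        "wt": "json",            # ask for a JSON response
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


url = build_highlight_query("ABC")
print(url)
```

The highlighting section of the response to such a request is where the separate Apple, Ball and Company snippets show up.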
I want highlighting only for the acronym ABC and/or only for the expansion "Apple Ball Company" (if these words appear together in the same sequence).
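To make the desired output concrete, here is a small toy sketch in plain Python. It is not Solr's highlighter and does not use any Solr API. The function name and the <em> markers are made up for illustration. It only shows which spans I would like highlighted: the acronym itself, or the expansion when its words appear as one contiguous phrase.

```python
import re


def highlight_phrase_or_acronym(text, acronym, expansion, pre="<em>", post="</em>"):
    """Wrap the acronym, or the full expansion as a contiguous phrase, in markers.

    Isolated words of the expansion are left untouched.
    """
    # Allow any run of whitespace between the words of the expansion.
    phrase = r"\s+".join(re.escape(word) for word in expansion.split())
    pattern = re.compile(rf"\b(?:{re.escape(acronym)}|{phrase})\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"{pre}{m.group(0)}{post}", text)


# Words present but scattered: nothing should be highlighted.
scattered = highlight_phrase_or_acronym(
    "An Apple a day; the Ball rolled to the Company gate.", "ABC", "Apple Ball Company"
)

# Acronym and contiguous phrase: both should be highlighted.
together = highlight_phrase_or_acronym(
    "ABC means Apple Ball Company.", "ABC", "Apple Ball Company"
)
print(scattered)
print(together)
```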