I am seeing very strange behavior with EdgeNgramField in Haystack with Elasticsearch.

Basically, if I have a BooleanField in my index, all Ngram and EdgeNgram fields behave just like a regular CharField. Does anyone know about this?

Here is my index (no boolean field, ngram search works just fine):

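from django.contrib.auth.models import User  # assumption: the standard Django User model
from django.db import models
from haystack import indexes

class UserIndex(indexes.SearchIndex, indexes.Indexable):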
  text = indexes.CharField(document=True)
  name = indexes.EdgeNgramField(model_attr='first_name')

  def get_model(self):
    return User

  def index_queryset(self, using=None):
    # Filter the user - return only the one with a UserProfile
    return self.get_model().objects.all().annotate(pcount=models.Count('userprofile')).filter(pcount__gt=0)

Here is the result of a search (looks good):

>>> for item in sqs.autocomplete(name='sebastien'):
...   print item.name
... 
Sebastien
Sebastien
sebastien
Sebastian
Sebastian
Sebastian
Juan Sebastián
sebtest
sebtest2
sebtest
Seetesh 
Serena
Selene
Severine
Severine
Sergio

Then I add a BooleanField:

class UserIndex(indexes.SearchIndex, indexes.Indexable):
  text = indexes.CharField(document=True)
  name = indexes.EdgeNgramField(model_attr='first_name')
  is_active = indexes.BooleanField(model_attr='is_active')

  def get_model(self):
    return User

  def index_queryset(self, using=None):
    # Filter the user - return only the one with a UserProfile
    return self.get_model().objects.all().annotate(pcount=models.Count('userprofile')).filter(pcount__gt=0)

And for some unknown reason, the EdgeNgramField now acts like a CharField (re-indexing completes without any error):

>>> for item in sqs.autocomplete(name='sebastien'):
...   print item.name
... 
Sebastien
Sebastien
sebastien
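
As a rough illustration of the difference between the two result lists (this is not Haystack's or Elasticsearch's actual implementation): an edge n-gram field indexes the leading prefixes of each term, so a name like "Serena" can match "sebastien" through the shared prefix "se", which would be consistent with the first output. A plain text field only matches whole terms, which is what the second output looks like. The sketch assumes min_gram=2 and max_gram=15, and the helper names are hypothetical.

```python
def edge_ngrams(term, min_gram=2, max_gram=15):
    """Return the leading prefixes of a lowercased term (assumed gram sizes)."""
    term = term.lower()
    upper = min(len(term), max_gram)
    return [term[:n] for n in range(min_gram, upper + 1)]


def shares_prefix_gram(query, value):
    """True if the two terms have at least one edge n-gram in common."""
    return bool(set(edge_ngrams(query)) & set(edge_ngrams(value)))


print(edge_ngrams("sebastien"))
print(shares_prefix_gram("sebastien", "Serena"))  # both start with "se"
print("sebastien" == "Serena".lower())            # whole-term comparison
```

With whole-term matching only the three "Sebastien" rows survive, which is exactly the symptom shown above.
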
sclaeys
  • Github issue with some more info about this: https://github.com/toastdriven/django-haystack/issues/1028 – sclaeys Jul 10 '14 at 14:42

1 Answer

I had several issues with Haystack + Elasticsearch; the most important one is described here, but it wasn't the only one. Among others, I ran into the same situation you describe.

I fixed the issue (although I think it is a Haystack bug) by adding "indexed=False" to the boolean field:

is_active = indexes.BooleanField(indexed=False, stored=True, model_attr='is_active')

You have to check your index mapping and make sure your fields are defined properly:

http://127.0.0.1:9200/{your_index}/_mapping

...
name: {
    type: "string",
    store: true,
    term_vector: "with_positions_offsets",
    index_analyzer: "edgengram_analyzer",
    search_analyzer: "analyzer"
},
is_active: {
    type: "boolean",
    store: true
},
...
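
A minimal sketch of how that check could be automated, assuming you have already loaded the JSON returned by the _mapping URL and extracted the "properties" dict of your document type. The function name and the expected-analyzer dict are hypothetical, and the sample data simply mirrors the excerpt above rather than coming from a live server.

```python
def fields_missing_analyzer(properties, expected):
    """Return the fields whose index-time analyzer differs from the expected one.

    properties: the "properties" dict of one document type from the mapping.
    expected:   hypothetical dict of {field_name: analyzer_name} to verify.
    """
    problems = []
    for field, analyzer in sorted(expected.items()):
        definition = properties.get(field, {})
        # The excerpt above uses "index_analyzer"; fall back to "analyzer" if absent.
        actual = definition.get("index_analyzer", definition.get("analyzer"))
        if actual != analyzer:
            problems.append(field)
    return problems


# Sample data shaped like the mapping excerpt above.
sample_properties = {
    "name": {
        "type": "string",
        "store": True,
        "term_vector": "with_positions_offsets",
        "index_analyzer": "edgengram_analyzer",
        "search_analyzer": "analyzer",
    },
    "is_active": {"type": "boolean", "store": True},
}

print(fields_missing_analyzer(sample_properties, {"name": "edgengram_analyzer"}))
```

An empty list means the field carries the expected analyzer; if "name" shows up in the result, the mapping was created without the edge n-gram analyzer and the field will behave like plain text.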

IMPORTANT: You have to re-create your index (for example with Haystack's rebuild_index management command) in order to see the change in the mapping and therefore in your search.

tufla
  • Thanks for the answer. However, I guess this won't work if you actually want to search by the is_active value. – sclaeys Jul 10 '14 at 14:27