I am new to Elasticsearch but need to use it to return a list of products. Please do not include answers or links to old answers that reference the deprecated tire gem.

Gemfile

ruby '2.2.0'
gem 'rails', '4.0.3'
gem 'elasticsearch-model', '~> 0.1.6'
gem 'elasticsearch-rails', '~> 0.1.6'

I have a couple models with relationships. I included the relationships below.

Models and Relationships

product.rb

  include Searchable

  belongs_to :family
  belongs_to :collection
  has_many :benefits_products
  has_many :benefits, :through => :benefits_products

  def as_indexed_json(options={})
    as_json(
        include: {:benefits => { :only => [ :id, :name ] },
                  :categories => { :only => [ :id, :name ] } }
    )
  end

collection.rb

  include Searchable

  has_many :products

  def as_indexed_json(options={})
    as_json(
      include: [:products]
    )
  end

family.rb

  include Searchable

  has_many :products

  def as_indexed_json(options={})
    as_json(
      include: [:products]
    )
  end

benefit.rb

  include Searchable

  has_many :benefits_products
  has_many :products, :through => :benefits_products

  def as_indexed_json(options={})
    as_json(
      include: [:products]
    )
  end

searchable.rb is just a concern that includes Elasticsearch and its callbacks in all models

module Searchable
  extend ActiveSupport::Concern

  included do
    include Elasticsearch::Model
    include Elasticsearch::Model::Callbacks

    settings index: { number_of_shards: 1, number_of_replicas: 0 } do
      mapping do

        indexes :id, type: 'long'
        indexes :name, type: 'string'
        indexes :family_id, type: 'long'
        indexes :collection_id, type: 'long'
        indexes :created_at, type: 'date'
        indexes :updated_at, type: 'date'

        indexes :benefits, type: 'nested' do
          indexes :id, type: 'long'
          indexes :name, type: 'string'
        end

        indexes :categories, type: 'nested' do
          indexes :id, type: 'long'
          indexes :name, type: 'string'
        end

      end
    end

    def self.search(options={})
      __set_filters = lambda do |key, f|

        @search_definition[:filter][:and] ||= []
        @search_definition[:filter][:and]  |= [f]
      end

      @search_definition = {
        query: {
          filtered: {
            query: {
              match_all: {}
            }
          }
        },
        filter: {}
      }

      if options[:benefits]
        f = { term: { "benefits.id": options[:benefits] } }

        __set_filters.(:collection_id, f)
        __set_filters.(:family_id, f)
        __set_filters.(:categories, f)
      end

      def as_indexed_json(options={})
        as_json(
          include: {:benefits => { :only => [ :id, :name ] },
                    :categories => { :only => [ :id, :name ] } }
        )
      end

      if options[:categories]
        ...
      end

      if options[:collection_id]
        ...
      end

      if options[:family_id]
        ...
      end

      __elasticsearch__.search(@search_definition)
    end

  end
end

ElasticSearch

I break down dash-separated slugs into the various families, collections and benefits. I am able to search for products with a specific family or collection and get correct results. I can also get results for a single benefit, but they don't appear to be accurate, and searching on multiple benefits yields strange results. I would like an "AND" combination across all searched fields, but my results don't look like the outcome of either "AND" or "OR", which confuses me as well.

What do I pass to the Product.search method to yield desired results?

Thanks for any help you can provide!

Edit

I have now verified that benefits are indexed on the products. I used curl -XGET 'http://127.0.0.1:9200/products/_search?pretty=1' which produced a json response that looked like this:

{
  "id":4,
  "name":"product name",
  "family_id":16,
  "collection_id":6,
  "created_at":"2015-04-13T12:49:42.000Z",
  "updated_at":"2015-04-13T12:49:42.000Z",
  "benefits":[
    {"id":2,"name":"my benefit 2"},
    {"id":6,"name":"my benefit 6"},
    {"id":7,"name":"my benefit 7"}
  ],
  "categories":[
    {"id":2,"name":"category 2"}
  ]
},
{...}

Now I just need to figure out how to search Elasticsearch for the product with benefits 2, 6, AND 7, which should return the example product above. I am specifically looking for the syntax to submit to the elasticsearch #search method to get the results of a nested "AND" query, the nested query setup/mappings (to make sure I have not missed anything), and any other relevant info you can think of to troubleshoot this.

Update

The Searchable concern has been updated to reflect the answer received. I translated the mapping json object to fit in the elasticsearch-model syntax. My remaining confusion occurs when I attempt to translate the query in a similar fashion.

Second Update

I am basing most of my searchable.rb concern on the elasticsearch-rails example app. I have updated searchable.rb to reflect this code, and while I am getting results, they are not the result of an "AND" execution. When I apply two benefits, I get all products that have either benefit.

Thomas

1 Answer

By default, if you rely on dynamic mapping to load the data, ES indexes inner objects as flat fields and hence loses the relation between the properties of each inner object. To maintain these relations we can use either nested objects or parent-child relations.
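
To illustrate the flattening described above, here is a simplified sketch in plain Ruby. It is not Elasticsearch code: `flatten_objects` is a hypothetical helper written only for this illustration, and real indexing also analyzes string values into tokens. It shows roughly what happens to an array of inner objects when it is not mapped as nested.

```ruby
# Hypothetical helper (an assumption for illustration, not an Elasticsearch API):
# collapses an array of inner objects into one multi-valued field per property.
def flatten_objects(field, objects)
  objects.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |object, flat|
    object.each { |key, value| flat["#{field}.#{key}"] << value }
  end
end

benefits = [
  { 'id' => 2, 'name' => 'my benefit 2' },
  { 'id' => 6, 'name' => 'my benefit 6' }
]

flat = flatten_objects('benefits', benefits)

# flat['benefits.id'] now holds every id and flat['benefits.name'] every name,
# but nothing records which id belonged to which name.
p flat
```

Once the values sit in separate multi-valued fields, a query on an id and a name together can no longer tell whether both came from the same benefit, which is the relation a nested mapping preserves.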

Here I will use nested objects to achieve the desired result:

Mapping:

PUT /index-3
{
  "mappings": {
    "products":{
      "properties": {
        "id": {
          "type": "long"
        },
        "name":{
          "type": "string"
        },
        "family_id":{
          "type": "long"
        },
        "collection_id":{
          "type": "long"
        },
        "created_at":{
          "type": "date"
        },
        "updated_at":{
          "type": "date"
        },
        "benefits":{
          "type": "nested",
          "include_in_parent": true,
          "properties": {
            "id": {
              "type": "long"
            },
            "name":{
              "type":"string"
            }
          }
        },
        "categories":{
          "type": "nested",
          "include_in_parent": true,
          "properties": {
            "id":{
              "type": "long"
            },
            "name":{
              "type":"string"
            }
          }
        }
      }
    }
  }
}

Note that I have mapped the child objects as nested and included them in the parent (include_in_parent).

Now some sample data:

PUT /index-3/products/4
{
  "name":"product name 4",
  "family_id":15,
  "collection_id":6,
  "created_at":"2015-04-13T12:49:42.000Z",
  "updated_at":"2015-04-13T12:49:42.000Z",
  "benefits":[
    {"id":2,"name":"my benefit 2"},
    {"id":6,"name":"my benefit 6"},
    {"id":7,"name":"my benefit 7"}
  ],
  "categories":[
    {"id":2,"name":"category 2"}
  ]
}
PUT /index-3/products/5
{
  "name":"product name 5",
  "family_id":16,
  "collection_id":6,
  "created_at":"2015-04-13T12:49:42.000Z",
  "updated_at":"2015-04-13T12:49:42.000Z",
  "benefits":[
    {"id":5,"name":"my benefit 2"},
    {"id":6,"name":"my benefit 6"},
    {"id":7,"name":"my benefit 7"}
  ],
  "categories":[
    {"id":3,"name":"category 2"}
  ]
}
PUT /index-3/products/6
{
  "name":"product name 6",
  "family_id":15,
  "collection_id":5,
  "created_at":"2015-04-13T12:49:42.000Z",
  "updated_at":"2015-04-13T12:49:42.000Z",
  "benefits":[
    {"id":5,"name":"my benefit 2"},
    {"id":55,"name":"my benefit 6"},
    {"id":7,"name":"my benefit 7"}
  ],
  "categories":[
    {"id":3,"name":"category 2"}
  ]
}

And now the query part:

GET index-3/products/_search
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "terms": {
          "benefits.id": [
            5,6,7
          ],
          "execution": "and"
        }
      }
    }
  }
}

Which produces the following result:

{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 1,
      "hits": [
         {
            "_index": "index-3",
            "_type": "products",
            "_id": "5",
            "_score": 1,
            "_source": {
               "name": "product name 5",
               "family_id": 16,
               "collection_id": 6,
               "created_at": "2015-04-13T12:49:42.000Z",
               "updated_at": "2015-04-13T12:49:42.000Z",
               "benefits": [
                  {
                     "id": 5,
                     "name": "my benefit 2"
                  },
                  {
                     "id": 6,
                     "name": "my benefit 6"
                  },
                  {
                     "id": 7,
                     "name": "my benefit 7"
                  }
               ],
               "categories": [
                  {
                     "id": 3,
                     "name": "category 2"
                  }
               ]
            }
         }
      ]
   }
}

At query time we have to use a terms filter with "execution": "and" so that it retrieves only the documents that contain all of the terms.
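
For the Rails side, the JSON query above can be written as a plain Ruby hash. The following is a minimal sketch, not a tested integration: `product_search_definition` is a hypothetical helper name, the option keys mirror the ones used in the question, and it assumes an Elasticsearch 1.x server, since the `filtered` query, the `and` filter and the terms `execution` option are, as far as I know, not available in later major versions.

```ruby
require 'json'

# Hypothetical helper (name and option keys are assumptions for illustration).
# Builds an Elasticsearch 1.x search definition as a plain Ruby hash.
def product_search_definition(options = {})
  filters = []

  # All given benefit ids must be present: terms filter with "and" execution.
  if options[:benefits]
    filters << { terms: { 'benefits.id' => Array(options[:benefits]), execution: 'and' } }
  end

  # Single-valued fields use one term filter each.
  filters << { term: { family_id: options[:family_id] } } if options[:family_id]
  filters << { term: { collection_id: options[:collection_id] } } if options[:collection_id]

  filtered = { query: { match_all: {} } }
  # Combine every clause with an "and" filter inside the filtered query.
  filtered[:filter] = { and: filters } unless filters.empty?

  { query: { filtered: filtered } }
end

definition = product_search_definition(benefits: [5, 6, 7], family_id: 16)
puts JSON.pretty_generate(definition)
```

The resulting hash could then be handed to `Product.__elasticsearch__.search(definition)`, the same call the concern in the question already makes. Note that the filter sits inside the `filtered` query, as in the JSON above, rather than at the top level of the request body.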

monu
  • Thanks for your response. I have updated my searchable.rb to reflect the indexing. I am still a bit confused about how to translate the query to fit the elasticsearch-rails DSL as shown here https://github.com/elastic/elasticsearch-rails/tree/master/elasticsearch-model – Thomas Apr 20 '15 at 15:08
  • Sorry for my mistake in the question, I have updated it to be more accurate. I have also updated searchable.rb to reflect my current progress. – Thomas Apr 22 '15 at 01:15
  • I gave you the bounty as I know your answer is correct. The bounty was expiring and I didn't want your effort to be wasted due to my poorly worded question. If you have any insights on the most recent update I would love to hear them. If not, I still appreciate your help and the progress I have made as a result. Thanks! – Thomas Apr 22 '15 at 21:28