I'm using Spring Boot 2.0.5, Spring Data Elasticsearch 3.1.0 and Elasticsearch 6.4.2

I have loaded Elasticsearch with a set of articles. Each article has a keywords field containing a list of keyword strings, e.g.

"keywords": ["Football", "Barcelona", "Cristiano Ronaldo", "Real Madrid", "Zinedine Zidane"],

Each user of the application can specify keyword preferences, each with a weight factor.

e.g.

User 1:
    keyword: Football, weight:3.0
    keyword: Tech, weight:1.0 
    keyword: Health, weight:2.0 

What I would like to do is find articles that match a user's keyword preferences, rank them by the weight factor of each preference (I think this relates to Elasticsearch boosting), and then sort by the most recent article time.

This is what I have so far (only for one keyword):

public Page<Article> getArticles(String keyword, float boost, Pageable pageable) {

    SearchQuery searchQuery = new NativeSearchQueryBuilder()
            .withQuery(QueryBuilders.matchQuery("keywords", keyword).boost(boost))
            .withPageable(pageable) // apply the requested page and size
            .build();
    return articleRepository.search(searchQuery);
}

Since a user may have any number of keyword preferences, what would I need to change in the code above to support this?

Any suggestions would be highly appreciated.

Solution

OK, I enabled logging so I could see the Elasticsearch query being produced. Then I updated the getArticles method to the following:

public Page<Article> getArticles(List<Keyword> keywords, Pageable pageable) {

    BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
    List<FilterFunctionBuilder> functions = new ArrayList<FilterFunctionBuilder>();

    for (Keyword keyword : keywords) {
        queryBuilder.should(QueryBuilders.termsQuery("keywords", keyword.getKeyword()));
        functions.add(new FunctionScoreQueryBuilder.FilterFunctionBuilder(
                QueryBuilders.termQuery("keywords", keyword.getKeyword()),
                ScoreFunctionBuilders.weightFactorFunction(keyword.getWeight())));
    }
    FunctionScoreQueryBuilder functionScoreQueryBuilder = QueryBuilders.functionScoreQuery(queryBuilder,
            functions.toArray(new FunctionScoreQueryBuilder.FilterFunctionBuilder[functions.size()]));

    NativeSearchQueryBuilder searchQuery = new NativeSearchQueryBuilder();
    searchQuery.withQuery(functionScoreQueryBuilder);
    searchQuery.withPageable(pageable); 
    // searchQuery.withSort(SortBuilders.fieldSort("createdDate").order(SortOrder.DESC));
    return articleRepository.search(searchQuery.build());
}

This produces the following Elasticsearch query:

{
  "from" : 0,
  "size" : 20,
  "query" : {
    "function_score" : {
      "query" : {
        "bool" : {
          "should" : [
            {
              "terms" : {
                "keywords" : [
                  "Football"
                ],
                "boost" : 1.0
              }
            },
            {
              "terms" : {
                "keywords" : [
                  "Tech"
                ],
                "boost" : 1.0
              }
            }
          ],
          "disable_coord" : false,
          "adjust_pure_negative" : true,
          "boost" : 1.0
        }
      },
      "functions" : [
        {
          "filter" : {
            "term" : {
              "keywords" : {
                "value" : "Football",
                "boost" : 1.0
              }
            }
          },
          "weight" : 3.0
        },
        {
          "filter" : {
            "term" : {
              "keywords" : {
                "value" : "Tech",
                "boost" : 1.0
              }
            }
          },
          "weight" : 1.0
        }
      ],
      "score_mode" : "multiply",
      "max_boost" : 3.4028235E38,
      "boost" : 1.0
    }
  },
  "version" : true
}
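
To make the "weight" and "score_mode" entries in the query above more concrete, here is a minimal plain-Java sketch of how the function weights combine for one article. It is not Elasticsearch code: the class name ScoreModeSketch and its methods are hypothetical, and it only models the combined function score. With the default boost_mode, that value is then multiplied with the score of the inner bool query, where a per-clause "boost" would apply instead.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; models how function weights combine.
public class ScoreModeSketch {

    // The example preferences from the question: keyword -> weight.
    public static Map<String, Double> examplePreferences() {
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("Football", 3.0);
        weights.put("Tech", 1.0);
        weights.put("Health", 2.0);
        return weights;
    }

    // score_mode "multiply": the weights of all functions whose filter
    // matches the article are multiplied together.
    public static double multiply(List<String> articleKeywords, Map<String, Double> weights) {
        double result = 1.0;
        for (Map.Entry<String, Double> entry : weights.entrySet()) {
            if (articleKeywords.contains(entry.getKey())) {
                result *= entry.getValue();
            }
        }
        return result;
    }

    // score_mode "sum": the weights of all matching functions are added.
    // The no-match case is not modelled here (this sketch returns 0); hits of
    // the query above always match at least one keyword.
    public static double sum(List<String> articleKeywords, Map<String, Double> weights) {
        double total = 0.0;
        for (Map.Entry<String, Double> entry : weights.entrySet()) {
            if (articleKeywords.contains(entry.getKey())) {
                total += entry.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> article = Arrays.asList("Football", "Health", "Barcelona");
        Map<String, Double> weights = examplePreferences();
        System.out.println("multiply: " + multiply(article, weights)); // 3.0 * 2.0 = 6.0
        System.out.println("sum:      " + sum(article, weights));      // 3.0 + 2.0 = 5.0
    }
}
```

Under "multiply", a keyword with weight 1.0 leaves the score unchanged, so an article matching only Tech gets no lift from its function, while under "sum" every matching preference adds to the score.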
Swordfish

1 Answer

What you are looking for is the function_score query. Something along these lines:

{
    "query": {
        "function_score": {
            "query": {
                "bool": {
                    "should": [
                        {"term":{"keywords":"Football"}},
                        {"term":{"keywords":"Tech"}},
                        {"term":{"keywords":"Health"}}
                    ]
                }
            },
            "functions": [
                {"filter":{"term":{"keywords":"Football"}},"weight": 3},
                {"filter":{"term":{"keywords":"Tech"}},"weight": 1},
                {"filter":{"term":{"keywords":"Health"}},"weight": 2}
            ]
        }
    }
}

For the Java API, see https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/java-compound-queries.html#java-query-dsl-function-score-query

sramalingam24
  • Great, thanks. Is this something that is supported by the Spring Data API? – Swordfish Oct 18 '18 at 16:17
  • It should be, according to the docs here: https://github.com/spring-projects/spring-data-elasticsearch/blob/master/README.md. You may have to dig into the docs to see how to do it; the query string example in the README would be a good start. – sramalingam24 Oct 18 '18 at 18:04
  • Thanks, I managed to do this via the API and added it to the question above. I have a couple of doubts, though. What is the difference between boost and weight? And how do I sort by interest preference first and then by most recent article creation date? I enabled the sort line that is commented out above, and the Tech article was displayed on top. – Swordfish Oct 19 '18 at 16:39
  • For the documentation on weight and boost, see https://www.elastic.co/guide/en/elasticsearch/reference/6.4/query-dsl-function-score-query.html – sramalingam24 Oct 19 '18 at 17:37
  • Can you post an excerpt of your results? It is hard to see what you want changed without seeing what it looks like now. – sramalingam24 Oct 19 '18 at 17:38
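
On the sorting question in the comments: a sort on createdDate alone replaces ordering by relevance, which would explain a recent Tech article landing on top. The plain-Java sketch below only models the two orderings with comparators; it is not Elasticsearch or Spring Data code, and the Hit class, the titles, the scores and the dates are all made up for illustration. In the actual query, the second ordering would presumably correspond to sorting on _score first and on createdDate second.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; models result ordering, not a real query.
public class SortOrderSketch {

    // Stand-in for a search hit: title, relevance score, creation time.
    public static final class Hit {
        final String title;
        final double score;
        final Instant createdDate;

        public Hit(String title, double score, String createdDateIso) {
            this.title = title;
            this.score = score;
            this.createdDate = Instant.parse(createdDateIso);
        }
    }

    // Newest first; the score plays no part in this ordering.
    public static final Comparator<Hit> BY_DATE_ONLY =
            Comparator.comparing((Hit h) -> h.createdDate).reversed();

    // Highest score first; the creation date only breaks ties (newest first).
    public static final Comparator<Hit> BY_SCORE_THEN_DATE =
            Comparator.comparingDouble((Hit h) -> h.score).reversed()
                    .thenComparing(Comparator.comparing((Hit h) -> h.createdDate).reversed());

    // Made-up hits: two Football articles (score 3.0) and one newer Tech article (score 1.0).
    public static List<Hit> exampleHits() {
        return Arrays.asList(
                new Hit("Old football match", 3.0, "2018-10-01T00:00:00Z"),
                new Hit("New football transfer", 3.0, "2018-10-15T00:00:00Z"),
                new Hit("Tech launch", 1.0, "2018-10-18T00:00:00Z"));
    }

    // Returns the titles of the hits in the given order, leaving the input untouched.
    public static List<String> titles(List<Hit> hits, Comparator<Hit> order) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(order);
        List<String> result = new ArrayList<>();
        for (Hit hit : sorted) {
            result.add(hit.title);
        }
        return result;
    }

    public static void main(String[] args) {
        // Date only: [Tech launch, New football transfer, Old football match]
        System.out.println(titles(exampleHits(), BY_DATE_ONLY));
        // Score, then date: [New football transfer, Old football match, Tech launch]
        System.out.println(titles(exampleHits(), BY_SCORE_THEN_DATE));
    }
}
```

Note that the date only acts as a tie-breaker when scores are exactly equal. Real relevance scores may differ slightly between articles, so the second ordering may need coarser or constant scores to behave like the example.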