Can someone explain to me what the difference between must_not and filter is in Elasticsearch?

E.g. here (taken from Elasticsearch: The Definitive Guide), why isn't must_not also used for the range?

{
    "bool": {
        "must":     { "match": { "title": "how to make millions" }},
        "must_not": { "match": { "tag":   "spam" }},
        "should": [
            { "match": { "tag": "starred" }}
        ],
        "filter": {
          "range": { "date": { "gte": "2014-01-01" }} 
        }
    }
}

Specifically looking at this documentation, it appears to me that they are exactly the same:

filter: The clause (query) must appear in matching documents. However unlike must the score of the query will be ignored. Filter clauses are executed in filter context, meaning that scoring is ignored and clauses are considered for caching.

must_not: The clause (query) must not appear in the matching documents. Clauses are executed in filter context meaning that scoring is ignored and clauses are considered for caching. Because scoring is ignored, a score of 0 for all documents is returned.

pjpscriv
schneida
  • Basically, filter = must but without scoring and must_not = !must (or !filter) – Val Nov 10 '17 at 15:53
  • I thought so too, but the second documentation suggests that both filter and must_not are executed in the filter context without scoring? – schneida Nov 10 '17 at 15:54
  • It makes no sense to use scoring for a must_not, since the matching documents are excluded from the search and hence cannot be scored – Val Nov 10 '17 at 15:56
  • You're probably wondering why they didn't name it `filter_not` instead of `must_not`? – Val Nov 10 '17 at 15:56
  • To my understanding, if something matches must_not, it is excluded from the result (no scoring), and if something doesn't match filter, it is also excluded from the result (no scoring). So apart from being negations of each other, they express the same thing, right? Where is the technical difference? The first documentation would suggest filter is faster, but if both operators can express the same properties, why even bother to offer a slower one... – schneida Nov 10 '17 at 15:59
  • Can you explain how you would create the same `must_not` constraint as above by using `filter` instead of `must_not`? – Val Nov 10 '17 at 16:04
  • Damn, you got me there! :-) So basically `filter` is more related to `must` than to `must_not`, but filter runs without scoring! Want to write that in a short answer? – schneida Nov 10 '17 at 16:07
  • Absolutely right, filter = must but without scoring, nothing more. – Val Nov 10 '17 at 16:08

2 Answers

filter is used when the matched documents should appear in the results, while must_not is used when the matched documents should be excluded from them. In more detail:

filter:

  1. It is executed in filter context.
  2. It does not affect the score of the result.
  3. Documents that match the clause will appear in the result.
  4. Exact match based, not partial match: each document either satisfies the clause or it does not, with no degree of relevance.

must_not:

  1. It is also executed in filter context.
  2. So it does not affect the score of the result either.
  3. Documents that match the clause will NOT appear in the result.
  4. Exact match based, in the same yes/no sense.
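
As a rough illustration of the two lists above, the sketch below combines both clauses and nothing else. It reuses the `date` and `tag` fields from the question's example; the outer `query` wrapper is the usual search request body and is not part of the original snippet. Since neither clause contributes to scoring, the surviving documents should all come back with the same score (0, going by the documentation quoted in the question), though that has not been verified here.

```json
{
  "query": {
    "bool": {
      "filter":   { "range": { "date": { "gte": "2014-01-01" }}},
      "must_not": { "match": { "tag": "spam" }}
    }
  }
}
```

A document is returned only if it passes the `filter` clause and does not match the `must_not` clause. Moving the range into `must` instead should select the same documents but let it participate in scoring.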


Soumendra
  • Can you elaborate a bit more on `exact match` based? I don't follow the effect the particular clause has on `exact match` – V1666 Oct 04 '22 at 00:53
  • @V1666, you may check this: https://stackoverflow.com/a/28768600/5014656 – Soumendra Nov 03 '22 at 04:53

Basically, filter = must but without scoring.

must_not expresses a condition that MUST NOT be met, while filter (and must) express conditions that MUST be met in order for a document to be selected.
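
To make this concrete for the range clause in the question, here is a sketch of what using must_not for the date would look like. The condition has to be inverted by hand (`lt` instead of `gte`), because must_not can only say what a document may not match. This is an illustration under the question's field names, not an exact equivalent of the original query.

```json
{
  "bool": {
    "must":     { "match": { "title": "how to make millions" }},
    "must_not": [
      { "match": { "tag": "spam" }},
      { "range": { "date": { "lt": "2014-01-01" }}}
    ],
    "should": [
      { "match": { "tag": "starred" }}
    ]
  }
}
```

One difference to keep in mind: a document with no `date` value does not match the inverted range, so it is not excluded by must_not, whereas the original `filter` clause would have rejected it. Stating the positive condition in `filter` is therefore the more direct way to express "the date must be on or after 2014-01-01".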

Val