
I am in the process of updating our system from Solr 4.1.0 to Solr 8.1.4. (Yes, I understand that is not the latest version available, but that is what has been approved for our system).

We regularly submit queries to find documents that "overlap" a time range. Let's say we have indexed fields "starttime_date" and "endtime_date". In case it matters, these fields were indexed as type TrieDateField in Solr 4.1.0, and in Solr 8.1.4 the fields are of type DatePointField.

Part of these "find overlapping documents" queries is to include any document that doesn't have an endtime_date value yet. So, the query would look like this:

(starttime_date:[* TO 2021-02-19T17:00:00.000Z] AND (endtime_date:[2021-02-19T15:00:00.000Z] OR (*:* NOT endtime_date:*)))

This should find all documents that started before 02/19/2021 at 17:00Z, and either haven't ended, or ended after 02/19/2021 at 15:00Z. I have it wrapped in parens here because this group of clauses is almost always "AND"ed with other clauses. Those other clauses are not what I am concerned about for this question.
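
For reference, the matching rule this query is meant to express can be sketched in plain Python, independent of Solr. This assumes the usual interval-overlap test (a document matches if it started at or before the window end and either has no end time or ended at or after the window start). The function and parameter names are purely illustrative.

```python
from datetime import datetime, timezone
from typing import Optional


def overlaps(start: datetime, end: Optional[datetime],
             window_start: datetime, window_end: datetime) -> bool:
    """Return True if a document's [start, end] overlaps the query window.

    A missing end time (None) is treated as "has not ended yet".
    Assumption: bounds are inclusive, like Solr's square-bracket ranges.
    """
    started_in_time = start <= window_end
    still_open_or_ended_late = end is None or end >= window_start
    return started_in_time and still_open_or_ended_late


window_start = datetime(2021, 2, 19, 15, 0, tzinfo=timezone.utc)
window_end = datetime(2021, 2, 19, 17, 0, tzinfo=timezone.utc)

# A document that started at 14:00Z and has no end time yet should match.
print(overlaps(datetime(2021, 2, 19, 14, 0, tzinfo=timezone.utc), None,
               window_start, window_end))
```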

This solution was built based on this answer to a similar question: https://stackoverflow.com/a/28859224/3586783

This solution worked in Solr 4.1.0, but doesn't appear to work in Solr 8.1.4. As soon as I add the OR (*:* NOT endtime_date:*) clause, it seems to match all documents. I have tried using -endtime_date:*, -endtime_date:[* TO *], !endtime_date:*, !endtime_date:[* TO *], and none of these have worked.

Is this something related to the change in field type (TrieDateField to DatePointField)? Our query syntax has not changed, but it appears that Solr is processing the query differently now.

Please let me know if more information is needed to understand the issue.

pacifier21
  • You're missing any interval specification for your `end_date`, but I assume that's just an error when adding it to your question. Do you have an example of a document being returned that shouldn't have been returned if you're only querying for `*:* -end_date:[* TO *]`? Make sure your Solr library (if you're using one) isn't escaping the `*`'s in your query as well. – MatsLindh Feb 19 '21 at 22:57
  • Using Solr 4.1.0, the query would work as I wrote it - without any interval specified for the endtime_date that is within the NOT clause. The same query executed on data indexed in Solr 8.1.4 does not have the same effect. If I remove everything after the "OR", then it will return documents that have a start time and end time that fall within the requested ranges. As soon as I add the OR back into the mix, it will return all documents. This is the problem I'm trying to address. Is there a different way (syntax, clause grouping, etc) that will provide the expected results? – pacifier21 Feb 20 '21 at 19:57
  • Which is why I asked you to only query for the `OR` part - does it give the results you expect then (i.e. does the set returned by that query match what you'd expect - if it does, then something else is weird)? Which query parser are you using, and do you have a minimal query, two minimal documents that show the wrong behaviour and a minimal schema definition that show it? Create a fresh collection and add your example there to replicate the issue. – MatsLindh Feb 20 '21 at 20:13
  • If I only use any of the following: `*:* NOT endtime_date:*`, `*:* -endtime_date:*`, `*:* !endtime_date:*`, `*:* NOT endtime_date:[* TO *]`, `*:* -endtime_date:[* TO *]`, `*:* !endtime_date:[* TO *]` then I get essentially all documents. Does this indicate a schema issue? My schema is set to index any "*_date" fields as DatePointField (used to be TrieDateField). Are DatePointField types indexed differently than TrieDateField types in a way that has changed this query functionality? – pacifier21 Feb 22 '21 at 16:24
  • Apparently, `*:* NOT endtime_date:[* TO *]` seems to work (even though I suggested before that it didn't). Should I expect this same syntax (e.g., providing a range [* TO *]) to work for other types of indexed fields? Booleans? Strings/Text? Or, is the syntax different for other indexed field types? Thanks for your patience and help! – pacifier21 Feb 23 '21 at 19:19
  • Good! That lines up with my expectation. I'd think that `(*:* -endtime_date:[* TO *])` would as well. It should work for any field type. In effect it means that you want a range "from the start of all tokens to the end of all tokens for this field"; if the field has any tokens at all, it will match. – MatsLindh Feb 23 '21 at 19:43
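
Putting the comments together, a minimal sketch of how the full overlap query might be assembled with the explicit `[* TO *]` range form is shown below. The helper name `overlap_query` is hypothetical, and the end-time range `[<window start> TO *]` is an assumption, since the query in the question omits the interval. This only builds the query string; it has not been verified against a Solr 8 instance.

```python
def overlap_query(window_start: str, window_end: str,
                  start_field: str = "starttime_date",
                  end_field: str = "endtime_date") -> str:
    """Build a Solr query string for documents overlapping a time window."""
    # Documents that started at or before the end of the window.
    started = f"{start_field}:[* TO {window_end}]"
    # Assumption: documents that ended at or after the start of the window.
    ended_late = f"{end_field}:[{window_start} TO *]"
    # Documents with no end time: match everything, then exclude any document
    # that has a value, using the explicit [* TO *] range from the comments.
    no_end = f"(*:* NOT {end_field}:[* TO *])"
    return f"({started} AND ({ended_late} OR {no_end}))"


print(overlap_query("2021-02-19T15:00:00.000Z", "2021-02-19T17:00:00.000Z"))
```

The resulting string keeps the outer parentheses so it can still be "AND"ed with other clauses, as described in the question.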

0 Answers