I'm new to Elastic. I'm attempting to do a proof-of-concept for professional reasons. So far I'm very impressed. I've indexed a bunch of data and have run a few queries - almost all of which are super fast (thumbs up).
The only issue I'm encountering is that my date range query seems relatively slow compared to all my other queries. We're talking 1000ms+ compared to <100ms for everything else.
I am using the NEST .NET library.
My document structure looks like this:
{
  "tourId":"ABC123",
  "tourName":"Super cool tour",
  "duration":12,
  "countryCode":"MM",
  "regionCode":"AS",
  ...
  "availability":[
    {
      "startDate":"2021-02-01T00:00:00",
      ...
    },
    {
      "startDate":"2021-01-11T00:00:00",
      ...
    }
  ]
}
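Because the queries in this post use a nested query, availability is presumably mapped as a nested field with startDate as a date. The actual mapping is not shown here, so the following is only a sketch of what it is assumed to look like, restricted to the fields that appear in the sample document:

```json
{
  "mappings": {
    "properties": {
      "availability": {
        "type": "nested",
        "properties": {
          "startDate": { "type": "date" }
        }
      }
    }
  }
}
```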
I'm trying to get all tours which have availability within a certain month. I am using a date range to do this. I'm not sure if there's a more efficient way to do this? Please let me know if so.
I have tried the following query:
var response = await elastic.SearchAsync<Tour>(s => s
    .Query(q => q
        .Nested(n => n
            .Path(p => p.Availability)
            .Query(nq => nq
                .DateRange(r => r
                    .Field(f => f.Availability.First().StartDate)
                    .GreaterThanOrEquals(new DateTime(2020, 07, 01))
                    .LessThan(new DateTime(2020, 08, 01))
                )
            )
        )
    )
    .Size(20)
    .Source(s => s.IncludeAll().Excludes(e => e.Fields(f => f.Availability)))
);
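For reference, the NEST call above should correspond roughly to the following raw search request body. This is a sketch rather than captured output: it assumes NEST's default camelCase field-name inference (consistent with the sample document) and an ISO 8601 rendering of the two DateTime values.

```json
{
  "size": 20,
  "_source": {
    "includes": ["*"],
    "excludes": ["availability"]
  },
  "query": {
    "nested": {
      "path": "availability",
      "query": {
        "range": {
          "availability.startDate": {
            "gte": "2020-07-01T00:00:00",
            "lt": "2020-08-01T00:00:00"
          }
        }
      }
    }
  }
}
```

Sending this body directly to the index's _search endpoint would be one way to check whether the time is spent in Elasticsearch itself or on the client side.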
I basically followed the example in the NEST documentation here: https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/writing-queries.html#structured-search but I'm not sure that this is the best way for me to achieve this. Is a date range query just naturally slower than other queries, or am I doing something wrong?
EDIT:
I tried adding a new field named YearMonth, which is just an integer representing the year and month of each availability in the format yyyyMM, and querying against that instead. The timing was also around one second. This makes me wonder whether the problem is not the date range at all but something else entirely.
I have run the profiler on my query and the result is below. I have no idea what most of it means, so if someone does and can give me some help, that would be great:
Query:
var response = await elastic.SearchAsync<Tour>(s => s
    .Query(q => q
        .Nested(n => n
            .Path(p => p.Availability)
            .Query(nq => nq
                .Term(t => t
                    .Field(f => f.Availability.First().YearMonth)
                    .Value(202007)
                )
            )
        )
    )
    .Size(20)
    .Source(s => s.IncludeAll().Excludes(e => e.Fields(f => f.Availability)))
    .Profile()
);
Profile:
{
  "Shards":[
    {
      "Aggregations":[],
      "Id":"[pr4Os3Y7RT-gXRWR0gxoEQ][tours][0]",
      "Searches":[
        {
          "Collector":[
            {
              "Children":[
                {
                  "Children":[],
                  "Name":"SimpleTopDocsCollectorContext",
                  "Reason":"search_top_hits",
                  "TimeInNanoseconds":6589867
                }
              ],
              "Name":"CancellableCollector",
              "Reason":"search_cancelled",
              "TimeInNanoseconds":13981165
            }
          ],
          "Query":[
            {
              "Breakdown":{
                "Advance":5568,
                "BuildScorer":2204354,
                "CreateWeight":25661,
                "Match":0,
                "NextDoc":3650375,
                "Score":3795517
              },
              "Children":null,
              "Description":"ToParentBlockJoinQuery (availability.yearMonth:[202007 TO 202007])",
              "TimeInNanoseconds":9686512,
              "Type":"ESToParentBlockJoinQuery"
            }
          ],
          "RewriteTime":36118
        }
      ]
    }
  ]
}