13

I am working on an Airbnb-like website and I am in the process of rewriting our in-house, SQL-based search system with Algolia. It's been a really pleasant journey so far, as I have managed to remove a lot of legacy code and outsource it, with awesome results. However, there is one critical piece of our search system which I am not sure can be implemented with Algolia.

Internally, we store the availability/unavailability (and price) of each date for each asset as a single row in the database. This means our availabilities table looks like this:

asset_id | date       | status      | price_cents
-------- | ---------- | ----------- | -----------
1        | 2017-02-09 | available   | 15000
1        | 2017-02-10 | available   | 15000
1        | 2017-02-11 | unavailable | NULL
1        | 2017-02-12 | available   | 20000

When a user searches for available properties, they enter a date range and, optionally, a price range.

What we're doing now is simply querying the availabilities table and making sure that all dates in the date range are available for that asset (i.e. the count of available dates is equal to the number of days in the range). If the user enters a price range, we also make sure that the average price for those dates is within the requested range. The SQL query is fairly complex, but this is what it does at the end of the day.

I have been trying to replicate this with Algolia, but couldn't find any documentation about a similar feature. In fact, I am facing two separate issues right now:

  • I have no way to ensure all dates in the provided date range are available, because Algolia has little to no knowledge about associations, and
  • I have no way to calculate (and query) the average price for the provided date range, because it depends on user input (i.e. the date range).

Is there a way to achieve this with Algolia? If not, is it feasible to use SQL or another tool in combination with Algolia to achieve the desired result? Of course, I could do all of this with Elasticsearch, but Algolia is so fast and easy that I'd hate to step away from it because of these issues.

Alessandro Desantis
  • 14,098
  • 1
  • 26
  • 32
  • Have you found a solution to this? I'm very curious to know how it's done. I've randomly been checking out a housing rental website and I thought of this and now I really need to scratch this itch! – Rami Awar Aug 27 '22 at 20:32

1 Answers1

2

This use-case is definitely complex, and Algolia needs precomputed data in order to work.


Edit 2020 (better solution)

In each item, you could simply store the list of days where the location is available, e.g.

{
  name: "2 bedroom appartment",
  location: "Paris",
  availabilities: ['2020-04-27', '2020-04-28', '2020-04-30']
  price_cents: 30000
}

You could then, at search time, generate the list of all the availabilities you require your items to have, e.g. (available from April 28th to April 30th):

index.search('', {
  filters: '' +
    'availabilities:2020-04-28 AND availabilities:2020-04-29 AND availabilities:2020-04-30 AND ' +
    'price_cents >= ' + lowPriceRange + ' AND price_cents <= ' + highPriceRange 
}) 

In this example, the record wouldn't match as it lacks 2020-04-29.


Another solution, which works more generically, but requires way more records:

I'm assuming there is a cap of the amount of days in advance you can book, I'll assume here it's 90 days.
You could generate every date range possible inside those 90 days.
This would mean generating 90 + 89 + ... = 90 * 91 / 2 = 4095 date ranges.
Then for each of those ranges, and each of the flats you're offering on your service, you could generate an object like this:

{
  name: "2 bedroom appartment",
  location: "Paris",
  availability_range: "2017-02-09 -> 2017-02-10",
  availability_start_timestamp: 10001000,
  availability_end_timestamp: 10002000,
  price_cents: 30000
}

With those objects, then searching for an date range would be as easy as:

index.search('', {
  filters: '' +
    'availability_range:"' + startDate + ' -> ' + endDate + '" AND ' +
    'price_cents >= ' + lowPriceRange + ' AND price_cents <= ' + highPriceRange 
}) 

You would only be indexing available time ranges, so this should greatly reduce the amount of objects, but it would still be probably huge.

Finally, the timestamps in the object would be here to know which ones to delete when a booking is made. The call would be something like:

index.deleteByQuery('', {
  filters: 'availability_start_timestamp < ' + booking_end_timestamp + ' AND availability_end_timestamp > ' + booking_start_timestamp
})
Jerska
  • 11,722
  • 4
  • 35
  • 54
  • Unfortunately we store availabilities for the entire year. Is there any way I can use SQL to filter before or after Algolia? – Alessandro Desantis Feb 09 '17 at 20:46
  • Then you would be looking at 66795 date ranges per flat, which would probably be way too expensive, indeed. – Jerska Feb 10 '17 at 13:18
  • Is the list of results always expected to be low (<1000)? For instance, are you also restricting by location? If you are, you could maybe use Algolia to get a list of objectIDs matching the text + location query and send this list of IDs to MySQL to compute the available ones out of this list. – Jerska Feb 10 '17 at 13:22
  • You would lose Algolia's speed but would still benefit from its textual relevance. – Jerska Feb 10 '17 at 13:23
  • For the time being, we won't have many results, but going forward we might. How well would such a solution scale? What techniques could I adopt to make it more scalable? – Alessandro Desantis Feb 10 '17 at 16:45
  • Well, as soon as your Algolia results list goes over 1000 items, you'll probably get stuck. You can increase this limit by an order of magnitude, but this won't be infinitely scalable. But I would guess that by then, you would have resources who would be dedicated on working on that. :) – Jerska Feb 22 '17 at 10:41