
My problem is the following: I have a query that searches for all the products containing an input string, and from that result set I am picking 10 random entries. The query looks something like this:

SELECT * FROM products WHERE text &@* 'salt' AND random() < 0.01 LIMIT 10 OFFSET 0;

This works flawlessly as long as the result set is big. But if the search for the string returns only a handful of rows (in the range of 10 or below), chances are that I will get nothing back. For a result set of 10 rows or fewer a condition like random() < 1 would work better, but that won't work for big sets: since it is always true, the query just returns the first 10 rows and I get the same entries every time. Implementing this logic at the middleware level is possible, but obviously comes with a performance penalty.
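
For reference, the middleware version I would like to avoid is two round trips, with the application picking the threshold from the count (the 0.01 below stands in for whatever value the application would choose):

    -- round trip 1: how many rows match the search?
    SELECT count(*) FROM products WHERE text &@* 'salt';

    -- round trip 2: the application picks a threshold based on that count
    -- (e.g. 1 for counts of 10 or below, 0.01 for large counts) and runs the real query
    SELECT * FROM products WHERE text &@* 'salt' AND random() < 0.01 LIMIT 10 OFFSET 0;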

So my question is: is there a way to define this condition dynamically at the SQL level, based on the size of the result set returned by the search itself?
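
To illustrate what I mean by "dynamically": something along these lines is roughly what I am hoping is possible. The GREATEST(0.01, 10.0 / total) threshold is just a placeholder to show the idea, not a tested solution:

    -- count the matches once with a window function, then scale the
    -- probability threshold so that small result sets effectively use random() < 1
    WITH matches AS (
        SELECT p.*, count(*) OVER () AS total
        FROM products p
        WHERE text &@* 'salt'
    )
    SELECT *
    FROM matches
    WHERE random() < GREATEST(0.01, 10.0 / total)
    LIMIT 10;

I am not sure whether the window count here would throw away the speed advantage of the plain random() < # filter, which is part of why I am asking.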

My PostgreSQL version is: PostgreSQL 15.2 (Debian 15.2-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit

PS: I want to stick with the random() < # approach to get random values, as it is the fastest approach I have been able to test so far, and speed is very important in my case.

Yonoss
  • Have you tried `SELECT * FROM products WHERE text &@* 'salt' order by random() LIMIT 10 ;`? – Adrian Klaver Mar 27 '23 at 16:01
  • Yes, but this approach is slower than AND random() < # – Yonoss Mar 27 '23 at 16:20
  • Does this answer your question? [Best way to select random rows PostgreSQL](https://stackoverflow.com/questions/8674718/best-way-to-select-random-rows-postgresql) – Bergi Mar 27 '23 at 16:35
  • 1
    Yes, `order by random()` will be slower, how much depends on data volumn. But it always gives the correct answer even with small data volumes, which not doing so is your issue. *A slower query which gives the correct result is always preferred to an extremely fast one that gives incorrect results.* – Belayer Mar 27 '23 at 18:16

0 Answers