mariadb create function returning random id

Question

I want to create a database function returning a random row from a database table.

I have following table

create table category
(
    id    bigint       not null primary key,
    color varchar(255) null
);

and I create a function with:

DELIMITER //
create or replace function random_category() returns int
begin
    return (select cc.id from category cc order by rand() limit 1);
end //
DELIMITER ;

When calling select random_category(); I always get a single result.

But when calling

select * from category c where id = random_category();

I receive empty result, multiple line results and single line results.

I am using 10.11.3-MariaDB-1:10.11.3+maria~ubu2204

Do you really want to add a randomiser to the Primary Key? I thought Primary Keys were auto-increment, so making it random shouldn't be necessary to keep an unique value, and is better for a seperate column. — Steven, Jun 08 '23 at 11:42
Is it possible you have oversimplified your question? `select c.*` doesn't work because you haven't assigned `c` as an alias in the query you showed us. Please [edit] your question. — O. Jones, Jun 08 '23 at 11:43
@Steven the use case is that I want to create some dummy data and want to find a random element to assign to another element. — Punching, Jun 09 '23 at 07:32

score 3 · Answer 1 · answered Jun 08 '23 at 14:25

It is not a bug, it is expected behavior.

Let's move your function into a subquery and let's use a sequence with a few values:

# Attempt 1
select seq from seq_1_to_3 where seq=(select seq from seq_1_to_3 order by rand() limit 1);
Empty set (0,001 sec)

# Attempt 2
select seq from seq_1_to_3 where seq=(select seq from seq_1_to_3 order by rand() limit 1);
+-----+
| seq |
+-----+
|   3 |
+-----+
1 row in set (0,002 sec)

Since we use a subquery instead of a function, EXPLAIN will be a little bit more verbose:

explain select seq from seq_1_to_3 where seq=(select seq from seq_1_to_3 order by rand() limit 1);
+------+----------------------+------------+-------+---------------+---------+---------+-------+------+----------------------------------------------+
| id   | select_type          | table      | type  | possible_keys | key     | key_len | ref   | rows | Extra                                        |
+------+----------------------+------------+-------+---------------+---------+---------+-------+------+----------------------------------------------+
|    1 | PRIMARY              | seq_1_to_3 | const | PRIMARY       | PRIMARY | 8       | const | 1    | Using where; Using index                     |
|    2 | UNCACHEABLE SUBQUERY | seq_1_to_3 | index | NULL          | PRIMARY | 8       | NULL  | 3    | Using index; Using temporary; Using filesort |
+------+----------------------+------------+-------+---------------+---------+---------+-------+------+----------------------------------------------+

UNCACHEABLE SUBQUERY means, that the result of the subquery cannot be stored in subquery cache and must be executed for each comparison.

Let's assume in first attempt the subquery returned 3,1 and 1, in second it returned 2,1 and 3. In first attempt there was no matching ( 1 != 3, 2 != 1 and 3 != 1), while in 2nd attempt 3 matched 3.

See also Correlated Subqueries (Wikipedia).

To avoid this, you could just change your SQL statement to

SELECT * from category ORDER BY RAND() LIMIT 1

However ORDER BY RAND() is very slow, I would suggest you to read Rick James' excellent article "Fetching random rows from a table".

Thank you so much for this explanation! This helps a lot to understand. — Punching, Jun 09 '23 at 07:33

Akina · Answer 2 · 2023-06-08T12:07:42.490

Your function is called for each separate row independently. Each call generates new single id value. So the amount of output rows vary.

You must call the function once. For example, with

select category.* 
from category 
JOIN (SELECT random_category() AS id) AS criteria USING (id);

Also you may try to define your function as DETERMINISTIC:

create or replace function random_category() returns int DETERMINISTIC
begin
    return (select cc.id from category cc order by rand() limit 1);
end

AFAIR in this case the function output is treated as a constant (it have no arguments), and it should be called once.. but I'm not sure.

mariadb create function returning random id

2 Answers2

Linked