Initialise user defined variable in mysql with in a subquery

Question

I was going through this answer How do you select every n-th row from mysql. In that I am not able to understand the initialisation in the following subquery.

SELECT 
    @row := @row +1 AS rownum, [column name] 
FROM ( 
    SELECT @row :=0) r, [table name]

How exactly the initialisation of

SELECT @row :=0

is working?

Is some kind of join happening between table ‘r’ and ‘table name’?

If I change above query as below, would there be any difference in the performance?

SET @row = 0;

SELECT @row := @row +1 AS rownum, [column name] FROM [table name]

Please share your thoughts.

It is a comma based old-style Implicit Join syntax. You should avoid it and use Modern [Explicit JOIN based syntax](https://stackoverflow.com/q/5654278/2469308) — Madhur Bhaiya, Oct 01 '19 at 16:31
First form works by chance rather than by design, and it might stop working at any moment. Second form is better but if you want Nth row, upgrade to MySQL 8+ and use `row_number() over(...)` — Caius Jard, Oct 01 '19 at 16:32
Now, in this problem, it is a JOIN without any matching condition. So basically, every row from first table is joined to every other row from second table (a cartesian product), which is formally called CROSS JOIN. And this CROSS JOIN ensures that every row has access to the initialized variable — Madhur Bhaiya, Oct 01 '19 at 16:33
@CaiusJard: the first form relies on behavior that is *not guaranteed*; describing that as "by chance" and "might stop working at any moment" misses the mark. The observed behavior is repeatable. It's the result of code paths in MySQL server, not the result of random events. Maybe what you were meaning to point out is the MySQL Reference Manual specific warning about there not being any guarantee as to the order of operations when assigning and evaluating user-defined variables within a single statement. The order of operations might be unexpected, but it's not determined "by chance". — spencer7593, Oct 02 '19 at 14:19
@spencer7593 OK, that's the level of detail I was considering going to but didn't [need to, because someone else would surely be along to fill in]. By "at any moment" I meant the random event of "when the DBA upgrades to MySQL version X whereupon this 'feature' goes away, long after you've left the company, silently breaking your code with the possibility that there's no-one around to fix it quickly". => Use a ratified, documented, supported feature — Caius Jard, Oct 02 '19 at 15:48
excerpts from MySQL 8.0 Reference Manual: https://dev.mysql.com/doc/refman/8.0/en/user-variables.html "The order of evaluation for expressions involving user variables is undefined. For example, there is no guarantee that `SELECT @a, @a:=@a+1` evaluates @a first and then performs the assignment." and "Previous releases of MySQL made it possible to assign a value to a user variable in statements other than `SET`. This functionality is supported in MySQL 8.0 for backward compatibility but is subject to removal in a future release of MySQL." — spencer7593, Oct 02 '19 at 16:47

score 2 · Accepted Answer · answered Oct 01 '19 at 17:19

Using two statements, initializing the user-defined variable in a separate statement will be equivalent performance.

Instead of the SET statement, we could do

SELECT @row : = 0

which would achieve the same result, assigning a value to the user-defined variable @row. The difference would that MySQL needs to prepare a resultset to be returned to the client. We avoid that with the SET statement, which doesn't return a resultset.

With two separate statement executions, there's the overhead of sending an extra statement: parsing tokens, syntax check, semantic check, ... and returning the status to the client. It's a small amount of overhead. We aren't going to notice it onesie-twosie.

So performance will be equivalent.

I strongly recommend ditching the oldschool comma syntax for join operation, and using the JOIN keyword instead.

Consider the query:

SELECT t.foo 
  FROM r 
 CROSS
  JOIN t
 ORDER BY t.foo

What happens when the table r is guaranteed to contain exactly one row?

The query is equivalent to:

SELECT t.foo 
  FROM t
 ORDER BY t.foo

We can use a SELECT query in place of a table or view. Consider for example:

SELECT v.foo 
  FROM ( SELECT t.foo 
           FROM t
       ) v

Also consider what happens with this query:

SELECT @foo := 0

There is no FROM clause (or Oracle-style FROM dual), so the query will return a single row. The expression in the SELECT list is evaluated... the constant value 0 is assigned to the user-defined variable @foo.

Consider this query:

SELECT 'bar'
  FROM ( SELECT @foo := 0 ) r

Before the outer query runs, the SELECT inside the parens is executed. (MySQL calls it an "derived table" but more generically it's an inline view definition.) The net effect is that the constant 0 is assigned to the user-defined variable, and a single row is returned. So the outer query returns a single row.

If we understand that, we have what we need to understand what is happening here:

 SELECT t.mycol
   FROM ( SELECT @row := 0 ) r
  CROSS
   JOIN mytable t
  ORDER
     BY t.mycol

Inline view r is evaluated, the SELECT returns a single row, the value 0 is assigned to user-defined variable @row. Since r is guaranteed to return a single row, we know that the Cartesian product (cross join) with mytable will result in one row for each row in mytable. Effectively yielding just a copy of mytable.

To answer the question that wasn't asked:

The benefit of doing the initialization within the statement rather than a separate statement is that we now have a single statement that stands alone. It knocks out a dependency i.e. doesn't require a separate execution of a SET statement to assign the user defined variable. Which also cuts out a roundtrip to the database to prepare and execute a separate statement.

Spencer, I had one doubt, If I use the set way then I don't need to cross join right? should not following enough? SET @row = 0; SELECT /@row := /@row +1 AS rownum, [column name] FROM [table name] — nantitv, Oct 03 '19 at 15:09
Added / before @row to avoid stack overflow warning " you can not refer these many users" — nantitv, Oct 03 '19 at 15:10
I ran the above queries and it worked for me. I want to ensure that it was not working accidentally — nantitv, Oct 03 '19 at 15:11
The reason for removing cross join was, I have big table with million rows and I found "set @row = 0" version took less execution time — nantitv, Oct 03 '19 at 15:12
Using MySQL 8.0 (and no need for backwards compatibility for earlier versions), my preference would be to use `ROW_NUMBER` window (analytic) function https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html#function_row-number — spencer7593, Oct 03 '19 at 15:20

Initialise user defined variable in mysql with in a subquery

1 Answers1