0

I've migrated the database from AWS Aurora 1.22.3 (which is compatible with MySQL 5.6) to MySQL 8 and have one query which return the account and it's parent account (or child account if it is parent account).

The account table will have:

account_id (primary key), parent_account_id (foreign key), account_name,...

For example I have account 520 with parent account 519:

account_id,tenant_id,parent_account_id,account_name,account_code,account_class,account_type,account_description,status,is_master_account,currency_code,tax_type,pending_balance,authorised_balance,total_balance,related_party_id,creator_id
519,1,NULL,"SANTOS LIMITED - Trade1",000604,ASSET,FINRECEIVABLE,NULL,ACTIVE,1,AUD,BASEXC,0.000000,0.000000,0.000000,321,1
520,1,519,"SANTOS LIMITED - Trade Card1",000604-1,ASSET,FINRECEIVABLE,NULL,ACTIVE,0,AUD,BASEXC,0.000000,0.000000,0.000000,321,1

Here is my query:

    SELECT 
        t_all.account_id -- , parent_level, t_all.level 
    FROM ( 
        -- GET ALL CHILDREN 
        SELECT 
            account_id, 
            parent_account_id, 
            null as parent_level, 
            (@l:=@l + 1) AS level 
        FROM 
        ( 
            SELECT 
                account_id, 
                tenant_id, 
                parent_account_id 
            FROM Account 
            ORDER BY 
                parent_account_id, 
                account_id 
        ) account_sorted, 
        ( 
            SELECT @pv := 520, @l := 0, @cl := 0 
        ) initialisation 
        WHERE 
        account_id = @pv OR 
        find_in_set(parent_account_id, @pv) > 0 
        AND 
        @pv := concat(@pv, ',', account_id) 
        
        UNION 
        
        -- GET ALL PARENTS 
        SELECT 
            account_id, 
            parent_account_id, 
            level as parent_level, 
            null as level 
        FROM 
        ( 
            SELECT 
                _id AS account_id, 
                parent_account_id, 
                @cl := @cl + 1 AS level 
            FROM 
            ( 
                SELECT 
                    @r AS _id, 
                    ( 
                        SELECT @r := parent_account_id 
                        FROM Account 
                        WHERE account_id = @r 
                    ) AS parent_account_id, 
                    @l := @l + 1 AS level 
                FROM 
                ( 
                    SELECT @r := 520, @l := 0, @cl := 0 
                ) vars, 
                Account h 
                WHERE 
                    @r <> 0 
                ORDER BY level DESC 
            ) qi 
        ) qo 
    ) as t_all 
    order by level desc , parent_level asc; 

The old version (MySQL 5) will return 3 records (520,519,520), while MySQL 8 only return one account_id 520. The expected output would be the same with old version.

What do you think could cause the issue and what should I do to ensure that query result consistent when migrating database version?

Thank you so much for your help.

Hung
  • 79
  • 2
  • 8
  • 2
    Support for user variables as used above has been deprecated in MySQL 8. You should be using either a recursive CTE and/or analytic functions. Sample data would help here. – Tim Biegeleisen Nov 04 '21 at 07:29
  • Thanks @TimBiegeleisen, I've updated sample data in the question. I should do the search about user variables and see how it affects the query. – Hung Nov 04 '21 at 07:36
  • 2
    Read about recursive hierarchical queries using MySQL 8+. – Tim Biegeleisen Nov 04 '21 at 07:39

1 Answers1

1

The general reason for this behaviour is that MySQL uses some optimization that is not compatible with your use of variables. You use quite a lot of variables, so I am not trying to find out which optimizations hit you here specifically, but see e.g. my answer here for an example. In general, MySQL makes some assumption about your subqueries that are not necessarily correct anymore if the values of your variables changes.

The general solution is to prevent MySQL from doing those optimization, which can, for now, universally be done by materialization. You can achieve this by adding an arbitrary large limit to all of your subqueries, e.g.

... FROM Account 
ORDER BY parent_account_id, account_id 
LIMIT 100000000 -- add this
...

... FROM Account WHERE account_id = @r 
LIMIT 100000000 -- add this
...

This has the effect that MySQL will actually generate all rows of your queries and subsequently evaluate your variables (probably) the way you want them to, so you should get the result you expect. (And if you don't, you probably forgot some limits, so try to add it to more places).

Generally though, although it usually and practically works as expected, the use of variables has officially been fragile even before MySQL 8, see e.g. the documentation

For other statements, such as SELECT, you might get the results you expect, but this is not guaranteed. In the following statement, you might think that MySQL evaluates @a first and then does an assignment second:

 SELECT @a, @a:=@a+1, ...;

However, the order of evaluation for expressions involving user variables is undefined.

And as a more general warning: your way of using variables is deprecated since MySQL 8, and will probably cause a syntax error in some future version of MySQL. You may want to look into (recursive) Common Table Expressions and How to create a MySQL hierarchical recursive query? for some pointers how to rewrite your query without variables.

Solarflare
  • 10,721
  • 2
  • 18
  • 35