0

My application needs to get some basic data from a user table with primary key user_id - and various other data about the user from secondary tables, each of which has user_id as a foreign key. There are a bunch of these secondary tables such as name, addresss, phone, etcetera - things about a person that can change over time.

More specifically, I need only some values from the most recent row from each secondary table. Each table has a "latest" column which is unix timestamp of the most recent UPDATE or INSERT (we must not delete in this application).

The following works correctly:

SELECT u.username, u.user_id, u.password, u.email, u.active
    , n.first , n.middle , n.last
    , uo.organization_id  /* , other_cols_from_other_tables */
FROM user u
    LEFT JOIN user_org uo ON (uo.user_id = u.user_id AND
        uo.latest in (select max(latest) from name uo1
                      where uo1.user_id = u.user_id))
    /* here, other LEFT JOINs like the above one */
WHERE u.username = :username

However, a subquery solution is widely discouraged due to slowness, and some of these queries will run on every request. So I came up with the following that works in some cases and gets rid of the subquery:

SELECT u.username, u.user_id, u.password, u.email, u.active
    , n.first , n.middle , n.last
    , uo.organization_id  /* , other_cols_from_other_tables, etc. */
FROM user u
    INNER JOIN
    ( SELECT user_id, MAX(latest) utd
        FROM user_org
        GROUP BY user_id
    ) uo1 ON uo1.user_id = u.user_id
    LEFT JOIN user_org uo
        ON (uo.user_id = u.user_id and uo.latest = uo1.utd)
    /* here, other clauses like the part from 'FROM' to here */
WHERE u.username = :username

The latter, unfortunately makes a hard dependence on data in the secondary table, so that the whole query fails if data is lacking in any secondary table for the particular user.

I've researched this on SO and www and there are many solutions for avoiding subqueries, but everything I've found on the subject has the issue in the main query, not in a left join.

The logic I need is "if there's data for this user in this secondary table, get the specified column(s) from the most recent row in that table, otherwise a null".

It seems to me that putting a "current row" marker column on the most recent row in each table would avoid the whole issue and run faster than any other solution, but would be against normalization (I would still have to have the 'latest' column to maintain order-able history of previous data).

Is there a solution that gets normalization + speed? This is mariadb so it needs Mysql syntax.

EDIT: Still would like a better way, but decided to go with the extra column. Now the problem described above is avoided, and the SELECT SQL is much simplified and presumably faster. The downside is adding complexity in saves, but SELECT is more frequent.

Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73
  • Unfortunately mySQL doesn't yet support analytical functions (use a RowNumber limit to 1 ordered by your latest) or cross/Outer apply with a limit 1. There is a way to simulate it similar to what you've done however: https://stackoverflow.com/questions/36869221/cross-outer-apply-in-mysql – xQbert Sep 28 '17 at 15:52
  • `ROW_NUMBER` or `RANK` would be great, but MySQL doesn't support them. `CROSS APPLY` to fetch the last related row would be great, but MySQL doesn't support it. `MAX/MIN KEEP LAST` would be great, but MySQL doesn't support it. So what remains are the approaches you are showing. If it's only one column/value that you need from a table, however, you can use a correlated subquery in the `SELECT` clause with `LIMIT 1`. – Thorsten Kettner Sep 29 '17 at 21:24
  • **EDIT:** You say you are using MariaDB? Why did you tag MySQL then? As of version 10.2 MariaDB supports window functions such as `ROW_NUMBER` or `RANK`. – Thorsten Kettner Sep 29 '17 at 21:29
  • I've changed the tag from MySQL to MariaDB. – Thorsten Kettner Sep 29 '17 at 21:35
  • There is a lot more documentation on Mysql, and Mariadb was supposed to be a drop-in replacement so that it would all apply, so I thought it more helpful to searchers to tag it with 'Mysql'. This feature being different however, suggests that divergence is starting. – stephen_789 Jan 11 '18 at 17:39

1 Answers1

0

MariaDB supports ROW_NUMBER as of version 10.2:

SELECT 
  u.username, 
  u.user_id, 
  u.password, 
  u.email, 
  u.active,
  uo.organization_id,
  ...
FROM user u
LEFT JOIN
(
  select
    user_org.*,
    row_number() over(partition by user_id order by latest desc) as rn
  from user_org
) uo ON uo.user_id = u.user_id AND uo.rn = 1
...
WHERE u.username = :username;
Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73