1

I have a query that I want to give someone permissions to view. The query is pretty basic but along the lines of:

-- In actuality, this is a 60 line SQL query joining 5 tables
SELECT * FROM
<a bunch of joined tables>
GROUP by a.id

Normally this view is queried on an id (which is indexed on the initial table), something like this:

SELECT * FROM ...
WHERE id < 10 AND row_last_modified > '2021-05-14 04:42:47'

This normally executes in about 10ms when running the normal query (against the tables, not the view). However, as soon as I create this view, it seems to do the GROUP BY and aggregate on every single row before doing anything else, so even if I grab one ID, the query still takes about 5 minutes to run, making the view entirely unusable. Is there a way around this in mysql, or do I just have to give the end user access to all those tables to do the query instead of the view?

(I suppose another option would be to create the view without the GROUP BY and just tell the user "You must tack on this GROUP BY a.id for the query to work or else you will get garbage results.")

The only thing I've read to deal with this is the pushdown offered in mysql8: https://dev.mysql.com/doc/refman/8.0/en/derived-condition-pushdown-optimization.html, but maybe I'm missing something here (fingers crossed) and there's an easy way to do this!


Here is the explain plan:

explain SELECT * from global_view where info_id=1;

enter image description here

The 765375 rows scanned on the derived table seems to be the killer.

David542
  • 104,438
  • 178
  • 489
  • 842
  • I assume the query on the tables have a `GROUP BY` as well. – The Impaler May 18 '21 at 03:10
  • @TheImpaler it shouldn't (if the view itself has the `group by id`) -- but yes ultimately every query needs to have that group by. – David542 May 18 '21 at 03:10
  • In the query on the tables, where does the columns in `WHERE id < 10 AND row_last_modified` come from? Both from table `a`? – The Impaler May 18 '21 at 03:11
  • Are all inner joins, or do you have outer joins as well? – The Impaler May 18 '21 at 03:11
  • @TheImpaler `id` is on `a` but `row_last_modified` is on another table. – David542 May 18 '21 at 03:12
  • @TheImpaler most of them are (left) outer joins. – David542 May 18 '21 at 03:12
  • If you are only grouping by `a.id`, then `row_last_modified` is an aggregated column (result from a `SUM()`, `MIN()`, `MAX()`, etc). Is this correct? – The Impaler May 18 '21 at 03:15
  • @TheImpaler hmm. The `row_last_modified` is on an inner join and that's a one-to-one relationship between those two tables, if that makes sense. – David542 May 18 '21 at 03:17
  • 1
    You know the relationship is 1:1 but the optimizer probably considers it a 1:N. I would suggest you retrieve the execution plan of both queries and add them to the question. Also, I suggest you post this question at dba.stackexchange.com (where you can get better answers). – The Impaler May 18 '21 at 03:20
  • @TheImpaler actually, even if I remove the `row_last_modified` and just do `id=5` or whatever, it still takes ~5 minutes to run. Also, updated with `explain`. – David542 May 18 '21 at 03:22
  • Just a note: I can only think the optimizer of MySQL 5.x is not very smart. MySQL 8 is bit more clever, specially the latest versions (8.0.20+). That engine is more sophisticated after it got a boost from the Oracle folks. Maybe that could be an option. – The Impaler May 18 '21 at 03:27
  • Nothing that starts `SELECT *` and ends `GROUP BY` is ever going to go well – Strawberry May 18 '21 at 06:28
  • We *may* be able to give you better advice if you show us the actual definitions of your tables -- with indexes -- and the actual definition of your view. You are using VIEW as a way of restricting access. You might consider using a stored procedure for that, if the VIEW continues to fail you. – O. Jones May 18 '21 at 15:02

0 Answers0