1

There seem to be already a lot of questions/answers to what I'm experiencing, but I couldn't adapt anything I discovered in order to fix my problem.

A have a courses table, and another views table.

The views table is meant to track page views of all the site's modules, for example:

id table id_row ip created_at
1 courses 57 xx timestamp
2 blog 12 xx timestamp

In our CMS, we are trying to get all the courses to a table, while getting their viewcount, but that viewcount is taking too long to render, as there are hundreds of courses and thousands of views.

Here's a simplified version of my query:

SELECT c.*,
(
    SELECT COUNT(*) FROM t_applications AS a
    WHERE a.id_course = c.id
    GROUP BY a.id_course
) AS applications,
(
    SELECT COUNT(*)
    FROM t_views AS v
    WHERE v.table = 'courses'
    AND v.id_row = c.id
    GROUP BY v.id_row
) as views
FROM t_courses AS c;

I've tried switching it up with JOIN and other techniques, but nothing seems to work.

Would appreciate some insight on how to improve this query's performance, as it's taking over a minute to render.

EDIT:

As per requested by some people, here are my SHOW CREATE TABLE of all 3 tables, again, simplified.

CREATE TABLE `t_courses`
(
    `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
    `title` varchar(255) COLLATE utf8mb4_unicode_520_ci NOT NULL,
    -- other data
    `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
    `updated_at` timestamp NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=516 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci

CREATE TABLE `t_views` (
    `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
    `table` varchar(255) COLLATE utf8mb4_unicode_520_ci NOT NULL,
    `id_row` bigint(20) unsigned NOT NULL,
    `ip` varchar(255) COLLATE utf8mb4_unicode_520_ci NOT NULL,
    `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=47032 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci

CREATE TABLE `t_applications` (
    `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
    `id_course` bigint(20) unsigned NOT NULL,
    -- other data
    `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=4177 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci
Bruno Braga
  • 25
  • 1
  • 7
  • 2
    What's your execute plan and index? – D-Shih Jan 03 '22 at 09:53
  • Historically, explicit joins usually win, hence the established wisdom that joins are better, but optimizers are getting better all the time, and so I prefer to write queries first in a logically coherent way, and then restructure if performance constraints warrant this. – Raushan Kumar Jan 03 '22 at 10:26
  • **You need to show us the table and index definitions**, as well as row counts for each of the tables. Maybe your tables are defined poorly. Maybe the indexes aren't created correctly. Maybe you don't have an index on that column you thought you did. Without seeing the table and index definitions, we can't tell. We need row counts because that can affect query planning. If you know how to do an `EXPLAIN` or get an execution plan, put the results in the question as well. If you have no indexes, visit http://use-the-index-luke.com. – Andy Lester Jan 03 '22 at 19:17
  • @AndyLester I've edited my question, providing the a `SHOW CREATE TABLE` for all 3 tables. You can check the row count by the AUTO_INCREMENT. Thanks for your time. – Bruno Braga Jan 04 '22 at 10:42
  • So you have no indexes besides the primary keys. You need some. Check out https://use-the-index-luke.com to go from there. – Andy Lester Jan 04 '22 at 14:13

2 Answers2

1

Running a subquery in a SELECT clause is often very slow (I'm not sure but I think it runs the subquery for each row of the main query).

Using a JOIN should be more efficient as there would be a single query:

SELECT c.*,
    COUNT(DISTINCT a.id_course) AS applications,
    COUNT(DISTINCT v.id_row) as views
FROM t_courses AS c
    LEFT JOIN t_applications AS a ON c.id = a.id_course
    LEFT JOIN t_views AS v ON c.id = v.id_row
WHERE v.table = 'courses'
GROUP BY c.id;

The use of DISTINCT in the COUNT replaces your GROUP BY in the subqueries and will avoid to count the same t_applications/t_views rows multiple time.
The use of LEFT JOIN (instead of INNER JOIN) ensures that you get every rows from t_courses even if there's no matching row in the joined tables.
The ending GROUP BY is needed as a join will duplicate your t_courses rows (combination of rows between joined tables).


It appears that your tables use MyISAM which does not support foreign keys which means you have none. Unless you really need MyISAM, convert your tables to InnoDB (this question might help) to make your relations validated and indexed.

Then add your foreign key constraints:

ALTER TABLE t_applications
ADD CONSTRAINT fk_applications_courses
    FOREIGN KEY (id_course)
    REFERENCES t_courses(id);

ALTER TABLE t_views
ADD CONSTRAINT fk_views_courses
    FOREIGN KEY (id_row)
    REFERENCES t_courses(id);

The original answer should be definitely faster then. If you still need performance boost, manually create indexes on your columns:

CREATE INDEX idx_applications_courses ON t_applications(id_course);
CREATE INDEX idx_views_courses ON t_views(id_row);
AymDev
  • 6,626
  • 4
  • 29
  • 52
  • In phpMyAdmin, my query was taking 3 seconds to render, and after adjusting to yours, it started taking 40 seconds. Either I'm doing something wrong, or the problem is something else entirely. Maybe something about the indexes like the other gentlemen are speaking of? I frankly know little about these good practices, as I learned SQL "on the go". – Bruno Braga Jan 04 '22 at 11:02
  • @BrunoBraga ouch 40s that's bad ! Well I think that you should use **InnoDB** instead of **MyISAM** and set *foreign key constraints* which will ensure that referenced IDs (like `a.id_course` which refers to a `c.id`) are valid and it will automatically create the indexes. Setting foreign keys will definitely improve the performance. I remember in one of my SQL courses where I showed the students a JOIN query with ~8 tables and thousands of records per table. It took more than 2 minutes without foreign keys and less than a second when set. – AymDev Jan 04 '22 at 12:28
  • @BrunoBraga I updated my answer with a foreign key setup. Make sure to make a copy of your actual database in case something breaks during your try so you don't lose data. If it works, repeat on the real database. – AymDev Jan 04 '22 at 12:38
1

Indexes:

a:  INDEX(id_course)  -- or is it the `PRIMARY KEY`?
v:  INDEX(table, id_row)  --  (in either order)

For further discussion, please provide SHOW CREATE TABLE; we need to see the indexes, datatypes, engine, etc.

Rick James
  • 135,179
  • 13
  • 127
  • 222