I am working on a data analytics Dashboard for a media content broadcasting company. Even if a user clicks a certain channel, logs/records are stored into MySQL DB. Following is the table that stores data regarding channel play times.
Here is the table structure:
_____________________________________
| ID INT(11) |
_____________________________________
| Channel_ID INT(11) |
_____________________________________
| playing_date (DATE) |
_____________________________________
| country_code VARCHAR(50) |
_____________________________________
| playtime_in_sec INT(11) |
_____________________________________
| count_more_then_30_min_play INT(11) |
_____________________________________
| count_15_30_min_play INT(11) |
_____________________________________
| count_0_15_min_play |
_____________________________________
| channel_report_tag VARCHAR(50) |
_____________________________________
| device_report_tag VARCHAR(50) |
_____________________________________
| genre_report_tag VARCHAR(50) |
_____________________________________
The Query that I run behind one of the dashboard graphs construction is :
SELECT
channel_report_tag,
SUM(count_more_then_30_min_play) AS '>30 minutes',
SUM(count_15_30_min_play) AS '15-30 Minutes',
SUM(count_0_15_min_play) AS '0-15 Minutes'
FROM
channel_play_times_cleaned
WHERE
playing_date BETWEEN '' AND ''
AND country_code LIKE ''
AND device_report_tag LIKE ''
AND channel_report_tag LIKE ''
GROUP BY
channel_report_tag
LIMIT 10
This query basically is taking a lot of time to return the result set (given the table data exceeds a million records per day and increasing every second ). I came across this stack-overflow Question : What generic techniques can be applied to optimize SQL queries? which basically mentions employing indices as one the techniques to optimize SQL queries. At the moment I am confused how to apply indices (i.e on what columns) in order to optimize the above mentioned query. I would be very grateful if some one could offer help in creating indices according to my specific scenario. Any other expert opinion for a beginner like me are surely welcomed.
EDIT :
As suggested by @Thomas G ,
I have tried to improve my query and make it more specific :
SELECT
channel_report_tag,
SUM(count_more_then_30_min_play) AS '>30 minutes',
SUM(count_15_30_min_play) AS '15-30 Minutes',
SUM(count_0_15_min_play) AS '0-15 Minutes'
FROM
channel_play_times_cleaned
WHERE
playing_date BETWEEN '' AND ''
AND country_code = 'US'
AND device_report_tag = 'j8'
AND channel_report_tag = 'NAT GEO'
GROUP BY
channel_report_tag
LIMIT 10