SELECT
  COUNT(CASE WHEN VALUE = 1 THEN 1 END) AS score_1,
  COUNT(CASE WHEN VALUE = 2 THEN 1 END) AS score_2,
  COUNT(CASE WHEN VALUE = 3 THEN 1 END) AS score_3,
  COUNT(CASE WHEN VALUE = 4 THEN 1 END) AS score_4,
  COUNT(CASE WHEN VALUE = 5 THEN 1 END) AS score_5,
  COUNT(CASE WHEN VALUE = 6 THEN 1 END) AS score_6,
  COUNT(CASE WHEN VALUE = 7 THEN 1 END) AS score_7,
  COUNT(CASE WHEN VALUE = 8 THEN 1 END) AS score_8,
  COUNT(CASE WHEN VALUE = 9 THEN 1 END) AS score_9,
  COUNT(CASE WHEN VALUE = 10 THEN 1 END) AS score_10
FROM
  `answers`
WHERE
`created_at` BETWEEN '2017-01-01 00:00:00' AND '2019-11-30 23:59:59' 

Is there a way to optimize this query, because I have 4 million answer records in my DB, and it runs very slowly?

Hovo Asatryan
  • What's very slowly? Do you have an index on created_at? And I suspect sum would be better than count. – P.Salmon Feb 19 '20 at 13:16
  • Use `SELECT COUNT() GROUP BY VALUE`, then pivot it using CASEs; a sketch follows these comments. – Akina Feb 19 '20 at 13:18
  • Unlikely that the `CASE` statement is the problem here. If indexes are lacking, your request involves a full table scan, that is 4 million records. Add `explain` in front of your statement to see the execution plan. Also, the resultset may be heavy in terms of data since your query spans almost 3 years. And the data has to travel over the network to reach your front-end. – Kate Feb 19 '20 at 16:08
  • No index is likely to help on any variation of this query. – Rick James Feb 19 '20 at 20:13
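A sketch of Akina's suggestion, assuming the same `answers` table and `value` column as in the question: let MySQL aggregate once per value (at most 10 groups), then pivot those counts into columns.

-- Aggregate first, then pivot; ELSE 0 keeps a score at 0 when a value is absent in the range.
SELECT
  SUM(CASE WHEN `value` = 1  THEN ct ELSE 0 END) AS score_1,
  SUM(CASE WHEN `value` = 2  THEN ct ELSE 0 END) AS score_2,
  SUM(CASE WHEN `value` = 3  THEN ct ELSE 0 END) AS score_3,
  SUM(CASE WHEN `value` = 4  THEN ct ELSE 0 END) AS score_4,
  SUM(CASE WHEN `value` = 5  THEN ct ELSE 0 END) AS score_5,
  SUM(CASE WHEN `value` = 6  THEN ct ELSE 0 END) AS score_6,
  SUM(CASE WHEN `value` = 7  THEN ct ELSE 0 END) AS score_7,
  SUM(CASE WHEN `value` = 8  THEN ct ELSE 0 END) AS score_8,
  SUM(CASE WHEN `value` = 9  THEN ct ELSE 0 END) AS score_9,
  SUM(CASE WHEN `value` = 10 THEN ct ELSE 0 END) AS score_10
FROM (
  SELECT `value`, COUNT(*) AS ct
  FROM `answers`
  WHERE `created_at` BETWEEN '2017-01-01 00:00:00' AND '2019-11-30 23:59:59'
  GROUP BY `value`
) AS per_value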

3 Answers


Try running this one time to create an index:

CREATE INDEX ix_ca ON answers(created_at)

That should speed your query up. If you are curious about why, see here:

What is an index in SQL?

mikeb
  • That index _probably won't_ be used -- assuming that a lot of the table is in that date range, hence it would be faster to ignore the index and simply scan the table. (For a _short_ time span, the index would help a lot.) – Rick James Feb 19 '20 at 20:24

You could try adding a redundant composite index:

CREATE INDEX idx1 ON answers(created_at, value)

With this covering index, the query can be resolved from the index content alone, without accessing the table data.
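One way to verify that (a sketch, once the composite index exists): prefix the query with EXPLAIN and check the Extra column; "Using index" means the scan is index-only. Two of the ten score columns are enough to see the plan, since they reference the same columns as the full query:

EXPLAIN
SELECT
  COUNT(CASE WHEN VALUE = 1 THEN 1 END) AS score_1,
  COUNT(CASE WHEN VALUE = 10 THEN 1 END) AS score_10
FROM `answers`
WHERE `created_at` BETWEEN '2017-01-01 00:00:00' AND '2019-11-30 23:59:59'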

ScaisEdge

Want it to be 10 times as fast? Use the Data Warehousing technique of building and maintaining a "Summary table". In this example the summary table might be:

CREATE TABLE subtotals (
    dy DATE NOT NULL,
    `value` ... NOT NULL,   -- TINYINT UNSIGNED ?
    ct SMALLINT UNSIGNED NOT NULL, -- this is 2 bytes, max 65K; change if might be bigger
    PRIMARY KEY(value, dy)  -- or perhaps the opposite order
) ENGINE=InnoDB

Each night you summarize the day's data and insert 10 new rows (one per value) into subtotals.
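A minimal sketch of that nightly job, assuming it runs shortly after midnight and that created_at records when the answer was inserted:

-- Roll up yesterday's answers into one row per value.
INSERT INTO subtotals (dy, `value`, ct)
SELECT DATE(created_at), `value`, COUNT(*)
FROM answers
WHERE created_at >= CURDATE() - INTERVAL 1 DAY
  AND created_at <  CURDATE()
GROUP BY DATE(created_at), `value`

Running the same SELECT once over the full table, without the date filter, back-fills the historical rows.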

Then the "report" query becomes

SELECT
  SUM(CASE WHEN VALUE = 1 THEN ct END) AS score_1,
  SUM(CASE WHEN VALUE = 2 THEN ct END) AS score_2,
  SUM(CASE WHEN VALUE = 3 THEN ct END) AS score_3,
  SUM(CASE WHEN VALUE = 4 THEN ct END) AS score_4,
  SUM(CASE WHEN VALUE = 5 THEN ct END) AS score_5,
  SUM(CASE WHEN VALUE = 6 THEN ct END) AS score_6,
  SUM(CASE WHEN VALUE = 7 THEN ct END) AS score_7,
  SUM(CASE WHEN VALUE = 8 THEN ct END) AS score_8,
  SUM(CASE WHEN VALUE = 9 THEN ct END) AS score_9,
  SUM(CASE WHEN VALUE = 10 THEN ct END) AS score_10
FROM
  `subtotals`
WHERE `dy` >= '2017-01-01'
  AND `dy`  < '2019-12-01'

Based on what you have provided, there will be about 10K rows in subtotals (roughly 1,065 days × 10 values); that's a lot less to wade through than 4M rows. It might run more than 10 times as fast.

More discussion: http://mysql.rjweb.org/doc.php/summarytables

Rick James