-1

I need to find median of total scores region wise. I have got the solution after trial and error on data but the query is not in a optimized way. i need a efficient mysql query for this problem.

#Thanks for the solutions#

Edit: first exam has to be filter from assessment table and second total_score needs to be summed for all subject of each student using studentassessment table. Then finally region wise median needs to be calculated.

database schema

SELECT region,
      Avg(total_score) AS median
FROM   (SELECT row_num,
              region,
              total_score,
              region_cnt
       FROM   (SELECT Row_number()
                        OVER (
                          partition BY region
                          ORDER BY total_score) AS row_num,
                      region,
                      total_score,
                      Count(region)
                        OVER (
                          partition BY region)  AS region_cnt
               FROM   (SELECT i.region     AS region,
                              Sum(S.score) AS total_score
                       FROM   tredence.assesment A
                              INNER JOIN tredence.studentassessment S
                                      ON A.id_assessment = S.id_assessment
                              INNER JOIN tredence.studentinfo i
                                      ON i.id_student = S.id_student
                       WHERE  A.assessment = 'Exam'
                       GROUP  BY S.id_student,
                                 i.region
                       ORDER  BY region,
                                 total_score) t) r
       GROUP  BY 1,
                 2,
                 3,
                 4
       HAVING row_num IN ( Floor(region_cnt / 2), Ceil(region_cnt / 2) )) z
GROUP  BY region
ORDER  BY median DESC ```

tables and columns:
|Assessments    |student_info|student_assessment|
|---------------|------------|------------------|
|course_code    |course_code |id_assessment     |
|batch_code     |batch_code  |id_student        |
|id_assessments |id_student  |date_submitted    |
|assessment_type|gender      |is_banked         |
|date           |region      |score             |

Output:
|Region       |Median|
|-------------|------|
|North Region | 82   |
|London Region| 80   |
|Scotland     | 80   |
|Ireland      | 76   |
shoaib sipai
  • 155
  • 2
  • 10
  • 1
    Does this answer your question? [Simple way to calculate median with MySQL](https://stackoverflow.com/questions/1291152/simple-way-to-calculate-median-with-mysql) – nbk Jul 04 '21 at 16:26

1 Answers1

1

Assuming you reduce the set to the following. Note: id_student isn't required at this point in the calculation.

CREATE TABLE tscores (
    id          int primary key auto_increment
  , region      int
  , id_student  int
  , total_score int
  , index (region, total_score)
);

INSERT INTO tscores (region, id_student, total_score) VALUES
    (1, 1000, 40)
  , (1, 1001, 50)
  , (1, 1002, 30)
  , (1, 1003, 90)
  , (2, 1101, 50)
  , (2, 1102, 51)
  , (2, 1103, 55)
;

SQL and Result:

WITH cte1 AS (
        SELECT region, total_score
             , ((COUNT(*) OVER (PARTITION BY region) + 1) / 2) AS n
             , ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_score) AS rn
          FROM tscores AS t
     )
SELECT region
     , truncate(AVG(total_score), 2) AS med_score
  FROM cte1 AS t
 WHERE rn IN (ceil(n), floor(n))
 GROUP BY region
;

+--------+-----------+
| region | med_score |
+--------+-----------+
|      1 |     45.00 |
|      2 |     51.00 |
+--------+-----------+
2 rows in set (0.004 sec)

Still not quite enough detail. But here's SQL that runs against your schema, minus the typos I think you had in your SQL:

WITH tscores AS (
                     SELECT i.region     AS region
                          , Sum(S.score) AS total_score
                       FROM tredence.assessments A
                       JOIN tredence.studentassessment S
                         ON A.id_assessment = S.id_assessment
                       JOIN tredence.studentinfo i
                         ON i.id_student = S.id_student
                      WHERE A.assessment = 'Exam'
                      GROUP BY S.id_student
                             , i.region
     )
   , cte1 AS (
        SELECT region, total_score
             , ((COUNT(*) OVER (PARTITION BY region) + 1) / 2) AS n
             , ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_score) AS rn
          FROM tscores AS t
     )
SELECT region
     , truncate(AVG(total_score), 2) AS med_score
  FROM cte1 AS t
 WHERE rn IN (ceil(n), floor(n))
 GROUP BY region
;
Jon Armstrong
  • 4,654
  • 2
  • 12
  • 14
  • 1
    When asking a performance related question, it's usually a good idea to provide the exact schema (create table statements), along with all existing indexes, plus the explain plan result for the SQL in question. Some test data (in the form of insert statements) would be helpful as well. I don't have enough points to comment directly to the poster. :) – Jon Armstrong Jul 04 '21 at 16:23
  • i have added the database schema. i think now it will be easier to get to solution. please check. – shoaib sipai Jul 05 '21 at 07:15