SQL - Difficult Query

Question

I want to do a query where for each diagnostic code - ID (this is a column), select the name of the most common medication - p_name (another column) used to treat that condition i.e., the medication name that more often appears associated to prescriptions (table) for that diagnosis.

This is the structure of my prescription table:

| p_name | lab | doctor_VAT | date_timestamp | ID  | dosage | prescription_description |

I started by making a query to count the tuple pairs p_name and ID:

SELECT DISTINCT p.ID, 
p.p_name,
COUNT(*) OVER (PARTITION BY p.p_name, p.ID) AS Cnt
FROM prescription AS p
ORDER BY Cnt DESC

And then to this I tried to apply the "greatest_n_per_group" problem to this (SQL select only rows with max value on a column):

FROM   (SELECT DISTINCT p.ID, 
       p.p_name,
       COUNT(*) OVER (PARTITION BY p.p_name, p.ID) AS Cnt
       FROM prescription AS p
       ORDER BY Cnt DESC) as Tabela
INNER JOIN(
        SELECT Tabela2.ID, MAX(Tabela2.Cnt)
        FROM (SELECT DISTINCT p.ID, 
              p.p_name,
              COUNT(*) OVER (PARTITION BY p.p_name, p.ID) AS Cnt
              FROM prescription AS p
              ORDER BY Cnt DESC) as Tabela2
        GROUP BY Tabela2.ID
)

But this produces errors, am I going at this the right the way? do you suggest a different method?

Which version of MySQL do you use? Windows function (partition by) are supported from version 8. If you use 5.6, 5.7 or older version, it won't works. — DanB, Nov 08 '19 at 19:37
My desired output is what I said in the first paragraph, I have different ID values in the and I want for each of them to say which p_name appears the most. — JoNeedsCHelp, Nov 08 '19 at 19:40
On a side note: There should be a table for diagnostic codes of course, and the diagnostic code ID in the prescription table should then be called something like id_diagnosticcode. Never call a column `ID` when it's not the table's unique ID; that's just misleading and will degrade the queries' readablility a lot. — Thorsten Kettner, Nov 08 '19 at 19:57
@ThorstenKettner. As SQL is declarative in nature meaning you define *"what you want to have"* and not *"how to get it"* when writting `SELECT` , the *"how to get it"* part is defined in the RDMS engine which read the table structure (might be a bit over-simplified).. Pretty sure you know that as that is not the point for making this comment mine point is that without table structure you can only make a educated *"guess"* what the meaning of a query possibly can be... Don't understand me wrong i do agree making usefull names for table and column names to make it a little bit more senseable... — Raymond Nijland, Nov 08 '19 at 20:58
.. As it would be wierd called something `unique_id` and the data shows it isn't unique or table structures does not enforce unique ...that said, Topicstarter can you execute a `SHOW CREATE TABLE prescription` and post those results here.. And follow the instructions in [Why should I provide a Minimal Reproducible Example for a very simple SQL query?](https://meta.stackoverflow.com/questions/333952/why-should-i-provide-a-minimal-reproducible-example-for-a-very-simple-sql-query) — Raymond Nijland, Nov 08 '19 at 20:58

Ankit Bajpai · Answer 1 · 2019-11-08T19:52:18.043

1

Since your MySQL supports Window function, You can simply use -

SELECT ID, p_name
FROM (SELECT ID, p_name, RANK() OVER(PARTITION BY ID ORDER BY CNT DESC) RNK
      FROM (SELECT ID, 
                   p_name,
                   COUNT(*) CNT
            FROM prescription
            GROUP BY ID, 
                     p_name
           ) T1
     ) T2
WHERE RNK = 1

edited Nov 08 '19 at 19:52

answered Nov 08 '19 at 19:47

Ankit Bajpai

13,128
4
25
40

Luuk · Answer 2 · 2019-11-09T11:51:06.110

0

SELECT 
   p1.ID,
   (SELECT TOP 1 p2.p_name 
    FROM prescription AS p2 
    WHERE p2.ID=p1.ID
    GROUP BY p2.p_name
    ORDER BY count(*) DESC) as MostUsed
FROM prescription AS p1
GROUP BY p1.ID

Above is for MSSQL, below is for MySQL

SELECT 
   p1.ID,
   (SELECT p2.p_name 
    FROM prescription AS p2 
    WHERE p2.ID=p1.ID
    GROUP BY p2.p_name
    ORDER BY count(*) DESC
    LIMIT 1) as MostUsed
FROM prescription AS p1
GROUP BY p1.ID

dbfiddle

edited Nov 09 '19 at 11:51

answered Nov 08 '19 at 19:38

Luuk

12,245
5
22
33

Top doesn't support by MySQL. – Ankit Bajpai Nov 08 '19 at 19:41

score 0 · Answer 3 · answered Nov 08 '19 at 19:38

You need FROM ( ) T in this case T is the table name alias for the FROM subquery clause

      FROM   (SELECT DISTINCT p.ID, 
       p.p_name,
       COUNT(*) OVER (PARTITION BY p.p_name, p.ID) AS Cnt
       FROM prescription AS p
       ORDER BY Cnt DESC) as Tabela
    INNER JOIN(
        SELECT Tabela2.ID, MAX(Tabela2.Cnt)
        FROM (SELECT DISTINCT p.ID, 
              p.p_name,
              COUNT(*) OVER (PARTITION BY p.p_name, p.ID) AS Cnt
              FROM prescription AS p
              ORDER BY Cnt DESC) as Tabela2
        GROUP BY Tabela2.ID
    ) T

SQL - Difficult Query

3 Answers3