1

I need to query data from multiple tables, below are the major tables(simplified).

Project
+-----+-------+-------+
| pid | pname | status|      //status: 0 = pending, 1 = complete
+-----+-------+-------+
|  1  | Proj1 |   0   |
|  2  | Proj2 |   1   |
|  3  | Proj3 |   0   |
+-----+-------+-------+

Module
+-----+--------+-------+----------+-----------------+
| mid |   pid  | status| priority |modulecategoryid |
+-----+--------+-------+----------+-----------------+    
|  1  |    1   |   1   |    1     |      1          |  
|  2  |    1   |   0   |    2     |      3          |
|  3  |    3   |   1   |    1     |      1          |
|  4  |    3   |   0   |    2     |      3          |
|  5  |    3   |   0   |    3     |      5          |
+-----+--------+-------+----------+-----------------+

Task
+----+--------+-------+----------+-----------------+
| id |   mid  | status| priority | taskcategoryid  |
+----+--------+-------+----------+-----------------+
| 1  |    2   |   1   |    2     |      2          |
| 2  |    2   |   0   |    1     |      1          |
| 3  |    4   |   1   |    1     |      2          |
| 4  |    4   |   1   |    2     |      3          |
| 5  |    4   |   0   |    3     |      4          |
| 6  |    5   |   0   |    1     |      1          |
+----+--------+-------+----------+-----------------+

I am trying to get the pending tasks for all the pending projects that can be started first based on the module priority and task priority. i.e. for Proj3, module with priority 1 is completed so i should get first priority pending task for module 2.

I need to get the most prior task for each pending project with modulecategoryid and taskcategoryid for get its related info like this

+-----+--------+-----+------------------+----------------+
| pid |   mid  | tid | modulecategoryid | taskcategoryid |
+-----+--------+-----+------------------+----------------+
|  1  |    2   |  2  |         3        |      2         | 
|  2  |    4   |  5  |         3        |      4         | 
+----+---------+-----+------------------+----------------+

I am new to MySql and I have tried query with multiple joins and group it by projectids and min(priority) to get desired result. But columns that are not in group by are fetched randomly from the aggregate.

I have seen this answer SQL Select only rows with Max Value on a Column but that solves the problem for data in only one table.

Shall I get some help on that? I can post my query if needed but it is getting wrong data.

Community
  • 1
  • 1
Mukesh Modhvadiya
  • 2,178
  • 2
  • 27
  • 32
  • If you're still struggling, see http://meta.stackoverflow.com/questions/333952/why-should-i-provide-an-mcve-for-what-seems-to-me-to-be-a-very-simple-sql-query – Strawberry Dec 22 '16 at 12:46

1 Answers1

1

SQL Select only rows with Max Value on a Column has the right approach. You just need to do it twice.

First create a subquery a showing the highest priority task for each module.

Then create a subquery b showing the highest priority Module for each project.

Then join your three tables and two subqueries together.

Here's a. It shows the highest priority Task id for each Module mid. (http://sqlfiddle.com/#!9/7eb1f3/4/0)

SELECT Task.id, Task.mid
  FROM Task
  JOIN (
        SELECT MAX(priority) priority,
               mid
          FROM Task
         WHERE status = 0
             GROUP BY mid
        ) q ON q.priority = Task.priority AND q.mid = Task.mid

Here's b. It works the same way as a and shows the highest priority Module mid for each Project pid. (http://sqlfiddle.com/#!9/7eb1f3/3/0)

SELECT Module.mid, Module.pid
  FROM Module
  JOIN (
        SELECT MAX(priority) priority,
               pid
          FROM Module
         WHERE status = 0
             GROUP BY pid
        ) q ON q.priority = Module.priority AND q.pid = Module.pid

Then you need a big JOIN to pull everything together. In outline it looks like this.

SELECT Project.pid, Project.pname, 
       Module.mid, Task.id tid,
       Module.modulecategoryid, Task.taskcategoryid
  FROM Project
  JOIN (  /* the subquery called b */ 
       ) b ON Project.pid = b.pid 
  JOIN Module ON b.mid = Module.mid
  JOIN (  /* the subquery called a */
       ) a ON Module.mid = a.mid
  JOIN Task ON a.id = Task.id       
 WHERE Task.status = 0

The actual query looks like this, with the subqueries put in. (http://sqlfiddle.com/#!9/7eb1f3/2/0)

SELECT Project.pid, Project.pname, 
       Module.mid, Task.id tid,
       Module.modulecategoryid, Task.taskcategoryid
  FROM Project
  JOIN (   
          SELECT Module.mid, Module.pid
            FROM Module
            JOIN (
                  SELECT MAX(priority) priority, pid
                    FROM Module
                   WHERE status = 0
                   GROUP BY pid
                 ) q   ON q.priority = Module.priority
                      AND q.pid = Module.pid
       ) b ON Project.pid = b.pid 
  JOIN Module ON b.mid = Module.mid
  JOIN ( 
          SELECT Task.id, Task.mid
            FROM Task
            JOIN (
               SELECT MAX(priority) priority, mid
                      FROM Task
                      WHERE status = 0
                      GROUP BY mid
                 ) q    ON q.priority = Task.priority
                       AND q.mid = Task.mid
       ) a ON Module.mid = a.mid
  JOIN Task ON a.id = Task.id       
 WHERE Task.status = 0

The secret to this is understanding that subqueries are virtual tables that you can join to each other or to ordinary tables. The skill you need is sorting out the combination of physical and virtual tables you need, and the join sequence.

Community
  • 1
  • 1
O. Jones
  • 103,626
  • 17
  • 118
  • 172
  • Oh, I am sorry. I missed something in your question. You'll need to use `MIN(priority)` where I used `MAX(priority)` – O. Jones Dec 22 '16 at 13:13
  • Hi @O. Jones, I am really thankful for your efforts to help! I was facing problem when joining the results of module and tasks, but this really looks good. This was simplified schema, so I will give it a try and get back. – Mukesh Modhvadiya Dec 22 '16 at 13:20
  • I have 7 tables to join for the result so I will try it later, but I can see in sqlfiddle this is working as I expected. Such a good answer with explaination. Thanks a lot! +1 & accepted. – Mukesh Modhvadiya Dec 22 '16 at 14:18
  • You're welcome. And welcome to the international association of writers of complex SQL queries. :-) – O. Jones Dec 22 '16 at 14:54