0

As a beginner with SQL, I’m ok to do simple tasks but I’m struggling right now with multiple nested queries. My problem is that I have 3 tables like this:

a Case table:

id   nd    date                  username
--------------------------------------------
1    596   2016-02-09 16:50:03   UserA
2    967   2015-10-09 21:12:23   UserB
3    967   2015-10-09 22:35:40   UserA
4    967   2015-10-09 23:50:31   UserB
5    580   2017-02-09 10:19:43   UserA

a Value table:

case_id   labelValue_id    Value          Type
-------------------------------------------------
1         3633             2731858342     X
1         124              ["864","862"]  X
1         8981             -2.103         X
1         27               443            X
...       ...              ...            ...
2         7890             232478         X
2         765              0.2334         X
...       ...              ...            ...

and a Label table:

id     label
----------------------
3633   Value of W             
124    Value of X   
8981   Value of Y     
27     Value of Z

Obviously, I want to join these tables. So I can do something like this:

SELECT *
from Case, Value, Label
where Case.id= Value.case_id
and Label.id = Value.labelValue_id

but I get pretty much everything whereas I would like to be more specific.

What I want is to do some filtering on the Case table and then use the resulting id's to join the two other tables. I'd like to:

  1. Filter the Case.nd's such that if there is serveral instances of the same nd, take the oldest one,
  2. Limit the number of nd's in the query. For example, I want to be able to join the tables for just 2, 3, 4 etc... different nd.
  3. Use this query to make a join on the Value and Label table.

For example, the output of the queries 1 and 2 would be:

id   nd    date                  username
--------------------------------------------
1    596   2016-02-09 16:50:03   UserA
2    967   2015-10-09 21:12:23   UserB

if I ask for 2 different nd. The nd 967 appears several times but we take the oldest one.

In fact, I think I found out how to do all these things but I can't/don't know how to merge them.

To select the oldest nd, I can do someting like:

select min((date)), nd,id 
    from Case
    group by nd

Then, to limit the number of nd in the output, I found this (based on this and that) :

select *,
@num := if(@type <> t.nd, @num + 1, 1) as row_number,
@type := t.nd as dummy
    from(
        select min((date)), nd,id 
        from Case
        group by nd
        ) as t
        group by t.nd
        having row_number <= 2 -- number of output

It works but I feel it's getting slow.

Finally, when I try to make a join with this subquery and with the two other tables, the processing keeps going on for ever.

During my research, I could find answers for every part of the problem but I can't merge them. Also, for the "counting" problem, where I want to limit the number of nd, I feel it's kind of far-fetch.

I realize this is a long question but I think I miss something and I wanted to give details as much as possible.

Community
  • 1
  • 1
JoKa
  • 1
  • 3

2 Answers2

0

to filter the case table to eliminate all but oldest nds,

select * from [case] c
where date = (Select min(date) from case
              where nd = c.nd)

then just join this to the other tables:

select * from [case] c
   join value v on v.Case_id = c.Id
   join label l on l.Id = v.labelValue_id
where date = (Select min(date) from [case]
              where nd = c.nd)

to limit it to a certain number of records, there is a mysql specific command, I think it called Limit

select * from [case] c
   join value v on v.Case_id = c.Id
   join label l on l.Id = v.labelValue_id
where date = (Select min(date) from [case]
              where nd = c.nd)
Limit 4 -- <=== will limit return result set to 4 rows

if you only want records for the top N values of nd, then the Limit goes on a subquery restricting what values of nd to retrieve:

select * from [case] c
   join value v on v.Case_id = c.Id
   join label l on l.Id = v.labelValue_id
where date = (Select min(date) from [case]
              where nd = c.nd)
   and nd In (select distinct nd from [case]
              order by nd desc Limit N) 
Charles Bretana
  • 143,358
  • 22
  • 150
  • 216
  • Thank you for you answer, the join part helps me. However, for the limit part I think wasn't clear enough. I update the post and the `Value` table. For a `Case.nd` linked with the `Case.id`, I can get like 5000 values in the `Value` table. So, my point is not to limit the number of row (a `Limit 4` allows me to see the first 4 values for a specific nd), but really the number of nd that will be processed. I hope it will make sense. – JoKa Feb 15 '17 at 16:15
  • yes, I understand, then you need another predicate restriction on the case table.... but you have to specify *which* nd values to include then... assuming you want the *highest* n values, I will examine another solution – Charles Bretana Feb 15 '17 at 18:05
  • Thank you! however I get this error "Error Code: 1235. This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'. I'm on SQL V5.7.17. But apparently, we can use JOIN instead of IN. I'm working on it, I think I'm getting close ! – JoKa Feb 16 '17 at 10:29
  • Ok, I think I got it. I just check everything. – JoKa Feb 16 '17 at 10:43
0

So finally, here is what worked well for me:

select *
from (
        select *
        from Case
        join (
            select nd as T_ND, date as T_date
            from Case
            where nd in (select distinct nd from Case)
            group by T_ND Limit 5 -- <========= Limit of nd's
        ) as t
        on Case.nd = t.T_ND
        where date = (select min(date) 
                                from Case
                                where nd = t.T_ND)
    ) as subquery

    join Value 
        on Value.context_id = subquery.id
    join Label
        on Label.id = Value.labelValue_id

Thank you @charlesbretana for leading me on the right track :).

JoKa
  • 1
  • 3