1

I'm trying to find out max/min/avg of process length/number of steps of each recipe and number of wafers using the recipes. The table is large and is recorded by time unit, so there are lots of repeated data such as recipe or recipe_step. Here is my code, pls help me out:

select distinct 
    recipe as recipe_id, 
    max(pl) as max_process_length, 
    min(pl) as min_process_length, 
    avg(pl) as avg_process_length, 
    max(steps) as max_number_of_steps, 
    min(steps) as min_number_of_steps, 
    avg(steps) as avg_number_of_steps, 
    count(wfr_id) as number_of_wafers
into outfile 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/process_id.xls'
from 01m02train 
group by recipe
having (
    max(pl) = (
        select max(pt) 
        from (
            select count(time) as pt 
            from 01m02train 
            group by (recipe,wfr_id)
        )
    ) as `maxpl` 
    and min(pl) = (
        select min(pt) 
        from (
            select count(time) as pt 
            from 01m02train 
            group by (recipe,wfr_id)
        )
    ) as `minpl` 
    and avg(pl) = (
        select avg(pt) 
        from (
            select count(time) as pt 
            from 01m02train 
            group by (recipe,wfr_id)
        )
    ) as `avgpl` 
    and max(steps) = (
        select max(rs) 
        from (
            select distinct count(recipe_step) as rs 
            from 01m02train 
            group by (recipe,wfr_id)
        )
    ) as `maxrs` 
    and min(steps) = (
        select min(rs) 
        from (
            select distinct count(recipe_step) as rs 
            from 01m02train 
            group by (recipe,wfr_id)
        )
    ) as `minrs` 
    and avg(steps) = (
        select avg(rs) 
        from (
            select distinct count(recipe_step) as rs 
            from 01m02train 
            group by (recipe,wfr_id)
        )
    ) as `avgrs`
);
GMB
  • 216,147
  • 25
  • 84
  • 135
  • Simplify the problem (look at [mcve]). Also, well formatted SQL is so much easier to read - and to write. – jarlh Nov 21 '19 at 20:29
  • I reformated your query. As commented by jarlh, please consider doing it yourself next time before posting... – GMB Nov 21 '19 at 20:45

2 Answers2

0

In your having comparing clause you are place too much ( ) and these don't resolve the table name alias for subquery

having ( max(pl)=( <----  this must removed 
    select max(pt) from ( 
        select count(time) as pt 
        from 01m02train 
        group by (recipe,wfr_id)
    ) <---- this must removed 
) as `maxpl` 
and min(pl)=( <----  this must removed 
    select min(pt) from (  
        select count(time) as pt 
        from 01m02train 
        group by (recipe,wfr_id) 
      )<---- this must removed 
  ) as `minpl` and
  ..... 
  ....  

you must removed the corresponding pair of () for the subquery as in sample below

  having ( max(pl)=
    select max(pt) from ( 
        select count(time) as pt 
        from 01m02train 
        group by (recipe,wfr_id)
) as `maxpl` 
and min(pl)=
    select min(pt) from (  <----  this must removed 
        select count(time) as pt 
        from 01m02train 
        group by (recipe,wfr_id) 

) as minpl

ScaisEdge
  • 131,976
  • 10
  • 91
  • 107
0

Primary issue : the error message is quite clear, you need to alias the derived tables that are generated in expressions. On the other hand, aliasing a condition does not make sense. So basically this:

max(pl) = (
    select max(pt) 
    from (
        select count(time) as pt 
        from 01m02train 
        group by (recipe,wfr_id)
    )
) as `maxpl` 

This should be spelled:

max(pl) = (
    select max(pt) 
    from (
        select count(time) as pt 
        from 01m02train 
        group by (recipe,wfr_id)
    ) as `maxpl`
)  

Other issues (they would not generate errors but are still, in my opinion, worth being noticed):

  • no parentheses are needed around the columns enumerated in the group by clause (that's superfluous)

  • same goes for conditions in the having clause (unless you are mixing and and or, which you do not)

  • select distinct with group by does not make sense (both do aggregation, in essence)

Here is an updated version of your query:

select 
    recipe as recipe_id, 
    max(pl) as max_process_length, 
    min(pl) as min_process_length, 
    avg(pl) as avg_process_length, 
    max(steps) as max_number_of_steps, 
    min(steps) as min_number_of_steps, 
    avg(steps) as avg_number_of_steps, 
    count(wfr_id) as number_of_wafers
into outfile 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/process_id.xls'
from 01m02train 
group by recipe
having 
    max(pl) = (
        select max(pt) 
        from (
            select count(time) as pt 
            from 01m02train 
            group by (recipe,wfr_id)
        ) as `maxpl` 
    ) 
    and min(pl) = (
        select min(pt) 
        from (
            select count(time) as pt 
            from 01m02train 
            group by (recipe,wfr_id)
        ) as `minpl` 
    ) 
    and avg(pl) = (
        select avg(pt) 
        from (
            select count(time) as pt 
            from 01m02train 
            group by (recipe,wfr_id)
        ) as `avgpl` 
    ) 
    and max(steps) = (
        select max(rs) 
        from (
            select count(recipe_step) as rs 
            from 01m02train 
            group by (recipe,wfr_id)
        ) as `maxrs` 
    ) 
    and min(steps) = (
        select min(rs) 
        from (
            select count(recipe_step) as rs 
            from 01m02train 
            group by (recipe,wfr_id)
        ) as `minrs`
    )  
    and avg(steps) = (
        select avg(rs) 
        from (
            select count(recipe_step) as rs 
            from 01m02train 
            group by (recipe,wfr_id)
        ) as `avgrs`
    ) 
;
GMB
  • 216,147
  • 25
  • 84
  • 135