1

Sorry if this is a basic question.

I have a table

store-ProdCode
13p      I10x   
13p      I20x   
13p      I30x   
14a      K38z    
17a      K38y    

my data set has nearly 100,000 records.

What I'm trying to do is, for every store find the top 10 prodCode.

I am unsure of how to do this but what I tried was:

select s_code as store, prod_code,count (prod_code)
from top10_secondary
where prod_code is not null 
group by store,prod_code
order by count(prod_code) desc limit 10

this is giving me something completely different and i'm unsure on how I go about achieving my final result.

The expected output should be: for every store(s_code) display the top 10 prodcode (top 10 calculated by the count)

so:

store--prodcode--result
1a      abc       5
1a      abd       4 
1a      xxx       7
--- this will be done until top 10 prodcodes for 1a are done--
2a      dgf       1
2a      ldk       6
--process completes until end of data is reached and top 10 prodcodes are displayed for each store

All help is appreciated. What is the best way to do this?

Thanks

timz
  • 23
  • 1
  • 6
  • I don't think the stated "duplicate" is appropriate because this is an aggregation query. – Gordon Linoff Oct 06 '16 at 21:17
  • thanks for the link, but i fail to see how my question is related as im asking something slightly different. Mine needs to be counted first and then be put in order, no? – timz Oct 06 '16 at 21:18

1 Answers1

1

One method uses row_number(), something like this:

select s.*
from (select s_code as store, prod_code, count(prod_code),
             row_number() over (partition by s_code order by count(prod_code) desc) as seqnum
      from top10_secondary
      where prod_code is not null 
      group by s_code, prod_code
     ) s
where seqnum <= 10;

You can use window functions directly in an aggregation query. The subquery is needed only to reference the sequence number for filtering.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • many thanks for this. When i try run it says store does not exist..ERROR: column "store" does not exist.. it is clearly specified, am i missing something – timz Oct 06 '16 at 21:32
  • Once again thanks for this.. Have modified and got it working. is there anyway I can remove seqnum from appearing in the results? – timz Oct 06 '16 at 21:39
  • @timz . . . Just list the columns instead of `t.*`. I use `*` in answers much more often than I do in real life, simply because listing all the columns can be a distraction. – Gordon Linoff Oct 06 '16 at 23:40
  • Use s_code instead of store. The re-labeling from "s_code" to "store" happens at the end of the query, *after* all the searching, comparing, etc is done. – Jason Oct 07 '16 at 00:57
  • @GordonLinoff thanks for your help. Got it all up and running now and the results are appearing as requested. – timz Oct 07 '16 at 08:18