0

Last week I asked a question and incorporated https://stackoverflow.com/a/39149076/6489232.

Below I have put the full query, which does work. But it is incredibly slow (2 minutes for a single date range on the largest database, 20 seconds on the smallest), which is to be expected when doing it the awkward way I have, with a triple select.

Select Sources,
Description,
Start,
Series,
Circulation,
Advert,
COG,
COF,
Orders,
ResponseP,
OrderValue,
AOV,
Marketing,
Ins,
New,
truncate((OrderValue + Ins) - (COG + COF + Marketing + Advert),2) as Profit,
truncate(((OrderValue + Ins) - (COG + COF + Marketing + Advert))/New,2) as CPN 
from
(select Sources as Sources,
 Description as Description,
 Start as Start,
 Series as Series,
 Circulation as Circulation,
 Advert as Advert,
 sum(i.itmcos) as COG,
 truncate(sum(k.stkffc)*i.itmmul,2) as COF,
 max(Orders) as Orders,
 ResponseP as ResponseP,
 max(OrderValue) as OrderValue,
 AOV as AOV,
Marketing as Marketing, 
Ins as Ins,
 New as New
  from (
        select s.soucod as Sources,
      s.soudes as Description,
      o.ordurn as ord,
      s.souexp as Ins,
      s.soudts as Start,
      s.soucir as Circulation,
      s.souser as Series,
      s.souavc as Advert, 
      count(o.ordurn) as Orders,
      sum(o.ordval) as OrderValue,
      truncate((count(o.ordurn)/s.soucir)*100,2) as ResponseP,
      truncate(sum(o.ordval)/count(o.ordurn),2) as AOV,
      truncate(sum(CASE WHEN o.ordpot = 'F' THEN o.ordval ELSE 0 END)*0.32,2) as Marketing,
      sum(CASE WHEN o.CCSCRD = m.CCSCRD then 1 else 0 end) as New
          from source s
          join orders o on o.ordsou = s.soucod
          join member m on o.ORDMEM = m.MEMKEY
      where o.ordrdt = '2016-08-30'
         group by s.soucod
       ) A
   join items i on i.ITmsou = A.Sources
   join stock k on i.itmstk = k.stkcod
 where i.itmrdt = '2016-08-30'
  group by A.Sources) B
  group by B.Sources

I had to add the third "Select" because I couldnt put the fields I defined in the second Select into a calculation in that same select. Below is what I am looking to do to possibly speed this up.

select Sources,
 Description,
 Start,
 Series,
 Circulation,
 Advert,
 sum(i.itmcos) as COG,
 truncate(sum(k.stkffc)*i.itmmul,2) as COF,
 max(Orders) as Orders,
 ResponseP,
 max(OrderValue) as OrderValue,
 AOV,
Marketing, 
Ins,
 New,
 truncate((OrderValue + Ins) - (COG + COF + Marketing + Advert),2) as Profit,
truncate(((OrderValue + Ins) - (COG + COF + Marketing + Advert))/New,2) as CPN 
  from (
        select s.soucod as Sources,
      s.soudes as Description,
      o.ordurn as ord,
      s.souexp as Ins,
      s.soudts as Start,
      s.soucir as Circulation,
      s.souser as Series,
      s.souavc as Advert, 
      count(o.ordurn) as Orders,
      sum(o.ordval) as OrderValue,
      truncate((count(o.ordurn)/s.soucir)*100,2) as ResponseP,
      truncate(sum(o.ordval)/count(o.ordurn),2) as AOV,
      truncate(sum(CASE WHEN o.ordpot = 'F' THEN o.ordval ELSE 0 END)*0.32,2) as Marketing,
      sum(CASE WHEN o.CCSCRD = m.CCSCRD then 1 else 0 end) as New
          from source s
          join orders o on o.ordsou = s.soucod
          join member m on o.ORDMEM = m.MEMKEY
      where o.ordrdt = '2016-08-30'
         group by s.soucod
       ) A
   join items i on i.ITmsou = A.Sources
   join stock k on i.itmstk = k.stkcod
 where i.itmrdt = '2016-08-30'
  group by A.Sources

But this fails due to unknown field, obviously the COF, COG etc fields that were created in that select. So I added the third select to do this a stage higher. Is there a way of doing this without the third select being needed?

Community
  • 1
  • 1
  • If you're using a recent MySQL version, you would probably gain from using generated columns : https://dev.mysql.com/doc/refman/5.7/en/create-table-generated-columns.html. That could avoid some embedded SELECTs. – SolarBear Aug 31 '16 at 14:28
  • indexes is always the first place to look at efficiency. Do you have them, are they on the most useful columns? – Martin Aug 31 '16 at 14:43
  • I'm not wading through all that. Good luck! – Strawberry Sep 01 '16 at 09:13
  • However, the first step is to review all of your tables, and decide what the PRIMARY KEY is in each case. Then review again to see if there's any redundancy. For instance, "22544548" always appears in combination with "ADVERT". So that's a potential redundancy. – Strawberry Sep 01 '16 at 09:26

1 Answers1

-1

I can suggest - in a first step - to create a SQL "view":

CREATE VIEW view_name AS
select s.soucod as Sources,
s.soudes as Descrption,
o.ordurn as ord, (etc...),
o.ordrdt -- add this column intending to use it in where clause later
FROM source s JOIN orders o ON o.ordsou = s.soucod
JOIN member m ON o.ORDMEM = m.MEMKEY
GROUP BY s.soucod;

You can look at this link to know how SQL views can raise performances: MySql views performance

That supposes that all your database must be properly indexed.

Then, your final query could look like this:

SELECT sources, description (etc...)
FROM view_name A
JOIN items i ON i.ITmsou = A.Sources
JOIN stock k ON i.itmstk = k.stkcod
WHERE i.itmrdt = '2016-08-30' AND A.ordrdt = i.itmrdt
GROUP BY A.Sources;

That will execute a "GROUP BY" just one time in that query. The first "GROUP BY" is executed beforehand by the view. Many GROUP BY with "COUNT, SUM..." and all calculated statements into a single query are often slow.

Community
  • 1
  • 1
FragBis
  • 73
  • 6
  • 1
    While this will help with code readability, I do not believe it will help with performance. The engine still has to generate the views as temp tables before it can join them. This may result in reduced performance as the engine isn't able to parse the entire query at once and may lose out on efficiency gains it may otherwise be able to derive from one query. – xQbert Aug 31 '16 at 15:14
  • Yes, the first time the query will be executed, the "temp" table generated will take a little much more time to be generated but as the cache is used, that strategy helps me to get real performance raising. – FragBis Aug 31 '16 at 15:17
  • I'm sure of that, because I'm working right now to a query (in MySQL workbench) that takes 24 seconds with 2 SELECTs into parenthesis Vs 6-7 seconds with the same SELECTs into Views. Then, it is x4 faster in my case. – FragBis Aug 31 '16 at 15:38
  • More precisely, look at that link: http://stackoverflow.com/questions/439056/is-a-view-faster-than-a-simple-query I really agree the top answer – FragBis Sep 01 '16 at 09:14
  • @FragBis The EXPLAIN might be more informative. I suspect that the query cache is befuddling the result – Strawberry Sep 01 '16 at 09:15
  • @strawberry: your suspicion are just suspicions. I give links that you can read and that I won't paraphrase (too long answers are unreadable). I can't provide the exact performance improvement for Jack Bloomfield's issue because I'm not connected to his database, and I won't rewrite the absolute query he needs for his own work. Using good INDEXES and VIEWS can really help performance improvements when there are many JOINS and aggregated data. It also helps readability. AND views can be used as objects in POO. So many advantages. – FragBis Sep 01 '16 at 09:30
  • It can also help to handle security and rights to tables (that you access only via views)... Disadvantages are: more memory spaces used, too many views, functions and procedures can result on too complicated databases but here it is a question of management. – FragBis Sep 01 '16 at 09:36
  • if it is an INDEXED (aka materialized) view (this aspect was/is unclear in your answer), [which is a completely different animal than a view](http://aboutsqlserver.com/2011/03/24/indexed-materialized-views-in-microsoft-sql-server/) then I could say yes the performance gain is potentially there. However, without knowing how often the underlying tables change data, the indexed view may actually be more of a burden in the [form of locks and when aggregation is in use](https://www.simple-talk.com/sql/learn-sql-server/sql-server-indexed-views-the-basics/). – xQbert Sep 01 '16 at 12:49
  • I always use basic SQL views and most of time, it improves performance. Sometimes REALLY highly. Sometimes, the gain is not so great (most of time because tuning the query in a simplest logical way by reducing the joins and/or by executing some PHP code to organize my results: only the context of the problem can tell you that), but I never experienced a performance loss by using views instead of SELECTS into parenthesis. Maybe it could, but then, in very few context that for this problems, it's worth it to test views creation and use it. – FragBis Sep 01 '16 at 13:20
  • @xQbert: Only Jack Bloomfield can tell me that for his issue, if he tested it. Not you. I wonder why he doesn't give any feed back – FragBis Sep 01 '16 at 13:21
  • I never use views only to replace a single "SELECT id FROM table_name". It is recommended for nested queries. And using indexed views can come in a second step. – FragBis Sep 01 '16 at 13:28
  • But for MySQL, there is no "indexed (materialized) views" from my knowledge – FragBis Sep 01 '16 at 16:13