0

I have a table of orders for an epos system.

I have several years worth of data and I'd like to find out the average number of transactions per day of the week each year. i.e. which is the busiest day of the week for 2014 vs 2013 etc.

I can get the info out this way:

SELECT ROUND(AVG(sales),1) AS dayaverage, oday, oyear FROM (
SELECT COUNT(oid) AS sales, DAYNAME(odate) AS oday , YEAR(odate) AS oyear
FROM orders 
WHERE ostatus = 'completed'
GROUP BY DATE(odate)
ORDER BY YEAR(odate), DAYOFWEEK(odate)) t1
GROUP BY oday, oyear
ORDER BY oyear DESC, AVG(sales) DESC

Which yeilds the data a want, but not quite in the right format.

I get this:

"dayaverage"    "oday"      "oyear"
"28.9"          "Saturday"  "2015"
"17.1"          "Sunday"    "2015"
"15.0"          "Tuesday"   "2015"
"14.3"          "Monday"    "2015"
"13.1"          "Wednesday" "2015"
"13.0"          "Friday"    "2015"

I have created this statement:

SELECT tsun.pyear, sunday,monday,tuesday, wednesday, thursday, friday, saturday FROM (
SELECT ROUND(AVG(payments),1) AS Sunday, pyear FROM (
SELECT COUNT(paymentid) AS payments, YEAR(DATETIME) AS pyear
FROM payments 
WHERE DAYNAME(DATETIME) = "Sunday"
GROUP BY DATE(DATETIME)
ORDER BY YEAR(DATETIME), DAYOFWEEK(DATETIME)) t1
GROUP BY pyear
ORDER BY pyear DESC, AVG(payments) DESC) AS tsun
LEFT JOIN (SELECT * FROM (SELECT ROUND(AVG(payments),1) AS Monday, pyear FROM (
SELECT COUNT(paymentid) AS payments, YEAR(DATETIME) AS pyear
FROM payments 
WHERE DAYNAME(DATETIME) = "Monday"
GROUP BY DATE(DATETIME)
ORDER BY YEAR(DATETIME), DAYOFWEEK(DATETIME)) t2
GROUP BY pyear
ORDER BY pyear DESC, AVG(payments) DESC) AS tmon) tmon2 ON tsun.pyear = tmon2.pyear
LEFT JOIN (SELECT * FROM (SELECT ROUND(AVG(payments),1) AS Tuesday, pyear FROM (
SELECT COUNT(paymentid) AS payments, YEAR(DATETIME) AS pyear
FROM payments 
WHERE DAYNAME(DATETIME) = "Tuesday"
GROUP BY DATE(DATETIME)
ORDER BY YEAR(DATETIME), DAYOFWEEK(DATETIME)) t2
GROUP BY pyear
ORDER BY pyear DESC, AVG(payments) DESC) AS ttue) ttue2 ON tsun.pyear = ttue2.pyear
LEFT JOIN (SELECT * FROM (SELECT ROUND(AVG(payments),1) AS Wednesday, pyear FROM (
SELECT COUNT(paymentid) AS payments, YEAR(DATETIME) AS pyear
FROM payments 
WHERE DAYNAME(DATETIME) = "Wednesday"
GROUP BY DATE(DATETIME)
ORDER BY YEAR(DATETIME), DAYOFWEEK(DATETIME)) t2
GROUP BY pyear
ORDER BY pyear DESC, AVG(payments) DESC) AS twed) twed2 ON tsun.pyear = twed2.pyear
LEFT JOIN (SELECT * FROM (SELECT ROUND(AVG(payments),1) AS Thursday, pyear FROM (
SELECT COUNT(paymentid) AS payments, YEAR(DATETIME) AS pyear
FROM payments 
WHERE DAYNAME(DATETIME) = "Thursday"
GROUP BY DATE(DATETIME)
ORDER BY YEAR(DATETIME), DAYOFWEEK(DATETIME)) t2
GROUP BY pyear
ORDER BY pyear DESC, AVG(payments) DESC) AS tthu) tthu2 ON tsun.pyear = tthu2.pyear
LEFT JOIN (SELECT * FROM (SELECT ROUND(AVG(payments),1) AS Friday, pyear FROM (
SELECT COUNT(paymentid) AS payments, YEAR(DATETIME) AS pyear
FROM payments 
WHERE DAYNAME(DATETIME) = "Friday"
GROUP BY DATE(DATETIME)
ORDER BY YEAR(DATETIME), DAYOFWEEK(DATETIME)) t2
GROUP BY pyear
ORDER BY pyear DESC, AVG(payments) DESC) AS tfri) tfri2 ON tsun.pyear = tfri2.pyear
LEFT JOIN (SELECT * FROM (SELECT ROUND(AVG(payments),1) AS Saturday, pyear FROM (
SELECT COUNT(paymentid) AS payments, YEAR(DATETIME) AS pyear
FROM payments 
WHERE DAYNAME(DATETIME) = "Saturday"
GROUP BY DATE(DATETIME)
ORDER BY YEAR(DATETIME), DAYOFWEEK(DATETIME)) t2
GROUP BY pyear
ORDER BY pyear DESC, AVG(payments) DESC) AS tsat) tsat2 ON tsun.pyear = tsat2.pyear

Which also yields the data I want in the format I want;

"pyear" "sunday"    "monday"    "tuesday"   "wednesday" "thursday"  "friday"    "saturday"
"2015"  "18.4"  "13.0"  "16.5"  "14.3"  "13.3"  "17.3"  "31.3"
"2014"  "17.8"  "19.0"  "17.6"  "15.3"  "15.5"  "20.2"  "32.0"
"2013"  "3.4"   "3.9"   "3.4"   "3.7"   "3.4"   "4.5"   "6.2"
"2012"  "2.8"   "4.8"   "4.7"   "4.8"   "3.7"   "5.7"   "7.1"
"2011"  "4.0"   "7.1"   "5.6"   "6.2"   "6.6"   "5.4"   "6.2"
"2010"  "3.0"   "5.5"   "5.7"   "5.2"   "5.3"   "4.6"   "6.6"
"2009"  "2.5"   "4.3"   "3.5"   "4.9"   "4.8"   "2.9"   "3.9"

But it's not very elegant is it!

What I would like to know is if there is a better way to do it?

Is there something like the first statement that will yield something like the 2nd one?

I'm sure there is a clever way to do it, like using a case statement but the solution escapes me at the moment. Anyone got any ideas?

Thanks in advance,

Marc


So I got to this, thanks Gordon.

SELECT oyear,
       ROUND(AVG(CASE WHEN oday = 'Monday' THEN sales END), 1) AS Mon,
       ROUND(AVG(CASE WHEN oday = 'Tuesday' THEN sales END), 1) AS Tue,
       ROUND(AVG(CASE WHEN oday = 'Wednesday' THEN sales END), 1) AS Wed,
       ROUND(AVG(CASE WHEN oday = 'Thursday' THEN sales END), 1) AS Thu,
       ROUND(AVG(CASE WHEN oday = 'Friday' THEN sales END), 1) AS Fri,
       ROUND(AVG(CASE WHEN oday = 'Saturday' THEN sales END), 1) AS Sat,
       ROUND(AVG(CASE WHEN oday = 'Sunday' THEN sales END), 1) AS Sun
FROM (SELECT COUNT(oid) AS sales, DAYNAME(odate) AS oday , YEAR(odate) AS oyear
      FROM orders 
      WHERE ostatus = 'completed'
      GROUP BY DATE(odate)
     ) d
GROUP BY oyear
ORDER BY oyear ASC
  • what sort of average? mean, mode or median? – Martin Mar 05 '15 at 13:52
  • Your looking for a mysql pivot table http://stackoverflow.com/questions/7674786/mysql-pivot-table – exussum Mar 05 '15 at 14:00
  • Why not pivot the output from your first query within your application code? It's a pretty trivial task. – eggyal Mar 05 '15 at 14:22
  • Note that none of these approaches gives a true average, since any days on which there are no completed orders will be excluded from the calculation. Imagine, for example, that there was only one Sunday in 2014 on which any orders were completed—but 400 orders were completed that day: the average for Sunday will be 400, yet there were actually 51 other Sundays for which there are no records (and which should, therefore, have been treated as zeroes for the purposes of averaging—but have been excluded by these queries). – eggyal Mar 05 '15 at 14:23
  • You can get a reasonably good proxy to the correct answer by taking the total number of orders and dividing by 52 (on the basis that there are pretty much 52 weeks in a year): of course, that leaves one day (two in a leap year) for which the average will be slightly (circa 2%) overstated, since there will in fact have been 53 of those days during the year under consideration. To be more accurate, you will need to calculate exactly how many of each weekday there have been—it's not too difficult, but does require a little thought. – eggyal Mar 05 '15 at 14:29
  • For example: `COUNT(*)/((DAYOFYEAR('2014-12-31')+WEEKDAY(DATEDIFF('2014-01-01' - INTERVAL ? DAY)) DIV 7)` where `?` represents the day of week under consideration, 1=Monday ... 7=Sunday. – eggyal Mar 05 '15 at 14:59
  • @eggyal I could well do it in the application cade, no problem, like you said a trivial task. It's as much about finding an elegant solution, as I have other applications for an approach like this, and hopefully it will be less processor intensive. I could be wrong, and have been in the past! I'd just like to do it all in mysql that's all. – Marc McHale Mar 05 '15 at 17:27
  • @MarcMcHale: Pivoting data in this way is really just a presentational matter; as such, the most "elegant" solution is to handle it in the presentation layer—not in the database layer. – eggyal Mar 05 '15 at 17:42

1 Answers1

2

Use conditional aggregation:

SELECT ROUND(AVG(sales),1) AS dayaverage, oday, oyear,
       ROUND(AVG(CASE WHEN oday = 'Monday' THEN sales END), 1) as Mon,
       ROUND(AVG(CASE WHEN oday = 'Tuesday' THEN sales END), 1) as Tue,
       ROUND(AVG(CASE WHEN oday = 'Wednesday' THEN sales END), 1) as Wed,
       ROUND(AVG(CASE WHEN oday = 'Thursday' THEN sales END), 1) as Thu,
       ROUND(AVG(CASE WHEN oday = 'Friday' THEN sales END), 1) as Fri,
       ROUND(AVG(CASE WHEN oday = 'Saturday' THEN sales END), 1) as Sat,
       ROUND(AVG(CASE WHEN oday = 'Sun' THEN sales END), 1) as Sun
FROM (SELECT COUNT(oid) AS sales, DAYNAME(odate) AS oday , YEAR(odate) AS oyear
      FROM orders 
      WHERE ostatus = 'completed'
      GROUP BY DATE(odate)
     ) d
GROUP BY oyear
ORDER BY DESC;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Should `oday` really be included in the outer select statement? – jpw Mar 05 '15 at 14:05
  • Now that's what I was after! Needed a bit of tweaking. But the numbers are different between your query and my original query! – Marc McHale Mar 05 '15 at 17:38
  • Sorry, I was having a senior moment. I put it against the payments table, not the orders table! Of course they would be different! Now I put it against the orders table the results are the same. Spot on. Thanks! I have definitely learnt something today. Much appreciated. – Marc McHale Mar 05 '15 at 17:47