0

I have two select statements that take values from two different tables. the select statements look something like this

SELECT year(timestamp) y, month(timestamp) m, count(id) c
FROM table
WHERE clause="foo"
GROUP BY year(timestamp), month(timestamp)

which returns something like

|-y--|-m|c|
|2013|01|9|
|2013|02|9|
|2013|03|9|
|2013|04|9|

.

SELECT year(timestamp) y, month(timestamp) m, count(id) c
FROM table2
WHERE clause="foo"
GROUP BY year(timestamp), month(timestamp)

which returns something like

|-y--|-m|c|
|2013|01|1|
|2013|03|1|
|2013|04|1|

What I'm looking for is joining the two tables based on the timestamp, and subtract the second table from the first. So it should look like:

|-y--|-m|c|
|2013|01|8|
|2013|02|9|
|2013|03|8|
|2013|04|8|

thanks!

phz
  • 4,043
  • 3
  • 15
  • 17

2 Answers2

1

Assuming TableA has your master records, then something like this should work:

SELECT t.y, t.m, t.c - IFNULL(t2.c,0) c
FROM
(
    SELECT year(timestamp) y, month(timestamp) m, count(id) c
    FROM table
    WHERE clause="foo"
    GROUP BY year(timestamp), month(timestamp)
) t
LEFT JOIN (
    SELECT year(timestamp) y, month(timestamp) m, count(id) c
    FROM table2
    WHERE clause="foo"
    GROUP BY year(timestamp), month(timestamp)
) t2 ON t.y = t2.y AND t.m = t2.m
UNION 
SELECT t.y, t.m, (t.c * -1) c
FROM
(
    SELECT year(timestamp) y, month(timestamp) m, count(id) c
    FROM table2
    WHERE clause="foo"
    GROUP BY year(timestamp), month(timestamp)
) t
LEFT JOIN (
    SELECT year(timestamp) y, month(timestamp) m, count(id) c
    FROM table
    WHERE clause="foo"
    GROUP BY year(timestamp), month(timestamp)
) t2 ON t.y = t2.y AND t.m = t2.m
WHERE t2.y IS NULL
ORDER BY y, m

I've updated the answer. The first part of the union returns all records in table with or without those matching in table2, subtracting the table2 count when available. The second part of the union returns all records in table2 that do not exist in table multiplying the count column by -1.

There may be a simpler approach, but without really understanding the tables, this is the best I can offer. And unfortunately MySQL doesn't support FULL OUTER JOINs which would alleviate the need for the UNION.

sgeddes
  • 62,311
  • 6
  • 61
  • 83
  • yeah upon thinking about it a bit more I just realized that some values in table2 may not exist in table1. In this case it needs to be a negative value. how do I go about using a ``union`` in this case? – phz Feb 12 '13 at 22:35
  • @user1484186 -- ok, no prob -- give me a sec and I'll update the answer :) – sgeddes Feb 12 '13 at 22:36
  • why the union; just use a full outer join. and when the evaluate for nulls to 0 in table a – xQbert Feb 12 '13 at 22:47
  • @xQbert -- am I mistaken that MySQL does not support the FULL OUTER JOIN? – sgeddes Feb 12 '13 at 22:50
  • @xQbert -- some more info: http://stackoverflow.com/questions/4796872/full-outer-join-in-mysql -- I completely agree though (as I wrote in my last paragraph above), a full outer join would make this much easier... – sgeddes Feb 12 '13 at 22:51
  • you have a little pesky period mark at the end in ``t.c.`` but other than that it worked great! thanks! – phz Feb 12 '13 at 22:58
  • @user1484186 -- those pesky little things! Nice catch -- edited my answer. Glad I could help :) – sgeddes Feb 12 '13 at 22:59
  • @user1484186 -- btw if this answer did help you, consider marking it as accepted for future users of the site. no worries either way though. best of luck. – sgeddes Feb 12 '13 at 23:29
  • you're right mysql doesn't support full outer joins. This is the accepted norm. Sorry my oracle exuberance got the better of me. – xQbert Feb 13 '13 at 02:56
0

You can use derived tables:

SELECT t1.y, t2.m, t1.c - t2.c AS c
FROM 
  (SELECT year(timestamp) y, month(timestamp) m, count(id) c
   FROM table
   WHERE clause='foo'
   GROUP BY year(timestamp), month(timestamp)) AS t1
LEFT OUTER JOIN
  (SELECT year(timestamp) y, month(timestamp) m, count(id) c
   FROM table2
   WHERE clause='foo'
   GROUP BY year(timestamp), month(timestamp)) AS t2
ON (t1.y, t1.m) = (t2.y, t2.m)

Another solution would be to join the tables and count the distinct rows from each. Assuming id is a primary key column in each table, this could work:

SELECT YEAR(table.timestamp) AS y, MONTH(table.timestamp) AS m,
 COUNT(DISTINCT table.id) - COUNT(DISTINCT table2.id) AS c
FROM table
LEFT OUTER JOIN table2 ON table2.clause = 'foo' AND
  EXTRACT(YEAR_MONTH FROM table.timestamp) =
  EXTRACT(YEAR_MONTH FROM table2.timestamp)
WHERE table.clause='foo'
GROUP BY EXTRACT(YEAR_MONTH FROM table.timestamp)
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828