1
SELECT * FROM or_mail
GROUP BY campaign_id
HAVING date_time = MAX(date_time);

SELECT campaign_id, date_time FROM or_mail
GROUP BY campaign_id
HAVING date_time = MAX(date_time);

The 1st query returns 13 records. The 2nd returns 35.

Why are records missing from the first query!? Why should what I'm selecting matter at all?

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
Farzher
  • 13,934
  • 21
  • 69
  • 100
  • 2
    `HAVING date_time = MAX(date_time)` doesn't mean anything in SQL. It's nonsense. You seem to be trying to do a [tag:greatest-n-per-group] query, so go find one of the many solutions posted on StackOverflow for that type of query. – Bill Karwin Mar 14 '14 at 21:45
  • @BillKarwin how is that a duplicate question. I think my query should be valid, and I really want to know why it doesn't work. Because I think my query is much easier to read, I don't want to use JOINS if I don't have to. – Farzher Mar 14 '14 at 21:51
  • Think of it this way: in a language like PHP or Java or whatever, suppose I write a comparison like `if (arrayvar = max(arrayvar)`. Is this true or false? It's true for a given element of the array, but not for the *whole* array. The comparison doesn't make sense. Likewise it doesn't make sense to do this in SQL, because the `HAVING` clause applies to groups, not to rows. – Bill Karwin Mar 15 '14 at 22:41

2 Answers2

2

This is your query:

SELECT campaign_id, date_time
FROM or_mail
GROUP BY campaign_id
HAVING date_time = MAX(date_time);

You are aggregating by campaign_id. That means that the results will have one row per campaign_id. What date_time goes on the row? Well, an arbitrary value from one of the matching rows. Just one value, an arbitrary one. The same is true of the having clause. In other words, the query does not do what you expect it to do.

Whether you know it or not, you are using a group by extension that is particular to MySQL (you can read the documentation here). The documentation specifically warns against using the extension this say. (There would be no problem if date_time were the same on all rows with the same campaign_id, but that is not the case.)

The following is the query that you actually want:

SELECT campaign_id, date_time
FROM or_mail om
WHERE not exists (select 1
                  from or_mail om2
                  where om2.campaign_id = om.campaign_id and
                        om2.date_time > date_time
                 );

What this says is: Return results from all rows in or_mail with the property that there is no larger date_time with the same campaign_id. HAVING date_time = MAX(date_time);

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

It shouldn't... Did you wait a while before running second query? If so, then a bunch of records could have been created.

cainz
  • 79
  • 1
  • 6
  • I'm absolutely sure it's the query that's doing this, I've tested a lot of variations. If I have to, I'll build a test using sqlfiddle or something. – Farzher Mar 14 '14 at 21:44