0

PLEASE will someone help? I've put HOURS into this silly, stupid problem. This stackoverview post is EXACTLY my question, and I have tried BOTH suggested solutions to no avail.

Here are MY specifics. I have extracted 4 records from my actual database, and excluded no fields:

master_id   date_sent   type    mailing response
00001   2015-02-28 00:00:00 PHONE   NULL    NULL
00001   2015-03-13 14:45:20 EMAIL   ThankYou.html   NULL
00001   2015-03-13 14:34:43 EMAIL   ThankYou.html   NULL
00001   2015-01-11 00:00:00 EMAIL   KS_PREVIEW  TRUE
00001   2015-03-23 21:42:03 EMAIL   MailChimp Update #2 NULL

(sorry about the alignment of the columns.)

I want to get the most recent mailing and date_sent for each master_id. (My extract is of only one master_id to make this post simple.)

So I run this query:

SELECT master_id,date_sent,mailing
FROM contact_copy
WHERE type="EMAIL"

and get the expected result:

master_id   date_sent   mailing         
1   3/13/2015   14:45:20    ThankYou.html       
1   3/13/2015   14:34:43    ThankYou.html       
1   1/11/2015   0:00:00 KS_PREVIEW      
1   3/23/2015   21:42:03    MailChimp   Update  #2

BUT, when I add this simple aggregation to get the most recent date:

SELECT master_id,max(date_sent),mailing
FROM contact_copy
WHERE type="EMAIL"
group BY master_id
;

I get an UNEXPECTED result:

master_id   max(date_sent)  mailing
00001   2015-03-23 21:42:03 ThankYou.html

So my question: why is it returning the WRONG MAILING?

It's making me nuts! Thanks.

By the way, I'm not a developer, so sorry if I'm breaking some etiquette rule of asking. :)

Community
  • 1
  • 1
SteveS
  • 33
  • 1
  • 8

2 Answers2

2

That's because when you use GROUP BY, all the columns have to be aggregate columns, and mailing is not one of them..

You should use a subquery or a join to make it work

SELECT master_id,date_sent,mailing
FROM contact_copy cc
JOIN 
( SELECT master_id,max(date_sent)
  FROM contact_copy
  WHERE type="EMAIL"
  group BY master_id
 ) result
ON cc.master_id= result.master_id AND cc.date_sent=result.date_sent
JustAPup
  • 1,720
  • 12
  • 19
0

You're getting an "unexpected" result because of a MySQL specific extension to the GROUP BY functionality. The result you're getting is actually expected, according to the MySQL Reference Manual.

Ref: https://dev.mysql.com/doc/refman/5.5/en/group-by-handling.html


Other database engines would reject your query as invalid... an error along the lines of "non-aggregate expressions included in the SELECT list not included in the GROUP BY".)

We can get MySQL to behave like other databases (and return an error for that query) if we include ONLY_FULL_GROUP_BY in the SQL mode.

Ref: https://dev.mysql.com/doc/refman/5.5/en/sql-mode.html#sqlmode_only_full_group_by


To get the result you are looking for...

If the (master_id,type,date_sent) tuple is UNIQUE in contact_copy (that is, if for given values of master_id and type, there will be no "duplicate" values of date_sent), we could use a JOIN operation to retrieve the specified result.

First, we write a query to get the "maximum" date_sent for a given master_id and type. For example:

SELECT mc.master_id
     , mc.type
     , MAX(mc.date_sent) AS max_date_sent
  FROM contact_copy mc
 WHERE mc.master_id = '0001' 
   AND mc.type = 'EMAIL'

To retrieve the entire row associated with that "maximum" date_sent, we can use that query as an inline view. That is, wrap the query text in parens, assign an alias, and then reference that as if it were a table, for example:

SELECT c.master_id
     , c.date_sent
     , c.mailing
  FROM ( SELECT mc.master_id
              , mc.type 
              , MAX(mc.date_sent) AS max_date_sent
           FROM contact_copy mc
          WHERE mc.master_id = '0001' 
            AND mc.type = 'EMAIL'
       ) m
  JOIN contact_copy c
    ON c.master_id = m.master_id
   AND c.type      = m.type
   AND c.date_sent = m.max_date_sent

Note that if there are multiple rows that have the same values of master_id,type and date_sent, there is potential to return more than one row. You could add a LIMIT 1 clause to guarantee that you return only one row; which of those rows is returned is indeterminate, without an ORDER BY clause before the LIMIT clause.

spencer7593
  • 106,611
  • 15
  • 112
  • 140