0

For the past 4 hours I've been laser focused on this one problem, in a nut shell, I want to order this table by id in DESC order, grouped by ads_post_id (in DESC order based on id), with a LIMIT of 6 rows returned.

Sample of database,


id   | ads_post_id
---------------------------------------------------------------------------
22   | 983314845117571
23   | 983314845117571
24   | 983314845117571         
104  | 983314845117571
250  | 983314845117571
253  | 983314845117571 
767  | 983314845117571          
---------------------------------------------------------------------------

My current query,

SELECT * FROM fb_ads GROUP BY ads_post_id ORDER BY id DESC LIMIT 6

However all this returns is,


id   | ads_post_id
---------------------------------------------------------------------------
22   | 983314845117571   
---------------------------------------------------------------------------

It should return,


id    | ads_post_id
---------------------------------------------------------------------------
767   | 983314845117571   
---------------------------------------------------------------------------

So clearly it's been grouped in ASC order and then ordered by ID in DESC order right?

So this has led me down a rabbit hole with research, most people seemed to use this as a work around, but it's not preferable because of the performance hit, this query needs to be recalled every time a user goes onto a next page,

SELECT * FROM 
(
select * from fb_ads order by id desc
) as fb_ads
group by ads_post_id
order by id DESC LIMIT 6

HOWEVER, it still didn't work for me, this only returned,

   ---------------------------------------------------------------------------
    id   | ads_post_id
    ---------------------------------------------------------------------------
    22   | 983314845117571   
    ---------------------------------------------------------------------------

PLEASE NOTE: This is a sample of my database for simplicity of answering, in practice there will be thousands of ads_post_id, so as far as I know at this time MYSQL's MAX() function won't work because it only returns one row.

I'm not an expert in MYSQL, but I know enough to get around, I feel this needs a solution outside my scope of expertise.

Some help would go a very long way, thank you.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
Jack Trowbridge
  • 3,175
  • 9
  • 32
  • 56
  • in the future don't spend 4 hours on it before asking – Drew Oct 24 '16 at 05:21
  • What would the desired result look like? What is `LIMIT 6` for? There's only one ads_post_id here so all that stuff about GROUP BY seems irrelevant. – Strawberry Oct 24 '16 at 09:21

4 Answers4

5

You have misunderstood how GROUP BY works in SQL, due to a feature of MySQL. In standard SQL every non aggregate column in the SELECT statement MUST be in the GROUP BY clause (there is an exception for columns whose values are 100% dependent on a column already in the GROUP BY clause, although few flavours of SQL support this exemption).

MySQL does not enforce this by default, but which rows values are used for those columns is not defined. While you might get the one you want, you also might not. And even if you do there is a chance that it will change in the future.

The ordering is independent of the GROUP BY normally, although if you do not specify an ORDER clause then the results will be ordered based on what was required to perform the GROUPing (ie, if it helps to order the rows in one order to do the GROUP BY then MySQL will not bother to reorder the records afterwards unless you specifically tell it to with an ORDER BY clause).

So with your current data, grouping by ads_post_id the value of id that is returned could be 22, 23, 24, 104, 250, 253 or 767. Which one MySQL choses to use is not defined.

With your current data fixing this is trivial as you can just get the MAX id:-

SELECT ads_post_id, MAX(id) 
FROM fb_ads 
GROUP BY ads_post_id 
LIMIT 6

MAX will return 1 row for each GROUPed value.

The normal problem is that people want another column for that row. For example say that each of the rows in your sample data also had an IP address, and you wanted the one that equated to the highest id for the ads_post_id:-

id   | ads_post_id         ip_address
---------------------------------------------------------------------------
22   | 983314845117571     192.168.0.0
23   | 983314845117571     192.168.0.5
24   | 983314845117571     192.168.0.7    
104  | 983314845117571     192.168.0.0
250  | 983314845117571     192.168.0.4
253  | 983314845117571     192.168.0.6
767  | 983314845117571     192.168.0.1     
---------------------------------------------------------------------------

In this case you cannot just use MAX. For example if you tried:-

SELECT ads_post_id, MAX(id), MAX(ip_address) 
FROM fb_ads 
GROUP BY ads_post_id 
LIMIT 6

You would get the following data returned

id   | ads_post_id         ip_address
---------------------------------------------------------------------------
767  | 983314845117571     192.168.0.7     
---------------------------------------------------------------------------

If you tried the following in most flavours of SQL you would get an error. In MySQL with the default settings you would get a result, but which IP address is returned is not defined (and in effect random).

SELECT ads_post_id, MAX(id), ip_address 
FROM fb_ads 
GROUP BY ads_post_id 
LIMIT 6

The solutions to this are either to get the max id for each ads_post_id in a sub query and then joining that back to the table to get the rest of the values:-

SELECT a.ads_post_id,
        a.id,
        a.ip_address
FROM fb_ads a
INNER JOIN
(
    SELECT ads_post_id, MAX(id) AS max_id 
    FROM fb_ads 
    GROUP BY ads_post_id 
) sub0
ON a.ads_post_id = sub0.ads_post_id
AND a.id = sub0.max_id

An alternative is to (ab)use the GROUP_CONCAT aggregate function. GROUP_CONCAT will bring back all the values concatenated together into 1 field, each separated by a , (by default). You can add an ORDER BY clause to force the order they are concatenated into. The you can use SUBSTRING_INDEX to return everything up to the first comma.

This can be useful for simple data, but becomes problematic with text data or fields that max be NULL.

SELECT a.ads_post_id,
        SUBSTRING_INDEX(GROUP_CONCAT(id ORDER BY id DESC), ',', 1),
        SUBSTRING_INDEX(GROUP_CONCAT(ip_address ORDER BY id DESC), ',', 1)
FROM fb_ads 
GROUP BY ads_post_id 
Kickstart
  • 21,403
  • 2
  • 21
  • 33
  • Thanks for that brilliant response, I've went ahead and marked it as the answer as you and it deserves it. I've also added an answer with the exact solution I used, based on the same concepts as you described, but without using a join. Thanks again. – Jack Trowbridge Oct 24 '16 at 11:36
2

You're asking for a limit per group, right? This is not a straightforward task in SQL, so it's no wonder you're having difficulty. It's especially awkward in MySQL, because they lack windowing functions like ROW_NUMBER().

The most common solution for MySQL is to simulate a row-number per group, by incrementing a session variable and resetting to 1 when the group changes value from one row to the next.

SELECT id, ads_post_id
FROM (
    SELECT id, ads_post_id,
      @r := IF(@g=ads_post_id, @r+1, 1) AS row_number,
      @g := ads_post_id
    FROM (SELECT @r:=1, @g:=0) as _init, fb_ads
    ORDER BY ads_post_id, id DESC
) AS t
WHERE t.row_number <= 6;

This type of question comes up frequently, see for example How to SELECT the newest four items per category? which I answered in 2009.

Community
  • 1
  • 1
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • Thanks for the answer Bill, I've went ahead and give your answer a thumbs up, I believe it's on the right lines but if I'm being honest I don't understand the syntax. – Jack Trowbridge Oct 24 '16 at 11:44
  • @JackTrowbridge, start by reading this page: http://dev.mysql.com/doc/refman/5.7/en/user-variables.html – Bill Karwin Oct 24 '16 at 16:52
0

If you want to get max id for each ads_post_id jest get the max(id) with group by ads_post_id instead of ordering.

SELECT max(id), ads_post_id FROM fb_ads GROUP BY ads_post_id LIMIT 6
  • This does not seem to get the results I want, it's not ordering by the most recent id's, this is a must. It must be ordered by ID in DESC order, but grouped by ads_post_id (in DESC based on the DESC ordered ID) – Jack Trowbridge Oct 24 '16 at 05:34
0

@Kickstars answer is very well thought out and answers my question, however I've used a slightly different solution but based on the same concepts.

What I've learn't is to get the results I want, the ORDER BY must be separate from the GROUP BY.

In her example she used a sub query to group ads_post_ids based from the most recent record, and then uses a JOIN to effectively join that data to the rest of the table.

This is using the same concept but without a join, where I'm simply querying data from my main table, but using a WHERE to include my sub query for grouping.

SELECT   *
FROM     fb_ads
WHERE    (id, ads_post_id) IN (
           SELECT   MAX(id), ads_post_id
           FROM     fb_ads
           GROUP BY ads_post_id)
ORDER BY id DESC LIMIT 6
Jack Trowbridge
  • 3,175
  • 9
  • 32
  • 56
  • Hiya. This will work but MySQL can be quite inefficient with using IN . Whether this is an issue will depend on the amount of data the subquery is returning. PS - it is _she_ – Kickstart Oct 24 '16 at 11:42
  • 1
    My apologies! I've edited my answer. I will certainly keep that in mind, right now it's not much of an issue and already I've incorporated this into my PHP and it's used quite a few times, so I'll leave it for now. Thanks again. – Jack Trowbridge Oct 24 '16 at 11:48