3

I'm building an activity stream, so old messages fall off the end and will never get seen again. So, I'd like to regularly delete old messages. However, I don't want it to look like a user has no activity, so I want to leave a minimum of X posts in each stream.

Simplified, my "messages" table has a "mid" (message ID, primary key), a "uid" (user ID), a "created" date (UNIX timestamp), and a "message." So what I want is to do something like this:

DELETE FROM messages
WHERE created < ? AND mid NOT IN (
    SELECT mid FROM messages m GROUP BY uid HAVING mid > (
        SELECT mid FROM messages WHERE uid = m.uid ORDER BY mid LIMIT 1 OFFSET ?
    )
)

However, when I try the query inside NOT IN() by itself, it returns no results. Additionally, I don't think it works to SELECT from the same table you're deleting from.

I'm okay with 2 queries if necessary, but I hope it isn't necessary.

I tried searching around and there are a bunch of results on how to delete all but the last record, but I didn't find anything about deleting all but the last N records.

IceCreamYou
  • 1,852
  • 14
  • 25

2 Answers2

2

Look at this question for a [mysql] solution to the "top n per group" problem:

How to SELECT the newest four items per category?

Once you have the set consisting of all rows in the "top n per group" (that's what you want to keep). You just have to delete those rows not in that set:

delete x
where x.mid not in ( select mid
                     -- where top n per group criteria
                   )

Here's another link that looks at the same problem:

http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/

Community
  • 1
  • 1
Nicholas Carey
  • 71,308
  • 16
  • 93
  • 135
1

This simulates sql servers windowed function row number it numbers each message by date. Then only deletes messages with a number greater than 10. You might need to flip the equality between l.created >= r.created.

delete from messages
where mid in 
    (select  l.mid from (  
    select l.mid,l.created, count(*) as num
    from messages as l
    left outer join messages as r
        on l.mid = r.mid
        and l.created >= r.created 
    group by l.mid, l.created) a 
    where a.num > 10    
   and l.created < ?);
JStead
  • 1,710
  • 11
  • 12