2

I have table with events for user, here are my table columns

events (
  id integer,
  user_id integer,
  event text,
  date timestamp
)

and like to delete rows from table when number of events for user are bigger than 100 and start deleting form oldest rows.


update 1

It is for PostgreSQL

I like to delete all events for any user who currently has >100 events starting from oldest events for user. So like to keep 100 newest events for each user

I try with this query but it running forever and don't do anything

delete
from events as t1
where id IN (
select id from events where t1.user_id = user_id order by date desc offset 100 
)

update 2

I have updated my query and let it run

delete from events as t
where t.id not in (
   select t2.id
   from events t2
   where t2.user_id = t.user_id
   order by t2.date desc
   limit 100
);

But it takes 11min to run :( My table had 15 users, 71931 rows, each user by average had 4795 events

broman
  • 45
  • 1
  • 6
  • What implementation of SQL? What types of column? What does "start deleting form oldest rows" mean; why does the order of deletion matter? Do you want to delete all events for any user who currently has >100 events, or just the events that are older than the newest 100 events?Please show example input and output data to make your question less vague. – underscore_d Apr 16 '18 at 10:32
  • *option 1:* create procedure to insert and delete, *option 2:* write trigger to delete after insert. –  Apr 16 '18 at 10:32

3 Answers3

2

In most versions of SQL, you can do something like this:

delete from t
    where t.date < (select t2.date
                    from t t2
                    where t2.user_id = t.user_id
                    order by t2.date desc
                    offset 99 fetch first 1 row only
                   );

Or:

delete from t
    where t.date not in (select t2.date
                         from t t2
                         where t2.user_id = t.user_id
                         order by t2.date desc
                         fetch first 100 row only
                        );
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
2

If you want to delete the oldest row when the number of events exceeds 100, you can simply use a stored procedure to deal with this condition. A stored procedure can be used simply if it is difficult to compute continually each condition. like below code I would like that you use it as follows.

CREATE PROCEDURE <procedure name>
IF (number of events> = 100)
DELETE FROM events
WHERE id IN (SELECT TOP 1 * id FROM table ORDER BY DueDate)
haramq
  • 161
  • 2
  • 13
1

Use row_number() function first analyze the data and delete them

select * from (
   select *,
        row_number() over (partition by user_id order by date desc) seq 
  from events e
) tt
where seq >= 100

However, the first version has departmental events per user if, you want to analyze all events thenremove partition clause

Yogesh Sharma
  • 49,870
  • 5
  • 26
  • 52