1

I'm collecting transfer data from my router; it supplies daily, monthly, and bi-minutely (an interval of 120 seconds) summaries. If I reboot the router in the middle of a day (and therefore the middle of a month), those reports will be incomplete. But, I'll still have the interval data and can sum the records from before and after the boot.

Previously, I've performed this rollup action with a script after updating the three separate tables. That's slow as I have to query for the sum for a given day of intervals and then the months intervals. It'd be faster to just perform the rollup when the interval is updated. So I put together a trigger. The problem is that the way I've written the trigger, it updates the daily and monthly rows on each interval insert. Ideally, this would only happen once per transaction and process the rows that had just been added.

So, below is a contained example that shows what I have. The count column is just there to illustrate that the rollup occurs more than it needs to. Is there a way to streamline this at all?

BEGIN TRANSACTION;

CREATE TABLE monthly  (count INTEGER DEFAULT 0, date DATE NOT NULL, interface TEXT NOT NULL, upload INTEGER DEFAULT 0, download INTEGER DEFAULT 0, rollup_upload INTEGER DEFAULT 0, rollup_download INTEGER DEFAULT 0, PRIMARY KEY (date, interface));
CREATE TABLE daily    (count INTEGER DEFAULT 0, date DATE NOT NULL, interface TEXT NOT NULL, upload INTEGER DEFAULT 0, download INTEGER DEFAULT 0, rollup_upload INTEGER DEFAULT 0, rollup_download INTEGER DEFAULT 0, PRIMARY KEY (date, interface));
CREATE TABLE interval (count INTEGER DEFAULT 0, date DATE NOT NULL, interface TEXT NOT NULL, upload INTEGER DEFAULT 0, download INTEGER DEFAULT 0, interval INTEGER, PRIMARY KEY (date, interface));

CREATE TRIGGER rollup_interval_trigger AFTER INSERT ON interval
BEGIN

INSERT OR REPLACE INTO daily (count, date, interface, upload, download, rollup_upload, rollup_download)
SELECT
    COALESCE((SELECT count FROM daily WHERE date IS strftime('%Y-%m-%d', NEW.date, 'localtime') AND interface IS 'wan'),0)+1,
    strftime('%Y-%m-%d', NEW.date, 'localtime'),
    'wan',
    COALESCE((SELECT upload FROM daily WHERE date IS strftime('%Y-%m-%d', NEW.date, 'localtime') AND interface IS 'wan'),0),
    COALESCE((SELECT download FROM daily WHERE date IS strftime('%Y-%m-%d', NEW.date, 'localtime') AND interface IS 'wan'),0),
    sum(upload) as rollup_upload,
    sum(download) as rollup_download
FROM interval
WHERE strftime('%Y-%m-%d', date, 'localtime') = strftime('%Y-%m-%d', NEW.date, 'localtime') AND interface IS 'vlan2';

INSERT OR REPLACE INTO monthly (count, date, interface, upload, download, rollup_upload, rollup_download)
SELECT
    COALESCE((SELECT count FROM monthly WHERE date IS strftime('%Y-%m-01', NEW.date, 'localtime') AND interface IS 'wan'),0)+1,
    strftime('%Y-%m-01', NEW.date, 'localtime'),
    'wan',
    COALESCE((SELECT upload FROM monthly WHERE date IS strftime('%Y-%m-01', NEW.date, 'localtime') AND interface IS 'wan'),0),
    COALESCE((SELECT download FROM monthly WHERE date IS strftime('%Y-%m-01', NEW.date, 'localtime') AND interface IS 'wan'),0),
    sum(upload) as rollup_upload,
    sum(download) as rollup_download
FROM interval
WHERE strftime('%Y-%m', date, 'localtime') = strftime('%Y-%m', NEW.date, 'localtime') AND interface IS 'vlan2';

END;

COMMIT;

insert into daily (date, interface, download, upload) values ('2012-10-02', 'wan', 10, 20);
insert into monthly (date, interface, download, upload) values ('2012-10-01', 'wan', 30, 40);

.headers ON

select * from daily;
select * from monthly;

begin transaction;
insert into interval (date, interval, download, upload, interface) values ('2012-10-02 11:00:00', 120, 10, 20, 'vlan2');
insert into interval (date, interval, download, upload, interface) values ('2012-10-02 12:00:00', 120, 10, 20, 'vlan2');
insert into interval (date, interval, download, upload, interface) values ('2012-10-02 13:00:00', 120, 10, 20, 'vlan2');
insert into interval (date, interval, download, upload, interface) values ('2012-10-02 14:00:00', 120, 10, 20, 'vlan2');
insert into interval (date, interval, download, upload, interface) values ('2012-10-01 12:00:00', 120, 10, 20, 'vlan2');
insert into interval (date, interval, download, upload, interface) values ('2012-10-03 12:00:00', 120, 10, 20, 'vlan2');
commit;

select * from interval;

select * from daily;
select * from monthly;
fracai
  • 307
  • 2
  • 14

1 Answers1

1

What about increasing the counts of daily and monthly just by the new numbers just inserted into interval? Then you would not aggregate on every insert. All you have is two lookups of one row by primary key and an update of that row.

CREATE TRIGGER rollup_interval_trigger AFTER INSERT ON interval
BEGIN
    INSERT INTO daily 
    SELECT 0, strftime('%Y-%m-%d', NEW.date, 'localtime'), 'wan',0,0,0,0 
    WHERE NOT EXISTS (
      SELECT 1 
      FROM daily 
      WHERE date = strftime('%Y-%m-%d', NEW.date, 'localtime') and interface = 'wan');

    UPDATE daily 
    SET count = count + 1,
        rollup_upload = rollup_upload + new.upload ,
        rollup_download = rollup_download + new.download 
    WHERE date = strftime('%Y-%m-%d', NEW.date, 'localtime') and interface = 'wan';

    -- and similarly for table monthly
END;

Since sqlite does not have an 'upsert' statement, you would need to have a separate statement to create the daily/monthly records when they are not there yet before you update.

This gives the same results as your trigger for your test data.

PS

You could furthermore replace the INSERT ... WHERE NOT EXISTS by INSERT OR IGNORE for maximum performance if you want (I don't like it too much since it catches all conflicts, not just unique constraints, and might, in general, hide other errors).

Fabian
  • 2,822
  • 1
  • 17
  • 22
  • OK, so instead of summing, just increase the rollup value. I'll have to test this to get an idea on the performance, but I like it. It does have the disadvantage of potentially overcounting if I insert, then remove, and insert the same interval. That's an edge case, unlikely, and could be handled by a delete trigger. I could also have an update trigger that would perform the rollup that I started with. – fracai Dec 04 '13 at 15:13
  • Regarding the UPSERT, that's what my INSERT OR REPLACE is supposed to handle. Is there a difference between that and the INSERT WHERE NOT EXISTS / UPDATE or INSERT OR IGNORE that you suggest? – fracai Dec 04 '13 at 15:14
  • Am I correct in thinking that the UPSERT (INSERT WHEN NOT EXISTS / UPDATE) vs. INSERT OR REPLACE is that the INSERT OR REPLACE requires specifying all fields and thus additional queries, while your suggestion leaves the unchanged data alone? – fracai Dec 04 '13 at 15:21
  • Hah, final question I think. What sort of other conflicts would be caught by the IGNORE? Wrong type? Invalid date? Invalid columns? I might look into this option as well. Specifically, something like ABORT, which I think would backout the current statement, but let the rest of the transaction occur. I think this would also report the error so I know which rows from the interval to reconsider. Thanks. – fracai Dec 04 '13 at 15:29
  • Gah, I don't want INSERT OR ABORT as that wouldn't get to the UPDATE. I'll have to look into the possible conflict cases. – fracai Dec 04 '13 at 15:31
  • Last one, I swear! I'll end up with three triggers, probably more; one for each table. The daily and monthly triggers sum the interval rows to the rollup fields and turn the insert into a replace if the row is already there from an interval rollup. The interval trigger will increment the daily and monthly rollups as in your post. If daily / monthly goes in first, there likely won't be any interval data, but if there is it'll be summed. If the intervals are first, new rows for daily and monthly will be created and then updated when those rows are inserted (triggered to a replace) later. Thanks! – fracai Dec 04 '13 at 16:08
  • Wow, a lot of comments ;) Regarding UPSERT vs INSERT OR REPLACE that is nicely discussed [here](http://stackoverflow.com/questions/418898/sqlite-upsert-not-insert-or-replace/4330694#4330694). It is not the same. Regarding IGNORE: It would also, for example, silently omit the insert if NOT NULL constraints or CHECK constraints are violated. That cannot happen in your case, but this is why I prefer not to use it. – Fabian Dec 04 '13 at 16:37
  • Heh, I kept doing more research :) Thanks for the reminder about that other SO question. I actually used that to put together my original INSERT OR REPLACE statement. I may end up using the IGNORE; as you say, in my case I don't think I need to worry about it. I just need to make sure that my parser is generating the right data. Thanks again. – fracai Dec 04 '13 at 17:32