I've greatly simplified the examples in the hope of producing a clear enough question that can be answered:
Consider a table of events
CREATE TABLE alertable_events
(
unique_id text NOT NULL DEFAULT ''::text,
generated_on timestamp without time zone NOT NULL DEFAULT now(),
message_text text NOT NULL DEFAULT ''::text,
CONSTRAINT pk_alertable_events PRIMARY KEY (unique_id)
);
with the following data:
COPY alertable_events (unique_id,message_text,generated_on) FROM stdin;
one message one 2014-03-20 06:00:00.000000
two message two 2014-03-21 06:00:00.000000
three message three 2014-03-22 06:00:00.000000
four message four 2014-03-23 06:00:00.000000
five message five 2014-03-24 06:00:00.000000
\.
And for each event, there is a list of fields
CREATE TABLE alertable_event_fields
(
unique_id text NOT NULL DEFAULT ''::text,
field_name text NOT NULL,
field_value text NOT NULL DEFAULT ''::text,
CONSTRAINT pk_alertable_event_fields PRIMARY KEY (unique_id, field_name),
CONSTRAINT fk_alertable_event_fields_0 FOREIGN KEY (unique_id)
REFERENCES alertable_events (unique_id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE
);
with the following data:
COPY alertable_event_fields (unique_id,field_name,field_value) FROM stdin;
one field1 a
one field2 b
two field1 z
two field2 y
three field1 a
three field2 m
four field1 a
four field2 b
five field1 z
five field2 y
\.
I want to define a view that produces the following:
| unique_id | fields | message_text | generated_on | updated_on | count |
| five | z|y | message five | 2014-03-21 06:00:00.000000 | 2014-03-24 06:00:00.000000 | 2 |
| four | a|b | message four | 2014-03-20 06:00:00.000000 | 2014-03-23 06:00:00.000000 | 2 |
| three | a|m | message three | 2014-03-22 06:00:00.000000 | 2014-03-22 06:00:00.000000 | 1 |
Notably:
- fields is a pipe-delimited string (or any other serialization) of the field values (a json encoding of field_name:field_value pairs would be even better ... but I can work with pipe-delimited for now; see the sketch after this list)
- the output is grouped by matching field sets. Update 3/30 12:45am: the values are ordered alphabetically by their field_name, therefore
a|b
would not match b|a
- a count is produced of the events that match that field set. Updated 3/30 12:45am: there can be a different number of fields per unique_id; a match requires matching all of the fields, not just a subset.
- generated_on is the timestamp of the first event
- updated_on is the timestamp of the most recent event
- message_text is the message_text of the most recent event
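For what it's worth, the json encoding I have in mind would look roughly like this (a sketch only; it needs 9.3 for json_agg, and since plain json has no equality operator, the grouping step would still have to compare the text form of the column):

SELECT a.unique_id,
       json_agg(row_to_json((SELECT f FROM (SELECT a.field_name, a.field_value) f))
                ORDER BY a.field_name) AS fields  -- one json object per field, ordered by field_name
  FROM alertable_event_fields a
 GROUP BY a.unique_id;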
I've produced this view, and it works for small data sets; however, as the alertable_events table grows, it becomes exceptionally slow. I can only assume I'm doing something wrong in the view, because I have never dealt with anything quite so ugly.
Update 3/30 12:15PM EDT: It looks like I may have server tuning problems causing these high run-times; see the added explain below for more info. If you see a glaring issue there, I'd be greatly interested in tweaking the server's configuration.
Can anyone piece together a view that handles large datasets well and has a significantly better run time than this? Perhaps using hstore? (I'm running 9.2 preferably, though 9.3 if I can have a nice json encoding of the fields.)
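Along the hstore line, the serialization of the fields I'm picturing is something like the sketch below (it assumes CREATE EXTENSION hstore has been run). Since hstore equality compares key/value pairs, grouping on it shouldn't depend on any explicit ordering:

SELECT a.unique_id,
       hstore(array_agg(a.field_name), array_agg(a.field_value)) AS fields  -- field_name => field_value pairs
  FROM alertable_event_fields a
 GROUP BY a.unique_id;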
Updated 3/30 11:30AM: I'm beginning to think my issue may be server tuning (which means I'll need to talk to the SA). Here's a very simple explain (analyze,buffers) which shows a ridiculous run-time for as few as 8k rows in alertable_event_fields:
Update 3/30 7:20PM: I bumped work_mem to 5MB using SET work_mem='5MB' (which is plenty for the query below); strangely, even though the planner switched to an in-memory quicksort, it actually took on average 100ms longer!
explain (analyze,buffers)
SELECT a.unique_id,
array_to_string(array_agg(a.field_value order by a.field_name),'|') AS "values"
FROM alertable_event_fields a
GROUP BY a.unique_id;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=771.11..892.79 rows=4056 width=80) (actual time=588.679..630.989 rows=4056 loops=1)
Buffers: shared hit=143, temp read=90 written=90
-> Sort (cost=771.11..791.39 rows=8112 width=80) (actual time=588.591..592.622 rows=8112 loops=1)
Sort Key: unique_id
Sort Method: external merge Disk: 712kB
Buffers: shared hit=143, temp read=90 written=90
-> Seq Scan on alertable_event_fields a (cost=0.00..244.40 rows=8112 width=80) (actual time=0.018..5.478 rows=8112 loops=1)
Filter: (message_name = 'LIMIT_STATUS'::text)
Buffers: shared hit=143
Total runtime: 632.323 ms
(10 rows)
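For completeness, the work_mem bump mentioned above was session-level only; a persistent per-role setting would look roughly like the commented line below (some_role is just a placeholder, not my actual role):

SET work_mem = '5MB';                          -- what I ran for the timing test above
-- ALTER ROLE some_role SET work_mem = '5MB';  -- sketch of a persistent per-role setting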
Update 3/30 4:10AM EDT: I'm still not completely satisfied and would be interested in any further optimization. I have a requirement to support 500 msgs/sec steady state, and although most of those should not be "events", I get a little backlogged right now when stress testing.
Update 3/30 12:00PM EDT: Here's my most readable iteration yet; unfortunately, for 4000 rows I'm still looking at 600ms runtimes ... (see above, as the time is mostly spent in the innermost query). Any help here would be greatly appreciated.
CREATE OR REPLACE VIEW views.unduplicated_events AS
SELECT a.unique_id,a.message_text,
b."values",b.generated_on,b.updated_on,b.count
FROM alertable_events a
JOIN (
SELECT b."values",
min(a.generated_on) AS generated_on,
max(a.generated_on) AS updated_on,
count(*) AS count
FROM alertable_events a
JOIN (
SELECT a.unique_id,
array_to_string(array_agg(a.field_value order by a.field_name),'|') AS "values"
FROM alertable_event_fields a
GROUP BY a.unique_id
) b USING (unique_id)
GROUP BY b."values"
) b ON a.generated_on=b.updated_on
ORDER BY updated_on DESC;
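For comparison, here is a restructuring I sketched out but have not benchmarked: it picks the newest event per field set with DISTINCT ON and computes the aggregates as window functions, instead of re-joining alertable_events on the timestamp:

SELECT *
  FROM (
    SELECT DISTINCT ON (f."values")
           e.unique_id,
           f."values",
           e.message_text,
           min(e.generated_on) OVER (PARTITION BY f."values") AS generated_on,
           max(e.generated_on) OVER (PARTITION BY f."values") AS updated_on,
           count(*)            OVER (PARTITION BY f."values") AS count
      FROM alertable_events e
      JOIN (
        SELECT unique_id,
               array_to_string(array_agg(field_value ORDER BY field_name),'|') AS "values"
          FROM alertable_event_fields
         GROUP BY unique_id
      ) f USING (unique_id)
     ORDER BY f."values", e.generated_on DESC   -- newest event first within each field set
  ) d
 ORDER BY updated_on DESC;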
Update 3/30 12:00PM EDT: removed old stuff as this is getting too long.