
The Problem:

I have time-related data in my database and I am struggling to organize, structure and index it in a way that lets users retrieve it efficiently; even simple database queries take longer than acceptable.

Project Context:

While this is a pure database question, some context might help to understand the data model:

The project centers around doing research on a big, complex machine. I don't know a lot about the machine itself, but rumour in the lab has it there's a flux capacitor in there somewhere - and I think yesterday, I spotted the tail of Schrödinger's cat hanging out of it at the side ;-)

We measure many different parameters while the machine is running, using sensors positioned all over the machine at different measurement points (so-called spots), at certain intervals over a period of time. We use not just one device to measure these parameters, but a whole range of them; they differ in the quality of their measurement data (this involves sample rates, sensor quality, price and many other aspects that I'm not concerned with); one aim of the project is actually to establish a comparison between these devices. You can visualize these measurement devices as a bunch of lab trolleys, each with a lot of cables connected to the machine, each delivering measurement data.

The Data Model:

There is measurement data from every spot and every device for every parameter, for example once a minute over a period of 6 days. My job is to store that data in a database and to provide efficient access to it.

In a nutshell:

  • a device has a unique name
  • a parameter also has a name; they're not unique though, so it also has an ID
  • a spot has an ID

The project database is more complex of course, but these details don't seem relevant to the issue.

  • a measurement data index has an ID, a time stamp for when the measurement was done and references to the device and the spot on which the measurement was carried out
  • a measurement data value has a reference to the parameter and to the value that was actually measured

Initially, I had modeled the measurement data value to have its own ID as primary key; the n:m relationship between measurement data index and value was a separate table that only stored index:value ID pairs. As that table itself consumed quite a lot of hard drive space, we eliminated it and changed the value ID to a simple integer that stores the ID of the measurement data index it belongs to; the primary key of the measurement data value is now composed of that ID and the parameter ID.
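
For illustration, the original layout looked roughly like this (reconstructed; table and constraint names here are not the actual project names, and it assumes the measurement_data_index table from the DDL below):

-- reconstructed sketch of the original, strictly 3NF layout (hypothetical names)
CREATE TABLE measurement_data_value_v1
(
  id SERIAL,
  fk_parameter_id INTEGER NOT NULL REFERENCES parameter (id),
  value VARCHAR(16) NOT NULL,
  CONSTRAINT measurement_data_value_v1_pk PRIMARY KEY (id)
);

-- the n:m link table that was later eliminated to save space
CREATE TABLE measurement_index_2_value
(
  fk_index_id INTEGER NOT NULL REFERENCES measurement_data_index (id),
  fk_value_id INTEGER NOT NULL REFERENCES measurement_data_value_v1 (id),
  CONSTRAINT measurement_index_2_value_pk PRIMARY KEY (fk_index_id, fk_value_id)
);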

On a side note: When I created the data model, I carefully followed common design guidelines like 3NF and appropriate table constraints (such as unique keys); another rule of thumb was to create an index for every foreign key. I have a suspicion that the deviation in the measurement data index / value tables from 'strict' 3NF might be one of the reasons for the performance issues I am looking at now, but changing the data model back has not solved the problem.

The Data Model in DDL:

NOTE: There is an update to this code further below.

The script below creates the database and all tables involved. Please note that there are no explicit indexes yet. Before you run this, please make sure you don't happen to already have a database called so_test with any valuable data...

\c postgres
DROP DATABASE IF EXISTS so_test;
CREATE DATABASE so_test;
\c so_test

CREATE TABLE device
(
  name VARCHAR(16) NOT NULL,
  CONSTRAINT device_pk PRIMARY KEY (name)
);

CREATE TABLE parameter
(
  -- must have ID as names are not unique
  id SERIAL,
  name VARCHAR(64) NOT NULL,
  CONSTRAINT parameter_pk PRIMARY KEY (id)
);

CREATE TABLE spot
(
  id SERIAL,
  CONSTRAINT spot_pk PRIMARY KEY (id)
);

CREATE TABLE measurement_data_index
(
  id SERIAL,
  fk_device_name VARCHAR(16) NOT NULL,
  fk_spot_id INTEGER NOT NULL,
  t_stamp TIMESTAMP NOT NULL,
  CONSTRAINT measurement_pk PRIMARY KEY (id),
  CONSTRAINT measurement_data_index_fk_2_device FOREIGN KEY (fk_device_name)
    REFERENCES device (name) MATCH FULL
    ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT measurement_data_index_fk_2_spot FOREIGN KEY (fk_spot_id)
    REFERENCES spot (id) MATCH FULL
    ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT measurement_data_index_uk_all_cols UNIQUE (fk_device_name, fk_spot_id, t_stamp)
);

CREATE TABLE measurement_data_value
(
  id INTEGER NOT NULL,
  fk_parameter_id INTEGER NOT NULL,
  value VARCHAR(16) NOT NULL,
  CONSTRAINT measurement_data_value_pk PRIMARY KEY (id, fk_parameter_id),
  CONSTRAINT measurement_data_value_fk_2_parameter FOREIGN KEY (fk_parameter_id)
    REFERENCES parameter (id) MATCH FULL
    ON UPDATE NO ACTION ON DELETE NO ACTION
);

I have also created a script to fill the table with some test data:

CREATE OR REPLACE FUNCTION insert_data()
RETURNS VOID
LANGUAGE plpgsql
AS
$BODY$
  DECLARE
    t_stamp  TIMESTAMP := '2012-01-01 00:00:00';
    index_id INTEGER;
    param_id INTEGER;
    dev_name VARCHAR(16);
    value    VARCHAR(16);
  BEGIN
    FOR dev IN 1..5
    LOOP
      INSERT INTO device (name) VALUES ('dev_' || to_char(dev, 'FM00'));
    END LOOP;
    FOR param IN 1..20
    LOOP
      INSERT INTO parameter (name) VALUES ('param_' || to_char(param, 'FM00'));
    END LOOP;
    FOR spot IN 1..10
    LOOP
      INSERT INTO spot (id) VALUES (spot);
    END LOOP;

    WHILE t_stamp < '2012-01-07 00:00:00'
    LOOP
      FOR dev IN 1..5
      LOOP
        dev_name := 'dev_' || to_char(dev, 'FM00');
        FOR spot IN 1..10
        LOOP
          INSERT INTO measurement_data_index
            (fk_device_name, fk_spot_id, t_stamp)
            VALUES (dev_name, spot, t_stamp) RETURNING id INTO index_id;
          FOR param IN 1..20
          LOOP
            SELECT id INTO param_id FROM parameter
              WHERE name = 'param_' || to_char(param, 'FM00');
            value := 'd'  || to_char(dev,   'FM00')
                  || '_s' || to_char(spot,  'FM00')
                  || '_p' || to_char(param, 'FM00');
            INSERT INTO measurement_data_value (id, fk_parameter_id, value)
              VALUES (index_id, param_id, value);
          END LOOP;
        END LOOP;
      END LOOP;
      t_stamp := t_stamp + '1 minute'::INTERVAL;
    END LOOP;

  END;
$BODY$;

SELECT insert_data();

The PostgreSQL query planner requires up-to-date statistics, so analyze all tables. Vacuuming might not be required, but do it anyway:

VACUUM ANALYZE device;
VACUUM ANALYZE measurement_data_index;
VACUUM ANALYZE measurement_data_value;
VACUUM ANALYZE parameter;
VACUUM ANALYZE spot;

A Sample Query:

If I now run a really simple query to e.g. obtain all values for a certain parameter, it already takes a couple of seconds, although the database is not very large yet:

EXPLAIN (ANALYZE ON, BUFFERS ON)
SELECT measurement_data_value.value
  FROM measurement_data_value, parameter
 WHERE measurement_data_value.fk_parameter_id = parameter.id
   AND parameter.name = 'param_01';

Example result on my development machine (please see below for some details on my environment):

                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1.26..178153.26 rows=432000 width=12) (actual time=0.046..2281.281 rows=432000 loops=1)
   Hash Cond: (measurement_data_value.fk_parameter_id = parameter.id)
   Buffers: shared hit=55035
   ->  Seq Scan on measurement_data_value  (cost=0.00..141432.00 rows=8640000 width=16) (actual time=0.004..963.999 rows=8640000 loops=1)
         Buffers: shared hit=55032
   ->  Hash  (cost=1.25..1.25 rows=1 width=4) (actual time=0.010..0.010 rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 1kB
         Buffers: shared hit=1
         ->  Seq Scan on parameter  (cost=0.00..1.25 rows=1 width=4) (actual time=0.004..0.008 rows=1 loops=1)
               Filter: ((name)::text = 'param_01'::text)
               Buffers: shared hit=1
 Total runtime: 2313.615 ms
(12 rows)

There are no indexes in the database apart from the implicit ones, so it's not surprising the planner does sequential scans only. If I follow what seems to be a rule of thumb and add btree indexes for every foreign key like

CREATE INDEX measurement_data_index_idx_fk_device_name
    ON measurement_data_index (fk_device_name);
CREATE INDEX measurement_data_index_idx_fk_spot_id
    ON measurement_data_index (fk_spot_id);
CREATE INDEX measurement_data_value_idx_fk_parameter_id
    ON measurement_data_value (fk_parameter_id);

then do another vacuum analyze (just to be safe) and re-run the query, the planner uses bitmap heap and bitmap index scans and the total query time somewhat improves:

                                                                                   QUERY PLAN                                                                                   
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=8089.19..72842.42 rows=431999 width=12) (actual time=66.773..1336.517 rows=432000 loops=1)
   Buffers: shared hit=55033 read=1184
   ->  Seq Scan on parameter  (cost=0.00..1.25 rows=1 width=4) (actual time=0.005..0.012 rows=1 loops=1)
         Filter: ((name)::text = 'param_01'::text)
         Buffers: shared hit=1
   ->  Bitmap Heap Scan on measurement_data_value  (cost=8089.19..67441.18 rows=431999 width=16) (actual time=66.762..1237.488 rows=432000 loops=1)
         Recheck Cond: (fk_parameter_id = parameter.id)
         Buffers: shared hit=55032 read=1184
         ->  Bitmap Index Scan on measurement_data_value_idx_fk_parameter_id  (cost=0.00..7981.19 rows=431999 width=0) (actual time=65.222..65.222 rows=432000 loops=1)
               Index Cond: (fk_parameter_id = parameter.id)
               Buffers: shared read=1184
 Total runtime: 1371.716 ms
(12 rows)

However, this is still more than a second of execution time for a really simple query.

What I have done so far:

  • got myself a copy of PostgreSQL 9.0 High Performance - great book!
  • did some basic PostgreSQL server configuration, see environment below
  • created a framework to run a series of performance tests using real queries from the project and to display the results graphically; these queries use devices, spots, parameters and a time interval as input parameters, and the test series run over e.g. 5, 10 devices, 5, 10 spots, 5, 10, 15, 20 parameters and 1..7 days. The basic result is that they're all too slow, but their query plans were way too complex for me to understand, so I went back to the really simple query used above.

I have looked into partitioning the value table. The data is time-related and partitioning seems an appropriate means to organize that kind of data; even the examples in the PostgreSQL documentation use something similar. However, I read in the same article:

The benefits will normally be worthwhile only when a table would otherwise be very large. The exact point at which a table will benefit from partitioning depends on the application, although a rule of thumb is that the size of the table should exceed the physical memory of the database server.

The entire test database is less than 1GB in size and I am running my tests on a development machine with 8GB of RAM and on a virtual machine with 1GB (see also environment below), so the table is far from being very large or even exceeding the physical memory. I might implement partitioning anyway at some stage, but I have a feeling that approach does not target the performance problem itself.
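
Just to illustrate what I looked at: a minimal sketch of inheritance-based partitioning (the mechanism PostgreSQL 9.1 offers), with one hypothetical child table per day of the index table; this is not project code, only an illustration:

-- child table holding one day of index rows
CREATE TABLE measurement_data_index_2012_01_01
(
  CHECK (t_stamp >= DATE '2012-01-01' AND t_stamp < DATE '2012-01-02')
) INHERITS (measurement_data_index);

-- trigger function routing INSERTs to the matching child
CREATE OR REPLACE FUNCTION measurement_data_index_insert()
RETURNS TRIGGER
LANGUAGE plpgsql
AS
$BODY$
  BEGIN
    IF NEW.t_stamp >= DATE '2012-01-01' AND NEW.t_stamp < DATE '2012-01-02' THEN
      INSERT INTO measurement_data_index_2012_01_01 VALUES (NEW.*);
    ELSE
      RAISE EXCEPTION 'no partition for t_stamp %', NEW.t_stamp;
    END IF;
    RETURN NULL; -- row already routed, suppress insert into the parent
  END;
$BODY$;

CREATE TRIGGER measurement_data_index_insert_trg
  BEFORE INSERT ON measurement_data_index
  FOR EACH ROW EXECUTE PROCEDURE measurement_data_index_insert();

-- constraint_exclusion (default 'partition') lets the planner skip
-- children whose CHECK constraint excludes the queried time range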

Furthermore, I am considering clustering the value table. I dislike the fact that clustering must be re-done whenever new data is inserted and that it furthermore requires an exclusive read/write lock, but looking at this SO question, it seems clustering has its benefits anyway and might be an option. However, clustering is done on a single index, and as there are up to 4 selection criteria going into a query (devices, spots, parameters and time), I would have to create clusters for all of them - which in turn gives me the impression that I'm simply not creating the right indexes...
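
For illustration, clustering the value table on the parameter index created above would look like this (it would have to be repeated after bulk inserts):

CLUSTER measurement_data_value USING measurement_data_value_idx_fk_parameter_id;
ANALYZE measurement_data_value;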

My Environment:

  • development is taking place on a MacBook Pro (mid-2009) with a dual-core CPU and 8GB of RAM
  • I am running database performance tests on a virtual Debian 6.0 machine with 1GB of RAM, hosted on the MBP
  • PostgreSQL version is 9.1 as that was the latest version when I installed it, upgrading to 9.2 would be possible
  • I have changed shared_buffers from the default 1600kB to 25% of RAM on both machines as recommended in the PostgreSQL docs (which involved enlarging kernel settings like SHMALL, SHMMAX, etc.)
  • similarly, I have changed effective_cache_size from the default 128MB to 50% of the RAM available
  • I ran performance test with different work_mem settings, but did not see any major difference in performance
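
The values actually in effect are easy to double-check from psql:

SHOW shared_buffers;
SHOW effective_cache_size;
SHOW work_mem;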

NOTE: One aspect that I believe is important is that the performance test series with real queries from the project do not differ performance-wise between the MacBook with 8GB and the virtual machine with 1GB; i.e. if a query takes 10s on the MacBook, it also takes 10s on the VM. Also, I ran the same performance tests before and after changing shared_buffers, effective_cache_size and work_mem, and the configuration changes did not improve performance by more than 10%; some results in fact even got worse, so it seems any difference is caused by test variation rather than by configuration changes. These observations lead me to believe that RAM and postgresql.conf settings are not the limiting factors here yet.

My Questions:

I don't know if different or additional indexes would speed up the query and if they did, which ones to create. Looking at the size of the database and how simple my query is, I have the impression there is something fundamentally wrong about my data model or how I have chosen my indexes so far.

Does anyone have advice on how to structure and index my time-related data to improve query performance?

Asked more broadly, is tuning query performance

  • usually done 'on an incident basis', i.e. once a query does not perform satisfactorily? It seems all my queries are too slow...
  • mainly a question of looking at (and understanding) query plans, then adding indexes and measuring whether things improved, possibly accelerating the process by applying one's experience?

How do I get this database to fly?


Update 01:

Looking at the responses so far, I think I have not explained the need for the measurement data index / value tables properly, so let me try again. Storage space is the issue here.

NOTE:

  • the figures used here are for illustrative purposes and meant for comparison only, i.e. the numbers themselves are not relevant; what matters is the percentage difference in storage requirements between using a single table vs. using an index and a value table
  • PostgreSQL data type storage sizes are documented in this chapter
  • this makes no claim to be scientifically correct, e.g. the units are probably mathematically bogus; the numbers should add up though

Assuming

  • 1 day of measurements
  • 1 set of measurements per minute
  • 10 devices
  • 10 parameters
  • 10 spots

This adds up to

1 meas/min x 60 min/hour x 24 hour/day = 1440 meas/day

Each measurement has data from every spot and every device for every parameter, so

10 spots x 10 devices x 10 parameters = 1000 data sets/meas

So in total

1440 meas/day x 1000 data sets/meas = 1 440 000 data sets/day

If we store all measurements in a single table as Catcall suggested, e.g.

CREATE TABLE measurement_data
(
  device_name character varying(16) NOT NULL,
  spot_id integer NOT NULL,
  parameter_id integer NOT NULL,
  t_stamp timestamp without time zone NOT NULL,
  value character varying(16) NOT NULL,
  -- constraints...
);

a single row would add up to

17 + 4 + 4 + 8 + 17 = 50 bytes/row

in the worst case where all varchar fields are fully filled. This amounts to

50 bytes/row x 1 440 000 rows/day = 72 000 000 bytes/day

or ~69 MB per day.

While this does not sound like a lot, the storage space requirement in the real database would be prohibitive (again, the numbers used here are only for illustration). We have therefore split measurement data into an index and a value table as explained earlier in the question:

CREATE TABLE measurement_data_index
(
  id SERIAL,
  fk_device_name VARCHAR(16) NOT NULL,
  fk_spot_id INTEGER NOT NULL,
  t_stamp TIMESTAMP NOT NULL,
  -- constraints...
);

CREATE TABLE measurement_data_value
(
  id INTEGER NOT NULL,
  fk_parameter_id INTEGER NOT NULL,
  value VARCHAR(16) NOT NULL,
  -- constraints...
);

where the ID of a value row is equal to the ID of the index it belongs to.

The sizes of a row in the index and value tables are

index: 4 + 17 + 4 + 8 = 33 bytes
value: 4 + 4 + 17     = 25 bytes

(again, worst case scenario). The total amount of rows is

index: 10 devices x 10 spots x 1440 meas/day =   144 000 rows/day
value: 10 parameters x 144 000 rows/day      = 1 440 000 rows/day

so the total is

index: 33 bytes/row x   144 000 rows/day =  4 752 000 bytes/day
value: 25 bytes/row x 1 440 000 rows/day = 36 000 000 bytes/day
total:                                   = 40 752 000 bytes/day

or ~39 MB per day - as opposed to ~69 MB for a single table solution.
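
As Erwin Brandstetter points out in the comments, these back-of-the-envelope figures ignore per-row overhead (the 23-byte heap tuple header, alignment padding), so actual on-disk sizes are better measured directly, e.g.:

SELECT pg_size_pretty(pg_total_relation_size('measurement_data_index')) AS index_table,
       pg_size_pretty(pg_total_relation_size('measurement_data_value')) AS value_table;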


Update 02 (re: wildplasser's response):

This question is getting pretty long as it is, so I was considering updating the code in place in the original question above, but I think it might help to have both the first and the improved solutions in here to better see the differences.

Changes compared to the original approach (somewhat in order of importance):

  • swap timestamp and parameter, i.e. move the t_stamp field from the measurement_data_index table to measurement_data_value and move the fk_parameter_id field from the value to the index table: With this change, all fields in the index table are constant and new measurement data is written to the value table only. I did not expect any major query performance improvement from this (I was wrong), but I feel it makes the measurement data index concept clearer. While it requires fractionally more storage space (according to some rather crude estimate), having a 'static' index table might also help in deployment when tablespaces are moved to different hard drives according to their read/write requirements.
  • use a surrogate key in the device table: From what I understand, a surrogate key is a primary key that is not strictly required from a database design point of view (e.g. the device name is already unique, so it could also serve as PK), but might help to improve query performance. I added it because, again, I feel it makes the concept clearer if the index table references IDs only (instead of a mix of names and IDs).
  • rewrite insert_data(): Use generate_series() instead of nested FOR loops; makes the code much 'snappier'.
  • As a side effect of these changes, inserting test data takes only about 50% of the time required by the first solution.
  • I did not add the view as wildplasser suggested; there's no backward compatibility required.
  • Additional indexes for the FKs in the index table seem to be ignored by the query planner and have no impact on query plan or performance.


\c postgres
DROP DATABASE IF EXISTS so_test_03;
CREATE DATABASE so_test_03;
\c so_test_03

CREATE TABLE device
(
  id SERIAL,
  name VARCHAR(16) NOT NULL,
  CONSTRAINT device_pk PRIMARY KEY (id),
  CONSTRAINT device_uk_name UNIQUE (name)
);

CREATE TABLE parameter
(
  id SERIAL,
  name VARCHAR(64) NOT NULL,
  CONSTRAINT parameter_pk PRIMARY KEY (id)
);

CREATE TABLE spot
(
  id SERIAL,
  name VARCHAR(16) NOT NULL,
  CONSTRAINT spot_pk PRIMARY KEY (id)
);

CREATE TABLE measurement_data_index
(
  id SERIAL,
  fk_device_id    INTEGER NOT NULL,
  fk_parameter_id INTEGER NOT NULL,
  fk_spot_id      INTEGER NOT NULL,
  CONSTRAINT measurement_pk PRIMARY KEY (id),
  CONSTRAINT measurement_data_index_fk_2_device FOREIGN KEY (fk_device_id)
    REFERENCES device (id) MATCH FULL
    ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT measurement_data_index_fk_2_parameter FOREIGN KEY (fk_parameter_id)
    REFERENCES parameter (id) MATCH FULL
    ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT measurement_data_index_fk_2_spot FOREIGN KEY (fk_spot_id)
    REFERENCES spot (id) MATCH FULL
    ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT measurement_data_index_uk_all_cols UNIQUE (fk_device_id, fk_parameter_id, fk_spot_id)
);

CREATE TABLE measurement_data_value
(
  id INTEGER NOT NULL,
  t_stamp TIMESTAMP NOT NULL,
  value VARCHAR(16) NOT NULL,
  -- NOTE: inverse field order compared to wildplasser's version
  CONSTRAINT measurement_data_value_pk PRIMARY KEY (id, t_stamp),
  CONSTRAINT measurement_data_value_fk_2_index FOREIGN KEY (id)
    REFERENCES measurement_data_index (id) MATCH FULL
    ON UPDATE NO ACTION ON DELETE NO ACTION
);

CREATE OR REPLACE FUNCTION insert_data()
RETURNS VOID
LANGUAGE plpgsql
AS
$BODY$
  BEGIN
    INSERT INTO device (name)
    SELECT 'dev_' || to_char(item, 'FM00')
    FROM generate_series(1, 5) item;

    INSERT INTO parameter (name)
    SELECT 'param_' || to_char(item, 'FM00')
    FROM generate_series(1, 20) item;

    INSERT INTO spot (name)
    SELECT 'spot_' || to_char(item, 'FM00')
    FROM generate_series(1, 10) item;

    INSERT INTO measurement_data_index (fk_device_id, fk_parameter_id, fk_spot_id)
    SELECT device.id, parameter.id, spot.id
    FROM device, parameter, spot;

    INSERT INTO measurement_data_value(id, t_stamp, value)
    SELECT index.id,
           item,
           'd'  || to_char(index.fk_device_id,    'FM00') ||
           '_s' || to_char(index.fk_spot_id,      'FM00') ||
           '_p' || to_char(index.fk_parameter_id, 'FM00')
    FROM measurement_data_index index,
         generate_series('2012-01-01 00:00:00', '2012-01-06 23:59:59', interval '1 min') item;
  END;
$BODY$;

SELECT insert_data();

At some stage, I will change my own conventions to using inline PRIMARY KEY and REFERENCES statements instead of explicit CONSTRAINTs; for the moment, I think keeping this the way it was makes it easier to compare the two solutions.
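
For illustration, the value table in that inline style would shrink to something like this (same semantics, just a sketch):

CREATE TABLE measurement_data_value
(
  id INTEGER NOT NULL REFERENCES measurement_data_index (id),
  t_stamp TIMESTAMP NOT NULL,
  value VARCHAR(16) NOT NULL,
  PRIMARY KEY (id, t_stamp)
);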

Don't forget to update statistics for the query planner:

VACUUM ANALYZE device;
VACUUM ANALYZE measurement_data_index;
VACUUM ANALYZE measurement_data_value;
VACUUM ANALYZE parameter;
VACUUM ANALYZE spot;

Run a query that should produce the same result as the one in the first approach:

EXPLAIN (ANALYZE ON, BUFFERS ON)
SELECT measurement_data_value.value
  FROM measurement_data_index,
       measurement_data_value,
       parameter
 WHERE measurement_data_index.fk_parameter_id = parameter.id
   AND measurement_data_index.id = measurement_data_value.id
   AND parameter.name = 'param_01';

Result:

Nested Loop  (cost=0.00..34218.28 rows=431998 width=12) (actual time=0.026..696.349 rows=432000 loops=1)
  Buffers: shared hit=435332
  ->  Nested Loop  (cost=0.00..29.75 rows=50 width=4) (actual time=0.012..0.453 rows=50 loops=1)
        Join Filter: (measurement_data_index.fk_parameter_id = parameter.id)
        Buffers: shared hit=7
        ->  Seq Scan on parameter  (cost=0.00..1.25 rows=1 width=4) (actual time=0.005..0.010 rows=1 loops=1)
              Filter: ((name)::text = 'param_01'::text)
              Buffers: shared hit=1
        ->  Seq Scan on measurement_data_index  (cost=0.00..16.00 rows=1000 width=8) (actual time=0.003..0.187 rows=1000 loops=1)
              Buffers: shared hit=6
  ->  Index Scan using measurement_data_value_pk on measurement_data_value  (cost=0.00..575.77 rows=8640 width=16) (actual time=0.013..12.157 rows=8640 loops=50)
        Index Cond: (id = measurement_data_index.id)
        Buffers: shared hit=435325
Total runtime: 726.125 ms

This is almost half of the ~1.3s the first approach required; considering I'm loading 432K rows, it is a result I can live with for the moment.

NOTE: The field order in the value table PK is id, t_stamp; the order in wildplasser's response is t_stamp, whw_id. I did it that way because I feel a 'regular' field order is the one in which fields are listed in the table declaration (and 'reverse' is then the other way around), but that's just my own convention that keeps me from getting confused. Either way, as Erwin Brandstetter pointed out, this order is absolutely critical for the performance improvement; if it is the wrong way around (and a reverse index as in wildplasser's solution is missing), the query plan looks like the one below and performance is more than 3 times worse:

Hash Join  (cost=22.14..186671.54 rows=431998 width=12) (actual time=0.460..2570.941 rows=432000 loops=1)
  Hash Cond: (measurement_data_value.id = measurement_data_index.id)
  Buffers: shared hit=63537
  ->  Seq Scan on measurement_data_value  (cost=0.00..149929.58 rows=8639958 width=16) (actual time=0.004..1095.606 rows=8640000 loops=1)
        Buffers: shared hit=63530
  ->  Hash  (cost=21.51..21.51 rows=50 width=4) (actual time=0.446..0.446 rows=50 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 2kB
        Buffers: shared hit=7
        ->  Hash Join  (cost=1.26..21.51 rows=50 width=4) (actual time=0.015..0.359 rows=50 loops=1)
              Hash Cond: (measurement_data_index.fk_parameter_id = parameter.id)
              Buffers: shared hit=7
              ->  Seq Scan on measurement_data_index  (cost=0.00..16.00 rows=1000 width=8) (actual time=0.002..0.135 rows=1000 loops=1)
                    Buffers: shared hit=6
              ->  Hash  (cost=1.25..1.25 rows=1 width=4) (actual time=0.008..0.008 rows=1 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 1kB
                    Buffers: shared hit=1
                    ->  Seq Scan on parameter  (cost=0.00..1.25 rows=1 width=4) (actual time=0.004..0.007 rows=1 loops=1)
                          Filter: ((name)::text = 'param_01'::text)
                          Buffers: shared hit=1
Total runtime: 2605.277 ms
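
For completeness: the equivalent of wildplasser's reversed index in my schema would be an additional index in the reverse column order, covering the other access path (the index name is just my own convention):

CREATE UNIQUE INDEX measurement_data_value_uk_reversed
    ON measurement_data_value (t_stamp, id);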
  • Three things that are not your problem, but that I don't understand: (1) If the reason for `parameter.id` is that `parameter.name` is not unique, then how come you have `CONSTRAINT parameter_uk_name UNIQUE (name)`? (2) What is the relationship (if any) between `measurement_data_index` and `measurement_data_value`? (3) What's with the implicit/comma/non-ANSI join? :-P – ruakh Sep 22 '12 at 14:04
  • Does `EXPLAIN (ANALYZE ON, BUFFERS ON) SELECT measurement_data_value.value FROM measurement_data_value JOIN parameter ON measurement_data_value.fk_parameter_id = parameter.id WHERE parameter.name = 'param_01';` give the same explain result? To me the sequential scan on parameters inside the loop looks a bit odd. – Joachim Isaksson Sep 22 '12 at 14:57
  • (1) valid point. That's an omission that crept in when I reduced the original data model to the form shown here. I've updated the question and removed the unique constraint. (2) as stated in the question, one measurement_data_index row has the same ID as many measurement_data_value rows (3) ANSI conformance is not important in the project and according to the postgres docs (http://www.postgresql.org/docs/9.1/static/tutorial-join.html), the 'comma' form I'm using is equivalent to an explicit join and "is not as commonly used". For me, it's just simpler to understand. – ssc Sep 22 '12 at 15:03
  • @Joachim: same result, same performance (within test variation). Incidentally, I've started to benchmark 'comma' / implicit and 'ANSI' / explicit version of JOINs as part of another SO question (http://stackoverflow.com/a/11523894/217844) and have not found any significant differences so far – ssc Sep 22 '12 at 15:08
  • 1) I don't see the need for separate measurement_data_index and measurement_data_value tables. To me, the *natural* key for a measurement-item would be {spot,device,timestamp}, leading directly to -> value. Since you have a valid natural key, 2) I also would omit the serial (which is not referred by any other table) 3) And I would make measurement_item.device_id an `integer NOT NULL references device(id)` 4) for the composite natural PK, extra indexes (in different order) could be necessary, depending on your specific query needs. – wildplasser Sep 22 '12 at 15:40
  • When you say that parameter name is not unique, do you mean just globally or you mean locally as well? I guess what I'm asking is: can the same machine have two different parameters with the same name? If yes, can the same machine _in the same spot_ have two different parameters with the same name? – Branko Dimitrijevic Sep 22 '12 at 18:37
  • Also, how are the measurements taken? Do you have a bulk of read-outs that come-in at essentially same time (For the whole machine? Just for the given spot?), or you measure each parameter one-by-one? – Branko Dimitrijevic Sep 22 '12 at 18:38
  • I still have the gut feeling that the {spot,device,parameter} triplet is not in BCNF, they look partially dependant. Could be restructured into a {where,what} pair + an additional what -> {how,which} relation. – wildplasser Sep 23 '12 at 10:36
  • Your worst case would take a lot more space. You are forgetting the overhead of storage, most importantly the [heap tuple header that takes 23 bytes per row](http://www.postgresql.org/docs/current/interactive/storage-page-layout.html) and [data alignment](http://stackoverflow.com/questions/2966524/calculating-and-saving-space-in-postgresql/7431468#7431468). Find out about real space requirements by [measuring your test tables](http://stackoverflow.com/a/8978207/939860). – Erwin Brandstetter Sep 23 '12 at 13:36
  • Your update illustrates what I feared: the {spot_id,device,parameter} keyset is used to address a set of only 1000 unique (combinations of) values. You are wasting a lot of index/key space there. – wildplasser Sep 23 '12 at 13:38
  • "ID of a value row is equal to the ID of the index it belongs to." No it's not. If it were, there'd be a foreign key constraint enforcing that hope. (Go ahead and add that now.) – Mike Sherrill 'Cat Recall' Sep 23 '12 at 13:42
  • I'm going to wait for the DVD to come out – Bohemian Sep 23 '12 at 13:42
  • @wildplasser: I don't really understand what you're saying. Where am I wasting space? How would you structure this? – ssc Sep 23 '12 at 14:24
  • @ErwinBrandstetter: Thanks for pointing that out. I would assume whatever overhead there is applies to both solutions (single table / index & value table) in somewhat the same manner, so while actual figures might not correspond to reality on the hard drive, the percentage difference should still be similar. – ssc Sep 23 '12 at 14:33
  • I am working on it. It is called "data modelling" ;-) – wildplasser Sep 23 '12 at 14:43

4 Answers


I basically revised your whole setup. Tested under PostgreSQL 9.1.5.

DB schema

  • I think that your table layout has a major logical flaw (as also pointed out by @Catcall). I changed it the way I suspect it should be:
    Your last table measurement_data_value (which I renamed to measure_val) is supposed to save a value per parameter (now: param) for every row in measurement_data_index (now: measure). See below.

  • Even though "a device has a unique name", use an integer surrogate primary key anyway. Text strings are inherently bulkier and slower to use as foreign keys in big tables. They are also subject to collation, which can slow down queries significantly.

    Under this related question we found that joining and sorting on a medium sized text column was the major slow-down. If you insist on using a text string as primary key, read up on collation support in PostgreSQL 9.1 or later.

  • Don't fall for the anti-pattern of using id as the name for a primary key. When you join a couple of tables (like you will have to do a lot!) you end up with several columns named id - what a mess! (Sadly, some ORMs use it.)

    Instead, name a surrogate primary key column after the table somehow to make it meaningful on its own. Then foreign keys referencing it can have the same name (which is good, as they contain the same data).

    CREATE TABLE spot
    ( spot_id SERIAL PRIMARY KEY);
  • Don't use super-long identifiers. They are hard to type and hard to read. Rule of thumb: as long as necessary to be clear, as short as possible.

  • Don't use varchar(n) if you don't have a compelling reason. Just use varchar, or simpler: just text.

All this and more went into my proposal for a better db schema:

CREATE TABLE device
( device_id serial PRIMARY KEY 
 ,device text NOT NULL
);

CREATE TABLE param
( param_id serial PRIMARY KEY
 ,param text NOT NULL
);
CREATE INDEX param_param_idx ON param (param); -- you are looking up by name!

CREATE TABLE spot
( spot_id  serial PRIMARY KEY);

CREATE TABLE measure
( measure_id serial PRIMARY KEY
 ,device_id int NOT NULL REFERENCES device (device_id) ON UPDATE CASCADE
 ,spot_id int NOT NULL REFERENCES spot (spot_id) ON UPDATE CASCADE
 ,t_stamp timestamp NOT NULL
 ,CONSTRAINT measure_uni UNIQUE (device_id, spot_id, t_stamp)
);

CREATE TABLE measure_val   -- better name? 
( measure_id int NOT NULL REFERENCES measure (measure_id)
                 ON UPDATE CASCADE ON DELETE CASCADE  -- guessing it fits
 ,param_id int NOT NULL REFERENCES param (param_id)
                 ON UPDATE CASCADE ON DELETE CASCADE  -- guessing it fits
 ,value text NOT NULL
 ,CONSTRAINT measure_val_pk PRIMARY KEY (measure_id, param_id)
);
CREATE INDEX measure_val_param_id_idx ON measure_val (param_id);  -- !crucial!

I renamed the bulky measurement_data_value to measure_val, because that's what's in the table: parameter-values for measurements. Now, the multi-column pk makes sense, too.

But I added a separate index on param_id. The way you had it, column param_id was the second column in a multi-column index, which leads to poor performance for lookups on param_id alone. Read all the gory details about that under this related question on dba.SE.

After implementing this alone, your query should be faster. But there is more you can do.

Test data

This fills in the data much faster. The point is that I use set-based DML commands, executing mass-inserts instead of loops that execute individual inserts, which takes forever. Makes quite a difference for the considerable amount of test data you want to insert. It's also much shorter and simpler.

To make it even more efficient, I use a data-modifying CTE (new in Postgres 9.1) that instantly reuses the massive amount of rows in the last step.

CREATE OR REPLACE FUNCTION insert_data()
RETURNS void LANGUAGE plpgsql AS
$BODY$
BEGIN
   INSERT INTO device (device)
   SELECT 'dev_' || to_char(g, 'FM00')
   FROM generate_series(1,5) g;

   INSERT INTO param (param)
   SELECT 'param_' || to_char(g, 'FM00')
   FROM generate_series(1,20) g;

   INSERT INTO spot (spot_id)
   SELECT nextval('spot_spot_id_seq'::regclass)
   FROM generate_series(1,10) g; -- to set sequence, too

   WITH x AS (
      INSERT INTO measure (device_id, spot_id, t_stamp)
      SELECT d.device_id, s.spot_id, g
      FROM   device    d
      CROSS  JOIN spot s
      CROSS  JOIN generate_series('2012-01-06 23:00:00' -- smaller set
                                 ,'2012-01-07 00:00:00' -- for quick tests
                                 ,interval '1 min') g
      RETURNING *
      )
   INSERT INTO measure_val (measure_id, param_id, value)
   SELECT x.measure_id
         ,p.param_id
         ,x.device_id || '_' || x.spot_id || '_' || p.param
   FROM  x
   CROSS JOIN param p;
END
$BODY$;

Call:

SELECT insert_data();

Query

  • Use explicit JOIN syntax and table aliases to make your queries easier to read and debug:
SELECT v.value
FROM   param p
JOIN   measure_val v USING (param_id)
WHERE  p.param = 'param_01';

The USING clause just simplifies the syntax; it is not otherwise superior to ON.

This should be much faster now for two reasons:

  • Index param_param_idx on param.param.
  • Index measure_val_param_id_idx on measure_val.param_id, like explained in detail here.

Edit after feedback

My major oversight was that you already had added the crucial index in the form of measurement_data_value_idx_fk_parameter_id further down in your question. (I blame your cryptic names! :p ) On closer inspection, you have more than 10M (7 * 24 * 60 * 5 * 10 * 20) rows in your test setup and your query retrieves > 500K. I only tested with a much smaller subset.

Also, as you retrieve 5% of the whole table, indexes will only go so far. I was too optimistic; such an amount of data is bound to take some time. Is it a realistic requirement that you query 500K rows? I would assume you aggregate in your real-life application?

Further options

  • Partitioning.
  • More RAM and settings that make use of it.

    A virtual Debian 6.0 machine with 1GB of RAM

    is way below what you need.

  • Partial indexes, especially in connection with index-only scans of PostgreSQL 9.2.

  • Materialized views of aggregated data. Obviously, you are not going to display 500K rows, but some kind of aggregation. You can compute that once and save results in a materialized view, from where you can retrieve data much faster (see the sketch after this list).
  • If your queries are predominantly by parameter (like the example), you could use CLUSTER to physically rewrite the table according to an index:

    CLUSTER measure_val USING measure_val_param_id_idx;
    

    This way all rows for one parameter are stored in succession. Means fewer blocks to read and easier to cache. Should make the query at hand much faster. Or INSERT the rows in favorable order to begin with, to the same effect.
    Partitioning would mix well with CLUSTER, since you would not have to rewrite the whole (huge) table every time. As your data is obviously just inserted and not updated, a partition would stay "in order" after CLUSTER.

  • Generally, PostgreSQL 9.2 should be great for you as its improvements focus on performance with big data.
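
Since CREATE MATERIALIZED VIEW only arrived in PostgreSQL 9.3, a "materialized view" under 9.1/9.2 is simply a regular table that you rebuild periodically - a sketch, assuming a per-parameter, per-day aggregation (table and index names are made up):

CREATE TABLE measure_val_daily AS
SELECT v.param_id
      ,m.t_stamp::date AS day
      ,count(*)        AS n_values
FROM   measure_val v
JOIN   measure     m USING (measure_id)
GROUP  BY v.param_id, m.t_stamp::date;

CREATE INDEX measure_val_daily_idx ON measure_val_daily (param_id, day);

-- refresh by TRUNCATE plus re-running the same INSERT ... SELECT, e.g. nightly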

Erwin Brandstetter
  • Thank you very much for your detailed answer! :-) I have run your code: Inserting data takes ~9-10 mins now as opposed to ~10-11 mins before. That point is somewhat moot though as the real database anyway does a bulk load of the test data. Running the query takes ~1300ms which is the same as before (within test variance). – ssc Sep 23 '12 at 09:29
  • @ssc: That's disappointing. My major oversight was that you already had added the crucial index. I added a bit to my answer. – Erwin Brandstetter Sep 23 '12 at 13:22
  • if RAM is an issue here, how come my performance tests show the same figures for the VM with 1GB as for the MacBook with 8GB? – ssc Sep 23 '12 at 14:37
  • You have to run tests multiple times to populate the cache. Your 10M rows roughly occupy 500 MB to 1 GB on disc and a bit more in RAM. Add indexes to that. A machine with 1 GB of RAM obviously can't cache that. Either way, to make the query faster, use PostgreSQL 9.2. I also added hints for *materialized views* and `CLUSTER` to my answer. – Erwin Brandstetter Sep 24 '12 at 01:39
  • I usually run tests 3 times and I seem to notice a pattern where the first time is often a bit slower and the second and third time are similar; the difference is never more than e.g. 20% though. From what I read in the 'High Performance' book, that's the effects of DB and OS caches kicking in. Thanks for the tip on materialized views, I'll start reading up on that topic. – ssc Sep 24 '12 at 03:34
  • @ErwinBrandstetter - the 1GB VM might not be caching the disk-blocks, but I'd bet the hosting 8GB is. – Richard Huxton Jun 05 '14 at 18:49

The idea behind this "solution" is: avoid the separate key domains for {device, spot, parameter}. There are only 1000 possible combinations of these three (this could be seen as a bad case of BCNF violation). So I combine them into one what_how_where table, which refers to the three separate domains. The number of key elements in the measurement (data) table is reduced from 4 to 2, and the surrogate key is omitted (since it is not used). The what_how_where table does have a surrogate key; its meaning can be expressed as: if a tuple is present in this table, the parameter 'what' can be measured by device 'how' on location 'where'.

-- temp schema for scratch
DROP SCHEMA IF EXISTS tmp CASCADE;
CREATE SCHEMA tmp;
SET search_path=tmp;

        -- tables for the three "key domain"s
CREATE TABLE device
        ( id SERIAL NOT NULL PRIMARY KEY
        , dname VARCHAR NOT NULL -- 'name' might be a reserved word
        , CONSTRAINT device_name UNIQUE (dname)
        );

CREATE TABLE parameter
        ( id SERIAL PRIMARY KEY -- must have ID as names are not unique
        , pname VARCHAR NOT NULL
        );

CREATE TABLE spot
        ( id SERIAL PRIMARY KEY
        , sname VARCHAR NOT NULL
        );
        -- One table to combine the three "key domain"s
CREATE TABLE what_how_where
        ( id SERIAL NOT NULL PRIMARY KEY
        , device_id INTEGER NOT NULL REFERENCES device(id)
        , spot_id INTEGER NOT NULL REFERENCES spot(id)
        , parameter_id INTEGER NOT NULL REFERENCES parameter(id)
        , CONSTRAINT what_natural UNIQUE (device_id,spot_id,parameter_id)
        );

CREATE TABLE measurement
        ( whw_id INTEGER NOT NULL REFERENCES what_how_where(id)
        , t_stamp TIMESTAMP NOT NULL
        , value VARCHAR(32) NOT NULL
        , CONSTRAINT measurement_natural PRIMARY KEY (t_stamp,whw_id)
        );

INSERT INTO device (dname)
SELECT 'dev_' || d::text
FROM generate_series(1,10) d;

INSERT INTO parameter (pname)
SELECT 'param_' || p::text
FROM generate_series(1,10) p;

INSERT INTO spot (sname)
SELECT 'spot_' || s::text
FROM generate_series(1,10) s;

INSERT INTO what_how_where (device_id,spot_id,parameter_id)
SELECT d.id,s.id,p.id
FROM device d
JOIN spot s ON(1=1)
JOIN parameter p ON(1=1)
        ;
ANALYSE what_how_where;

INSERT INTO measurement(whw_id, t_stamp, value)
SELECT w.id
        , g
        , random()::text
FROM what_how_where w
JOIN generate_series('2012-01-01'::date, '2012-09-23'::date, '1 day'::interval) g
        ON (1=1)
        ;

CREATE UNIQUE INDEX measurement_natural_reversed ON measurement(whw_id,t_stamp);
ANALYSE measurement;

        -- A view to *more or less* emulate the original behaviour
DROP VIEW IF EXISTS measurement_data;
CREATE VIEW measurement_data AS (
        SELECT d.dname AS dname
        , p.pname AS pname
        , w.spot_id AS spot_id
        , w.parameter_id AS parameter_id
        , m.t_stamp AS t_stamp
        , m.value AS value
        FROM measurement m
        JOIN what_how_where w ON m.whw_id = w.id
        JOIN device d ON w.device_id = d.id
        JOIN parameter p ON w.parameter_id = p.id
        );


EXPLAIN (ANALYZE ON, BUFFERS ON)
SELECT md.value
  FROM measurement_data md
 WHERE md.pname = 'param_8'
   AND md.t_stamp >= '2012-07-01'
   AND md.t_stamp < '2012-08-01'
        ;

UPDATE: there is one practical problem, which can only be solved by some kind of clustering:

  • given an estimated row size of 50 bytes
  • and a query specificity of 5%: only 1/20 of the parameters is wanted
  • which means that about 4 "wanted" tuples live on an OS disk page (plus ~76 unwanted tuples)

Without clustering, this means that all the pages have to be pulled in and inspected. Indexes do not help here (they only help if they can avoid pages being pulled in; this could be the case for a (range) search on the first key column(s)). The indexes may help a bit for scanning the in-memory pages after these have been fetched.

As a consequence, this means that (once the footprint of your query is larger than the available buffer space) your query actually measures the I/O speed of your machine.

wildplasser
  • That should reduce the size of the 2nd biggest table quite a bit, but the size of the biggest table (dominating space requirements) remains unchanged. This particular query will even be slower, because now you have to join to the bigger table and resolve more complex conditions. Still, an interesting variant. – Erwin Brandstetter Sep 23 '12 at 15:22
  • I beg to differ. The usage of an index can be viewed as the "physical variant" of "joining a table". In my case, the joined table is very small (1K records); in the OP's version, three separate index joins would have been needed (on very "sparse" indexes). Also the clustering/fragmentation would be worse for four key elements. BTW: the number of reported rows for the OP (400K) is large; if these rows are scattered all over the disk, a typical I/O cost of one disk page per tuple could be expected. – wildplasser Sep 23 '12 at 15:36
  • So if I understand your approach correctly, you have swapped the parameter FK with the timestamp, thus making the 'index' table 'constant' with a size of 1000 rows. If we apply the same crude storage space estimate: The measurement table is now 4 bytes larger and we still get 1440 meas/day, so 29x1.44MB = ~40MB/day, which is not a lot more than the ~39MB of my approach, so there's no drastic impact on storage requirement. – ssc Sep 23 '12 at 17:27
  • When I run your code with the same data range as in my approach, the same query seems to run ~10% slower than in my or @ErwinBrandstetter's approach. The query plan looks more complex (which might be irrelevant for performance, but matters for my understanding ;-), but that might be due to the view. There is no need for 'backward compatibility', so I can skip that. I think I'll change my code to using your approach for testing and do a few measurements. Thanks! :-) – ssc Sep 23 '12 at 17:35
  • Try adding a clause on one of the other key elements (e.g. t_stamp). A selectivity of 10% for a where-clause is borderline, in the presence of ignored other key elements, IMHO. Also: my "size" argument was not about the physical size of rows or tables, but about the "address space" of a key "domain" (the "cardinality" of the domain). NB: a two-key model is also cheaper wrt the inserting process (less index overhead). The plan complexity is irrelevant (the inner queries execute in sub-millisecond times), but the remaining filter or join operation in the outer loop is cheaper than the original. – wildplasser Sep 23 '12 at 18:14
  • I did notice it seems in fact _a lot_ cheaper to insert - creating and filling the sample DB took ~5mins as opposed to ~10mins with earlier solutions. – ssc Sep 23 '12 at 19:46
  • Fewer FK constraints, fewer indexes -> less disk traffic. – wildplasser Sep 23 '12 at 19:55
  • Changing the data model as described slashes stored procedure execution times from seconds or even minutes to milliseconds. Unfortunately, the project has been cancelled in the meantime, so I won't be able to investigate this further - but what I'm taking away from this is: If performance is really bad for even trivial queries, take a long hard look at the data model. Thanks for your help! :-) – ssc Jan 11 '13 at 07:58
  • Well, data modelling is an art. It is also (part of) my work. Sorry about the project. – wildplasser Jan 12 '13 at 09:45

I don't see how you relate a particular measured value with a particular combination of device, spot, and time. Am I missing something obvious?

Let's look at it a different way.

CREATE TABLE measurement_data
(
  device_name character varying(16) NOT NULL,
  spot_id integer NOT NULL,
  parameter_id integer NOT NULL,
  t_stamp timestamp without time zone NOT NULL,
  value character varying(16) NOT NULL,
  CONSTRAINT measurement_data_pk PRIMARY KEY (device_name , spot_id , t_stamp , parameter_id ),
  CONSTRAINT measurement_data_fk_device FOREIGN KEY (device_name)
      REFERENCES device (name) MATCH FULL
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT measurement_data_fk_parameter FOREIGN KEY (parameter_id)
      REFERENCES parameter (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT measurement_data_fk_spot FOREIGN KEY (spot_id)
      REFERENCES spot (id) MATCH FULL
      ON UPDATE NO ACTION ON DELETE NO ACTION
);

(An even better name for this table is "measurements". Every table contains data.)

I'd expect much better performance on this kind of table. But I'd also expect any query that returns many, many rows to struggle with performance. (Unless the hardware and the network matches the task.)

Mike Sherrill 'Cat Recall'
  • I hope I was able to explain the need for - and the relationship between - the index and value tables in my 'update 01' – ssc Sep 23 '12 at 12:58

It seems from the numbers that you are being hit by timing overhead. You can verify this by using pg_test_timing or by adding TIMING OFF to your EXPLAIN parameters (both were introduced in PostgreSQL 9.2). I can approximately replicate your results by setting my clock source to HPET instead of TSC.
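
Both are easy to check; for example (PostgreSQL 9.2 syntax, using the query from the question):

-- from the shell: measure the per-call overhead of the system clock source
-- $ pg_test_timing

-- from psql: same plan, but without per-node timing instrumentation
EXPLAIN (ANALYZE ON, BUFFERS ON, TIMING OFF)
SELECT measurement_data_value.value
  FROM measurement_data_value, parameter
 WHERE measurement_data_value.fk_parameter_id = parameter.id
   AND parameter.name = 'param_01';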

With HPET:

 Nested Loop  (cost=8097.73..72850.98 rows=432000 width=12) (actual time=29.188..905.765 rows=432000 loops=1)
   Buffers: shared hit=56216
   ->  Seq Scan on parameter  (cost=0.00..1.25 rows=1 width=4) (actual time=0.004..0.008 rows=1 loops=1)
         Filter: ((name)::text = 'param_01'::text)
         Rows Removed by Filter: 19
         Buffers: shared hit=1
   ->  Bitmap Heap Scan on measurement_data_value  (cost=8097.73..68529.73 rows=432000 width=16) (actual time=29.180..357.848 rows=432000 loops=1)
         Recheck Cond: (fk_parameter_id = parameter.id)
         Buffers: shared hit=56215
         ->  Bitmap Index Scan on measurement_data_value_idx_fk_parameter_id  (cost=0.00..7989.73 rows=432000 width=0) (actual time=21.710..21.710 rows=432000 loops=1)
               Index Cond: (fk_parameter_id = parameter.id)
               Buffers: shared hit=1183
 Total runtime: 1170.409 ms

With HPET and timing off:

 Nested Loop  (cost=8097.73..72850.98 rows=432000 width=12) (actual rows=432000 loops=1)
   Buffers: shared hit=56216
   ->  Seq Scan on parameter  (cost=0.00..1.25 rows=1 width=4) (actual rows=1 loops=1)
         Filter: ((name)::text = 'param_01'::text)
         Rows Removed by Filter: 19
         Buffers: shared hit=1
   ->  Bitmap Heap Scan on measurement_data_value  (cost=8097.73..68529.73 rows=432000 width=16) (actual rows=432000 loops=1)
         Recheck Cond: (fk_parameter_id = parameter.id)
         Buffers: shared hit=56215
         ->  Bitmap Index Scan on measurement_data_value_idx_fk_parameter_id  (cost=0.00..7989.73 rows=432000 width=0) (actual rows=432000 loops=1)
               Index Cond: (fk_parameter_id = parameter.id)
               Buffers: shared hit=1183
 Total runtime: 156.537 ms

With TSC:

 Nested Loop  (cost=8097.73..72850.98 rows=432000 width=12) (actual time=29.090..156.233 rows=432000 loops=1)
   Buffers: shared hit=56216
   ->  Seq Scan on parameter  (cost=0.00..1.25 rows=1 width=4) (actual time=0.004..0.008 rows=1 loops=1)
         Filter: ((name)::text = 'param_01'::text)
         Rows Removed by Filter: 19
         Buffers: shared hit=1
   ->  Bitmap Heap Scan on measurement_data_value  (cost=8097.73..68529.73 rows=432000 width=16) (actual time=29.083..114.908 rows=432000 loops=1)
         Recheck Cond: (fk_parameter_id = parameter.id)
         Buffers: shared hit=56215
         ->  Bitmap Index Scan on measurement_data_value_idx_fk_parameter_id  (cost=0.00..7989.73 rows=432000 width=0) (actual time=21.667..21.667 rows=432000 loops=1)
               Index Cond: (fk_parameter_id = parameter.id)
               Buffers: shared hit=1183
 Total runtime: 168.869 ms

So your slowness seems to be mostly caused by instrumentation overhead. However, selecting huge amounts of rows won't be extremely fast in PostgreSQL. If you need to do number crunching on large swathes of data, it might be a good idea to structure your data so you can fetch it in larger chunks (e.g. if you always need to process at least a day's worth of data, aggregate all measurements for one day into an array).
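
For example, a rough sketch of such a pre-aggregation against the question's original schema (array_agg with ORDER BY is available since 9.0; whether a plain array of the varchar values is the right container depends on the number crunching you do):

SELECT i.fk_device_name
      ,i.fk_spot_id
      ,v.fk_parameter_id
      ,i.t_stamp::date AS day
      ,array_agg(v.value ORDER BY i.t_stamp) AS day_values
FROM   measurement_data_value v
JOIN   measurement_data_index i ON i.id = v.id
GROUP  BY i.fk_device_name, i.fk_spot_id, v.fk_parameter_id, i.t_stamp::date;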

In general, you have to have an idea of what your workload is going to be in order to tune. What is a win in one case might be a big loss in some other case. I recommend checking out pg_stat_statements to figure out where your bottlenecks are.
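
Setting it up is a one-time step: the module ships in contrib and needs a server restart to load; the usage below assumes the 9.x column names:

-- in postgresql.conf, then restart the server:
--   shared_preload_libraries = 'pg_stat_statements'
CREATE EXTENSION pg_stat_statements;

-- then, after some load:
SELECT query, calls, total_time
  FROM pg_stat_statements
 ORDER BY total_time DESC
 LIMIT 10;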

Ants Aasma
  • That's quite a difference! (And I think it could also explain why I see the same timing results on the small 1GB machine as on the 8GB one.) However, I cannot reproduce your results - if I do \timing off before the explain analyze, the results are pretty much the same. – ssc Sep 23 '12 at 14:42
  • On a side note, I think \timing on/off was introduced in 8.4, see e.g. http://www.depesz.com/2008/06/11/waiting-for-84-waiting-onoff, so maybe I would only have to update my server to use pg_test_timing?! – ssc Sep 23 '12 at 14:44
  • Could you kindly add some info to your answer what steps you have taken to get the 2nd and 3rd result ? I have no idea how to set my clocksource to HPET or TSC (my own googling pending...) – ssc Sep 23 '12 at 14:46
  • To switch between hpet and tsc clock sources I used the following shell command "echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource" – Ants Aasma Sep 23 '12 at 17:39
  • And to do the explain without timing, I used EXPLAIN (ANALYZE ON, TIMING OFF) as documented here: http://www.postgresql.org/docs/9.2/static/sql-explain.html – Ants Aasma Sep 23 '12 at 17:40
  • Thanks for the clarification! :-) Seems I was looking at the wrong 'timing off'. I'll update the server on a VM and see if I can reproduce your results. On a side note: isn't 80% timing overhead quite a lot? – ssc Sep 23 '12 at 17:50