0

Using PostgreSQL in a clustered database (stado) on two nodes. I managed to configure stado coordinator and nodes agents successfully but when I try running a heavy query, the first time it takes too long to show results then after that it was fast.

When I restart the server it goes slow again. It's like stado does some caching or something. I thought the problem was because of stado initialization and thus configured agents but still the problem exists! Any ideas?

EDIT

Query:

SELECT id,position,timestamp
FROM   table t1
WHERE  id <> 0
AND    ST_Intersects(ST_Buffer_Meters(ST_SetSRID(
                     ST_MakePoint(61.4019, 15.218205), 4326), 1160006), position)
AND    timestamp BETWEEN '2013-10-01' AND '2014-01-01';

Explain:

ٍٍStep 0
_______
Target: CREATE UNLOGGED TABLE "TMPTT7_1" ( "XCOL1" INT) WITHOUT OIDS
SELECT: SELECT count(*) AS "XCOL1" FROM "t1" WHERE "t1"."timestamp" BETWEEN '2013-10-01' AND '2014-01-01' AND ("t1"."id"<>0) AND ST_Intersects(ST_Buffer_Meters(ST_SetSRID(
                         ST_MakePoint(61.4019, 15.218205), 4326), 1160006), "t1"."position")

Step: 1
_______
Select: SELECT SUM("XCOL1") AS "EXPRESSION6" FROM "TMPTT7_1"
Drop:
TMPTT7_1
Shadin
  • 1,867
  • 5
  • 26
  • 37

2 Answers2

0

Two reasons.

Caching, obviously. When a query is executed the first time with cold cache, obviously the cache is populated. That goes for system cache as well as database cache, both work together, at least in standard Postgres. Can make a huge difference.

Query plan caching, possibly. To a much lesser degree. If you run the same query in a single session repeatedly, plans for PL/pgSQL functions for instance are cached.

Depending on your type of connection to the database, there may also be network latency, which may be higher for the first call.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Thanks! but what do you suggest me to do? the first query takes very long time. the strange thing is, when i run the query on a single node, no caching happens! but on the coordinator (execute query on both noes) it happens. – Shadin May 18 '14 at 07:21
  • @Oxi: Not sure about the specific role of stado. I assume it has its own caching mechanism that needs to be populated as well. – Erwin Brandstetter May 18 '14 at 16:59
0

Caching in memory is the reason, that is correct. A good tip for this type of situation is to "warm-up" the database each time you restart it with a script that runs the query (or a similar query that still accesses the same data). In some cases I have seen instances where several "warm-up" queries are run after any type of restart, then users still have a good experience. You will still have to wait for the warm-up query to finish after a restart, but at least it will not be a user waiting for that.

The other possibility is that you are doing a non-indexed query, you should check for that. If it is indexed and accessing a reasonable amount of data by a key, then it should be fast (even without the warm-up for most queries). This is a very common problem, easy to miss. Use the Postres EXPLAIN command, it will show you how the query is being performed against the database (i.e., with an index or without).

dbschwartz
  • 763
  • 3
  • 10
  • Thank you! well, here is my query and the explain result "added in the post". It accesses a HUGE amount of data "30,000,000 rows" so i think the "warm-up" option is the best i can do. what I'm going to do is to set up a several "count" queries for the data to run after each restart. what do you think? would "count queries" be enough to scan and cache the data and prepare it for any other query? – Shadin May 20 '14 at 05:17