3

We have a pretty big plpgsql function with an if- and elsif-statement in PostgreSQL 9.4.4 Inside every if-body there are function calls to stable-sql functions.

We call the function in the following way:

SELECT * 
from rawdata.getNumbersForUserBasedMetricEventsGroupedByClient('2015-09-28','2015-10-28','{4}'::int[],2,null,null,null,null,null); 

The first 4-5 times the function executes quite fast in a about 2.5 seconds, but then suddenly the performance drops rapidly and the execution takes about 7.5 seconds. It stays at that level for all consecutive calls. We also tried to declare the plpgsql function as stable, but that did not help.

When we call one of the inner stable-sql functions directly, the executions always take about 2.5 seconds.

This is the Schema of the rawdata.metricevent table:

rawdata.metricevent (metriceventid bigint PRIMARY KEY,
                     metricevent integer,
                     client integer,
                     age integer,
                     country varchar(256),
                     userideventowner bigint,
                     contributoruserid bigint,
                     tournamentid bigint,
                     eventoccurtime timestamp,
                     iscounted boolean)

We have a btree index over the eventoccurtime column. Without the btree index the difference is even bigger, the execution sometimes finished in just a few seconds, but sometimes it lasts more than 100 seconds.

Now our questions are: Why is that? What is happening, when the plpgsql function is executed the 5th or 6th time, why is it suddenly taking so long? Btw, the CPU-Load also is very high for this queries. We also analyzed the query with EXPLAIN ANALYZE and the query planner ALWAYS takes about 0.034ms, but the query execution differs from 2.5 seconds to 7.5 seconds. And it also never is anywhere in between, its either 2.5 seconds or 7.5 seconds.

These are the Main-pgpsql function that has the variable execution times and the stable-sql function below that have constant execution times.

CREATE OR REPLACE FUNCTION rawdata.getNumbersForUserBasedMetricEventsGroupedByClient(pFrom timestamp, pTo timestamp, pMetricEvent integer[], pTimeDomainType integer,
                                                                            pCountry varchar(100),pAgeFrom integer,pAgeTo integer,pUserlanguage varchar(50),pTournamentlanguage varchar(50))
RETURNS TABLE(dfrom timestamp, x bigint, y bigint, xx bigint, yy bigint)
AS $$
BEGIN
    IF pTimeDomainType = 1 THEN
        --hours
        RETURN QUERY
            SELECT * FROM rawdata.getNumbersForUBMetricEventsGroupedByClientPerHours(pFrom,pTo,pMetricEvent,pCountry,pAgeFrom,pAgeTo,pUserLanguage,pTournamentLanguage);
    ELSIF pTimeDomainType = 2 THEN
        --days
        RETURN QUERY
            SELECT * FROM rawdata.getNumbersForUBMetricEventsGroupedByClientPerDays(pFrom,pTo,pMetricEvent,pCountry,pAgeFrom,pAgeTo,pUserLanguage,pTournamentLanguage);

    ELSIF pTimeDomainType = 3 THEN
        --week
        RETURN QUERY
            SELECT * FROM rawdata.getNumbersForUBMetricEventsGroupedByClientPerWeeks(pFrom,pTo,pMetricEvent,pCountry,pAgeFrom,pAgeTo,pUserLanguage,pTournamentLanguage);
    ELSIF pTimeDomainType = 4 THEN
        --month
        RETURN QUERY
            SELECT * FROM rawdata.getNumbersForUBMetricEventsGroupedByClientPerMonths(pFrom,pTo,pMetricEvent,pCountry,pAgeFrom,pAgeTo,pUserLanguage,pTournamentLanguage);   
    END IF;
END;
$$
LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION rawdata.getNumbersForUBMetricEventsGroupedByClientPerHours(pFrom timestamp, pTo timestamp, pMetricEvent integer[],
                                                                            pCountry varchar(100),pAgeFrom integer,pAgeTo integer,pUserlanguage varchar(50),pTournamentlanguage varchar(50))
RETURNS TABLE(dfrom timestamp, x bigint, y bigint, xx bigint, yy bigint)
AS $$
        SELECT hours timedomain,count(distinct em.userideventowner) as x,count(distinct ef.userideventowner) as y,count(distinct emh.userideventowner) as xx,count(distinct efh.userideventowner) as yy
        FROM generate_series
                ( pFrom::timestamp 
                , pTo::timestamp + '23 hour'
                , '1 hour'::interval) hours
           LEFT JOIN rawdata.metricevent e1 ON e1.eventoccurtime >=pFrom 
                                       AND e1.eventoccurtime < pTo + '1 day'
                                       AND (e1.metricevent = ANY (pMetricEvent))
                                       AND (e1.country = pCountry OR pCountry is null)
                                       AND (e1.age >= pAgeFrom OR pAgeFrom is null) AND (e1.age <= pAgeTo OR pAgeTo is null)
                                       AND userideventowner >= 110
                                       AND hours = date_trunc('hour',e1.eventoccurtime)
           LEFT JOIN rawdata.userlanguage ul ON e1.userideventowner = ul.userideventowner
                                                    AND (ul.userlanguage = pUserLanguage OR pUserLanguage is null)
           LEFT JOIN rawdata.metricevent ei ON e1.metriceventid = em.metriceventid AND ei.client=1
           LEFT JOIN rawdata.metricevent ea ON e1.metriceventid = ef.metriceventid AND ea.client=2
           LEFT JOIN rawdata.metricevent ew ON e1.metriceventid = emh.metriceventid AND ew.client=3
           LEFT JOIN rawdata.metricevent eww ON e1.metriceventid = efh.metriceventid AND eww.client=4
        GROUP BY hours
        ORDER BY hours;
$$
LANGUAGE sql STABLE;

CREATE OR REPLACE FUNCTION rawdata.getNumbersForUBMetricEventsGroupedByClientPerDays(pFrom timestamp, pTo timestamp, pMetricEvent integer[],
                                                                            pCountry varchar(100),pAgeFrom integer,pAgeTo integer,pUserlanguage varchar(50),pTournamentlanguage varchar(50))
RETURNS TABLE(dfrom timestamp, x bigint, y bigint, xx bigint, yy bigint)
AS $$
        SELECT days timedomain,count(distinct em.userideventowner) as x,count(distinct ef.userideventowner) as y,count(distinct emh.userideventowner) as xx,count(distinct efh.userideventowner) as yy
        FROM generate_series
                ( pFrom::timestamp 
                , pTo::timestamp
                , '1 day'::interval) days
           LEFT JOIN rawdata.metricevent e1 ON e1.eventoccurtime >=pFrom 
                               AND e1.eventoccurtime < pTo + '1 day'
                               AND (e1.metricevent = ANY (pMetricEvent))
                               AND (e1.country = pCountry OR pCountry is null)
                               AND (e1.age >= pAgeFrom OR pAgeFrom is null) AND (e1.age <= pAgeTo OR pAgeTo is null)
                               AND userideventowner >= 110
                               AND days = date_trunc('day',e1.eventoccurtime)                          
           LEFT JOIN rawdata.userlanguage ul ON e1.userideventowner = ul.userideventowner
                                            AND (ul.userlanguage = pUserLanguage OR pUserLanguage is null)
           LEFT JOIN rawdata.metricevent ei ON e1.metriceventid = em.metriceventid AND ei.client=1
           LEFT JOIN rawdata.metricevent ea ON e1.metriceventid = ef.metriceventid AND ea.client=2
           LEFT JOIN rawdata.metricevent ew ON e1.metriceventid = emh.metriceventid AND ew.client=3
           LEFT JOIN rawdata.metricevent eww ON e1.metriceventid = efh.metriceventid AND eww.client=4
        GROUP BY days
        ORDER BY days;
$$
LANGUAGE sql STABLE;

CREATE OR REPLACE FUNCTION rawdata.getNumbersForUBMetricEventsGroupedByClientPerWeeks(pFrom timestamp, pTo timestamp, pMetricEvent integer[],
                                                                            pCountry varchar(100),pAgeFrom integer,pAgeTo integer,pUserlanguage varchar(50),pTournamentlanguage varchar(50))
RETURNS TABLE(dfrom timestamp, x bigint, y bigint, xx bigint, yy bigint)
AS $$
        SELECT min(days) timedomain,count(distinct em.userideventowner) as x,count(distinct ef.userideventowner) as y,count(distinct emh.userideventowner) as xx,count(distinct efh.userideventowner) as yy
                FROM generate_series
                        ( pFrom::timestamp 
                        , pTo::timestamp
                        , '1 day'::interval) days
                   LEFT JOIN rawdata.metricevent e1 ON e1.eventoccurtime >=pFrom 
                                       AND e1.eventoccurtime < pTo + '1 day'
                                       AND (e1.metricevent = ANY (pMetricEvent)) 
                                       AND (e1.country = pCountry OR pCountry is null)
                                       AND (e1.age >= pAgeFrom OR pAgeFrom is null) AND (e1.age <= pAgeTo OR pAgeTo is null)
                                       AND userideventowner >= 110
                                       AND days = date_trunc('day',e1.eventoccurtime)
                   LEFT JOIN rawdata.userlanguage ul ON e1.userideventowner = ul.userideventowner
                                                    AND (ul.userlanguage = pUserLanguage OR pUserLanguage is null)
                   LEFT JOIN rawdata.metricevent ei ON e1.metriceventid = em.metriceventid AND ei.client=1
                   LEFT JOIN rawdata.metricevent ea ON e1.metriceventid = ef.metriceventid AND ea.client=2
                   LEFT JOIN rawdata.metricevent ew ON e1.metriceventid = emh.metriceventid AND ew.client=3
                   LEFT JOIN rawdata.metricevent eww ON e1.metriceventid = efh.metriceventid AND eww.client=4
                GROUP BY EXTRACT(WEEK FROM days)
                ORDER BY 1;
$$
LANGUAGE sql STABLE;

CREATE OR REPLACE FUNCTION rawdata.getNumbersForUBMetricEventsGroupedByClientPerMonths(pFrom timestamp, pTo timestamp, pMetricEvent integer[],
                                                                            pCountry varchar(100),pAgeFrom integer,pAgeTo integer,pUserlanguage varchar(50),pTournamentlanguage varchar(50))
RETURNS TABLE(dfrom timestamp, x bigint, y bigint, xx bigint, yy bigint)
AS $$
        SELECT min(days) timedomain,count(distinct em.userideventowner) as x,count(distinct ef.userideventowner) as y,count(distinct emh.userideventowner) as xx,count(distinct efh.userideventowner) as yy
                FROM generate_series
                        ( pFrom::timestamp 
                        , pTo::timestamp
                        , '1 day'::interval) days
                   LEFT JOIN rawdata.metricevent e1 ON e1.eventoccurtime >=pFrom 
                                       AND e1.eventoccurtime < pTo + '1 day'
                                       AND (e1.metricevent = ANY (pMetricEvent)) 
                                       AND (e1.country = pCountry OR pCountry is null)
                                       AND (e1.age >= pAgeFrom OR pAgeFrom is null) AND (e1.age <= pAgeTo OR pAgeTo is null)
                                       AND userideventowner >= 110
                                       AND days = date_trunc('day',e1.eventoccurtime)
                   LEFT JOIN rawdata.userlanguage ul ON e1.userideventowner = ul.userideventowner
                                                    AND (ul.userlanguage = pUserLanguage OR pUserLanguage is null)LEFT JOIN rawdata.metricevent ei ON e1.metriceventid = em.metriceventid AND ei.client=1
                   LEFT JOIN rawdata.metricevent ea ON e1.metriceventid = ef.metriceventid AND ea.client=2
                   LEFT JOIN rawdata.metricevent ew ON e1.metriceventid = emh.metriceventid AND ew.client=3
                   LEFT JOIN rawdata.metricevent eww ON e1.metriceventid = efh.metriceventid AND eww.client=4
                GROUP BY EXTRACT(MONTH FROM days)
                ORDER BY 1;         
$$
LANGUAGE sql STABLE;

Kind regards, Thomas

Thomas
  • 41
  • 4
  • Do you call it from PgJDBC by any chance? – Craig Ringer Oct 31 '15 at 01:42
  • Well, we are using the Postgresql JDBC Driver for Java to access the db from our data access layer. But the same behaviour occurs when we execute the queries directly from pgAdmin. So that should not be the issue in my opinion. – Thomas Nov 01 '15 at 11:09

0 Answers0