0

I have a table with yearly values for 200+ countries. For a graphical representation, I'd like to get the percentage change between two specific years, 1990 and 2013.

The table looks a bit like this:

  id_country       year       value
    886            2002      161.348
    886            2003      161.348
    886            2004      176.016
    886            2005      176.016
    886            2006      179.683
    886            2007      183.35
    886            2008      201.685
    886            2009      227.354
    886            2010      234.688
    886            2011      245.689
    886            2012      293.36
    886            2013      440.04
    620            1990      40.337
    620            1991      1056.096
    620            1992      1151.438
    620            1993      1389.793
    620            1994      1584.144
    620            1995      1631.815
    620            1996      1749.159
    620            1997      1796.83
    620            1998      1906.84
    620            1999      1664.818
    620            2000      1642.816
    620            2001      2016.85
    620            2002      1760.16
    620            2003      1873.837
    620            2004      1961.845
    620            2005      2310.21
    620            2006      2328.545
    620            2007      2361.548
    620            2008      3329.636
    620            2009      3069.279
    620            2010      3098.615
    620            2011      2823.59
    620            2012      3373.64
    620            2013      2948.268

I thought the best way would be to produce a VIEW with the id_country, which calculates that difference. But I have no clue how that query would look like. It must SELECT all countries, and then divide year = 2013 by year = 1990 for each of these countries.

It could get more complicated as there are multiple variables in that table (represented by additional columns), which would be needed to be filtered by those additional column values, like id_source = 1 or id_source = 2, or id_sector = 1 or id_sector = 2.

Any help is very much appreciated!

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
luftikus143
  • 1,285
  • 3
  • 27
  • 52

1 Answers1

1

One way, probably fastest:

CREATE VIEW pct_2013_1990 AS
SELECT id_country
     ,       (sum(value) FILTER (WHERE year = 2013) * 100)
     / NULLIF(sum(value) FILTER (WHERE year = 1990), 0) AS pct
FROM   tbl
WHERE  year IN (1990, 2013)
AND    id_source = 1 -- ??
GROUP  BY id_country
-- ORDER BY ???

This assumes you have a value > 0 for every country in year 1990, else you get a division by zero. I defend against that with NULLIF in the example. The result is NULL in this case.

pct is the percentage for the 2013 value as compared to 1990. To get the percentage change, you would subtract 100 from it. Not sure what you need exactly.

You might use round() to reduce fractional digits.

The aggregate FILTER clause was introduced with Postgres 9.4:

In older versions you can substitute with CASE expressions.

You could use a set-returning function instead and parameterize the years to make it work for any set of years.

CREATE FUNCTION f_pct_calc(year1 integer, year2 integer)
  RETURNS TABLE(id_country int, pct numeric) AS
$func$ 
    SELECT t.id_country
         ,       (sum(t.value) FILTER (WHERE year = $2) * 100)
         / NULLIF(sum(t.value) FILTER (WHERE year = $1), 0) AS pct
    FROM   tbl t
    WHERE  t.year IN ($1, $2)
    AND    t.id_source = 1 -- ??
    GROUP  BY t.id_country
    -- ORDER BY ???
$func$ LANGUAGE sql STABLE;

Call:

SELECT * FROM f_pct_calc(1990, 2013);
Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228