1

I'm using Postgres 9.3 and have the following four tables to have maximum flexibility regarding price and / or tax / taxe rate changes in the future (see below for more details):

CREATE TABLE main.products
(
  id serial NOT NULL,
  "productName" character varying(255) NOT NULL,
  "productStockAmount" real NOT NULL,
)

CREATE TABLE main."productPrices"
(
  id serial NOT NULL,
  product_id integer NOT NULL,
  "productPriceValue" real NOT NULL,
  "productPriceValidFrom" timestamp without time zone NOT NULL,
)

CREATE TABLE main."productTaxes"
(
  id serial NOT NULL,
  product_id integer NOT NULL,
  "productTaxValidFrom" timestamp without time zone NOT NULL,
  "taxRate_id" integer NOT NULL,
)

CREATE TABLE main."taxRateValues"
(
  id integer NOT NULL,
  "taxRate_id" integer NOT NULL,
  "taxRateValueValidFrom" timestamp without time zone NOT NULL,
  "taxRateValue" real,
)

I built a view based on the following query to get the currently relevant values:

SELECT p.id, p."productName", p."productStockAmount", sub."productPriceValue", CHR(64+sub3."taxRate_id") AS taxRateId, sub3."taxRateValue" FROM main."products" p 
CROSS JOIN LATERAL (SELECT * FROM main."productPrices" pp2 WHERE pp2."product_id"=p."id" AND pp2."productPriceValidFrom" <= NOW() ORDER BY pp2."productPriceValidFrom" DESC LIMIT 1) AS sub 
CROSS JOIN LATERAL (SELECT * FROM main."productTaxes" pt WHERE pt."product_id"=p."id" AND pt."productTaxValidFrom" <= NOW() ORDER BY pt."productTaxValidFrom" DESC LIMIT 1) AS sub2 
CROSS JOIN LATERAL (SELECT * FROM main."taxRateValues" trv WHERE trv."taxRate_id"=sub2."taxRate_id" AND trv."taxRateValueValidFrom" <= NOW() ORDER BY trv."taxRateValueValidFrom" DESC LIMIT 1) AS sub3 

This works fine and gives me the correct results but I assume to get performance problems if several thousand products, price changes etc. are in the database.

Is there anything I can do to simplify the statement or the overall database design? To use words to describe the needed flexibility:

  • Prices can be changed and I have to record which price is valid to which time (archival, so not only the current price is needed)
  • Applied tax rates for products can be changed (e.g. due to changes by law) - archival also needed
  • Tax rates in general can be changed (also by law, but not related to a single product but all products with this identifier)

Some examples of things that can happen:

  • Product X changes price from 100 to 200 at 2014-05-09
  • Product X changes tax rate from A to B at 2014-07-01
  • Tax rate value for tax rate A changes from 16 to 19 at 2014-09-01
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
fkerber
  • 1,032
  • 9
  • 24
  • A very nice question with table definitions and a clear description and rationale. Clear without further editing. That's how it's done. (Only your version of Postgres is missing.) – Erwin Brandstetter Jun 12 '14 at 18:53
  • Thank you for your feedback. For reasons of completeness, I added the Postgres version ;) – fkerber Jun 12 '14 at 22:48

2 Answers2

3

As long as you fetch all rows or more than a few percent of all rows, it will be substantially faster to first aggregate once per table, and then join.

I suggest DISTINCT ON to pick the latest valid row per id:

SELECT p.id, p."productName", p."productStockAmount"
      ,pp."productPriceValue"
      ,CHR(64 + tr."taxRate_id") AS "taxRateId", tr."taxRateValue"
FROM   main.products p 
LEFT   JOIN  (
   SELECT DISTINCT ON (product_id)
          product_id, "productPriceValue"
   FROM   main."productPrices"
   WHERE  "productPriceValidFrom" <= now()
   ORDER  BY product_id, "productPriceValidFrom" DESC
   ) pp ON pp.product_id = p.id
LEFT   JOIN (
   SELECT DISTINCT ON (product_id)
          product_id, "taxRate_id"
   FROM   main."productTaxes"
   WHERE  "productTaxValidFrom" <= now()
   ORDER  BY product_id, "productTaxValidFrom" DESC
   ) pt ON pt.product_id = p.id
LEFT   JOIN (
   SELECT DISTINCT ON ("taxRate_id") *
   FROM   main."taxRateValues"
   WHERE  "taxRateValueValidFrom" <= now()
   ORDER  BY "taxRate_id", "taxRateValueValidFrom" DESC
   ) tr ON tr."taxRate_id" = pt."taxRate_id";

Using LEFT JOIN to be on the safe side. Not every product might have entries in all sub-tables.

And I subscribe to what @Clodoaldo wrote about double-quoted identifiers. I never use anything but legal, lower-case names. Makes your life with Postgres easier.

Detailed explanation for DISTINCT ON:
Select first row in each GROUP BY group?

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Thank you very much - the query is apparently working exact the same way as the original one, but it would be great if you could elaborate a little further why it is doing so. Especially, how it could be assured that always the correct entry from the sub-tables is chosen. Or is it as easy as "ORDER BY is executed first and DISTINCT simply takes the first entry it sees"? – fkerber Jun 11 '14 at 21:36
  • @fkerber: I added a link to detailed explanation for `DISTINCT ON`. And yes, it's about that simple. The `DISTINCT` step is after aggregation (`GROUP BY` - not present in this query), `ORDER BY` and window functions (`OVER` clause - also not present in this query). – Erwin Brandstetter Jun 11 '14 at 21:41
0

Do not create quoted identifiers. Once you do it you are forever stuck with them and you will have to quote and remember the casing everywhere. You can use camel case whenever you want if you don't quote the identifier at creation time.

I don't understand why you need the cross lateral. I think it can be just

select
    p.id,
    p."productName",
    p."productStockAmount",
    pp2."productPriceValue",
    chr(64 + trv."taxRate_id") as "taxRateId",
    trv."taxRateValue"
from
    main."products" p
    left join (
        select *
        from main."productPrices"
        where "productPriceValidFrom" <= now()
        order by "productPriceValidFrom" desc
        limit 1
    ) pp2 on pp2."product_id" = p."id"
    left join (
        select "taxRate_id"
        from main."productTaxes"
        where "productTaxValidFrom" <= now()
        order by "productTaxValidFrom" desc
        limit 1
    ) pt on pt."product_id" = p."id"
    left join (
        select *
        from main."taxRateValues"
        where "taxRateValueValidFrom" <= now()
        order by "taxRateValueValidFrom" desc
        limit 1
    ) trv on trv."taxRate_id" = pt."taxRate_id"
Clodoaldo Neto
  • 118,695
  • 26
  • 233
  • 260
  • I'm not that familiar with Postgres, but won't the limits in the sub-queries reduce the result set before the join is applied? Rather than get the a single row for for the corresponding ID, you'd get a single row across the entire table. – Allan Jun 11 '14 at 20:18
  • @Allan: More importantly, a *single* tax rate won't cut it. We need the latest row for *each* tax rate (in use), etc. – Erwin Brandstetter Jun 11 '14 at 20:48
  • I think you are both correct. In fact, the code above does not bring the expected results. Regarding the camel case thing, the database tables have been auto-created initially by Laravel (using Eloquent ORM), so I normally don't work directly with the database. – fkerber Jun 11 '14 at 20:54
  • @Erwin Oh I missed that. That's is why the cross lateral was working for him – Clodoaldo Neto Jun 11 '14 at 21:05
  • @Allan: I see now your comment was hinting in the same direction. – Erwin Brandstetter Jun 11 '14 at 21:51