1

DB-Fiddle

CREATE TABLE sales (
    id SERIAL PRIMARY KEY,
    country VARCHAR(255),
    sales_date DATE,
    sales_volume DECIMAL,
    fix_costs DECIMAL
);

INSERT INTO sales
(country, sales_date, sales_volume, fix_costs
)
VALUES 

('DE', '2020-01-03', '500', '0'),
('NL', '2020-01-03', '320', '0'),
('FR', '2020-01-03', '350', '0'),
('None', '2020-01-31', '0', '2000'),

('DE', '2020-02-15', '0', '0'),
('NL', '2020-02-15', '0', '0'),
('FR', '2020-02-15', '0', '0'),
('None', '2020-02-29', '0', '5000'),

('DE', '2020-03-27', '180', '0'),
('NL', '2020-03-27', '670', '0'),
('FR', '2020-03-27', '970', '0'),
('None', '2020-03-31', '0', '4000');

Expected Result:

sales_date   |   country    |   sales_volume   |     fix_costs
-------------|--------------|------------------|------------------------------------------
2020-01-03   |     DE       |       500        |     37.95  (= 2000/31 = 64.5 x 0.59)
2020-01-03   |     FR       |       350        |     26.57  (= 2000/31 = 64.5 x 0.41)
2020-01-03   |     NL       |       320        |      0.00
-------------|--------------|------------------|------------------------------------------
2020-02-15   |     DE       |         0        |     86.21  (= 5000/28 = 172.4 x 0.50)  
2020-02-15   |     FR       |         0        |     86.21  (= 5000/28 = 172.4 x 0.50)  
2020-02-15   |     NL       |         0        |      0.00
-------------|--------------|------------------|------------------------------------------    
2020-03-27   |     DE       |       180        |     20.20  (= 4000/31 = 129.0 x 0.16) 
2020-03-27   |     FR       |       970        |    108.84  (= 4000/31 = 129.0 x 0.84)   
2020-03-27   |     NL       |       670        |      0.00
-------------|--------------|------------------|-------------------------------------------

With reference to the solution in this question I tried to apply the following query to get the expected result:

SELECT 
    sales_date, 
    country, 
    SUM(sales_volume),
       (CASE WHEN country = 'NL' THEN 0
             WHEN SUM(CASE WHEN country <> 'NL' THEN sales_volume END) OVER (PARTITION BY sales_date) > 0
             THEN ((f.fix_costs/ DAY(LAST_DAY(sales_date))) *
                   sales_volume / NULLIF(SUM(CASE WHEN country <> 'NL' THEN sales_volume END) OVER (PARTITION BY sales_date), 0)
                  )
             ELSE (f.fix_costs / DAY(LAST_DAY(sales_date))) * 1 / SUM(country <> 'NL') OVER (PARTITION by sales_date)
        END) AS imputed_fix_costs
FROM sales s

    CROSS JOIN
     (SELECT 
      LAST_DAY(sales_date) AS month_ld, 
      SUM(fix_costs) AS fix_costs
      FROM sales
      WHERE country = 'None'
      GROUP BY month_ld
     ) f
      ON f.month_ld = LAST_DAY(s.sales_date)

WHERE country <> 'None'
GROUP BY 1,2;

This query worked without any issue in MariaDB.
However, now I switched to Postgres 9.5 and I am getting an error on the CROSS JOIN:

 ERROR:  syntax error at or near "ON"
 LINE 22:       ON f.month_ld = LAST_DAY(s.sales_date)

How do I need to modify the query to also make it work in Postgres 9.5?

padaleiana
  • 955
  • 1
  • 14
  • 23
Michi
  • 4,663
  • 6
  • 33
  • 83
  • 2
    A cross join doesn't need a join condition, so you need to remove the `ON` part –  Feb 23 '21 at 10:30
  • 1
    Unrelated to your problem, but: Postgres 9.5 is [no longer supported](https://www.postgresql.org/support/versioning/) you should plan an upgrade as soon as possible. If you are migrating, you should start with Postgres 13 –  Feb 23 '21 at 10:31
  • Thanks a lot for the additional hint with the old version. I also tried to remove the ON part but it still does not work: https://dbfiddle.uk/?rdbms=postgres_13&fiddle=9f9ab102726b20e4437b67c24ac4d61e – Michi Feb 23 '21 at 10:39
  • Could you please explain the fix_costs logic. Where das 0.59 comes from? Why is there a 0.00 for NL? – S-Man Feb 23 '21 at 10:41
  • If you want an `on`, use `join`, not `cross join`. – Gordon Linoff Feb 23 '21 at 11:52

3 Answers3

1
  1. There are no LAST_DAY() and DAY() functions in Postgres AFAIK.
  2. CROSS JOINs don't have a join condition.

TL;DR: After re-reading your code I find that my construction is surprisingly similar to yours. So, in fact, I guess there was no need for the huge explanation, after all. I am letting it there as documentation, nonetheless.

However, I guess, the most important part for your question is how to do the join. This is descibed in step 2 which I highlighted.


It is not clear, how in the second sales_date group the last fix_costs part comes to 0.5 (Percentage would yield DIVISION BY ZERO error...)

However, this query can demonstrate the way to the solution:

step-by-step demo:db<>fiddle

SELECT
    s1.*,
-- 3:
    COALESCE(
        s2.fix_costs / date_part('day', s2.sales_date) *

-- 1:
            CASE WHEN s1.country != 'NL' THEN s1.sales_volume END  
                / NULLIF(SUM(s1.sales_volume) FILTER (WHERE s1.country != 'NL')  OVER (PARTITION BY s1.sales_date), 0)
   , 0)       

FROM sales s1

-- 2:
JOIN sales s2 ON s1.country <> 'None' AND s2.country = 'None' 
    AND date_trunc('month', s1.sales_date) = date_trunc('month', s2.sales_date)

1a) SUM(...) OVER (PARTITION BY ...) window function returns the total of the sales_volumn values from the sales_date. This result is added as a new column to the result.

1b) Adding the FILTER clause removes the country = 'NL' records from this calculation

1c) NULLIF(..., 0) is to avoid DIVISION BY ZERO when calculating the percentage between the current sales_volumn and the previously calculated total (which would happen in the second sales_date group).

1d) CASE clause removes NL values from first part of division

1e) Calculate the percentage for the second part of your fix_costs value (as I understood it)

2a) Natural join which conditions: One table without None countries and the second only the None country records.

2b) date_trunc('month', ...) normalizes a date to the first of the current month. So the join condition is: Where the months of both tables fit (with date_trunc() the compared values all are YYYY-MM-01)

3a) date_part('day', ...) extracts the day from the date. Since you always added the last day of a month, it did not need to be calculated explicitly. So, with that the first part of your fix_costs value can be calculated.

3b) Multiply the both fix_costs parts

3c) COALESCE(..., 0) Because of filtering the NL records and the DIV BY ZERO error handling there are some NULL results in the fix_costs column, which is fixed with this function.

TODO: Adding the 0.5 part for the second group.


Additionally:

To calculate the last day of the current month in Postgres you have to do following:

date_trunc('month', the_date) + interval '1 month - 1 day'

date_trunc('month', ...) calculates the first day of month, as explained above. Then you add one month to get the first day of the next month. From this you can subtract one day to get the last day of the current month.

S-Man
  • 22,521
  • 7
  • 40
  • 63
0

You probably need to use INNER JOIN, not CROSS JOIN here. As it states here, "In MariaDB, CROSS JOIN is a syntactic equivalent to INNER JOIN (they can replace each other)."

In PostgreSQL, CROSS JOIN works differently and your expected behaviour simply doesn't match it - you probably don't want to match all your records with each other. Just replace "CROSS" with "INNER" and you almost ready to go.

But this code also has a problem with a MariaDB-specific function Last_Day, which was covered here.

0

DB-Fiddle

SELECT
s.sales_date, 
s.country,
s.sales_volume,
f.fix_costs,

 (CASE WHEN country = 'NL' THEN 0
       
       /* Exclude NL from fixed_costs calculation */
       WHEN SUM(CASE WHEN country <> 'NL' THEN sales_volume ELSE 0 END) OVER (PARTITION BY sales_date) > 0
       THEN ((f.fix_costs/ extract(day FROM (date_trunc('month', sales_date + INTERVAL '1 month') - INTERVAL '1 day'))) *
              sales_volume / 
              NULLIF(SUM(s.sales_volume) FILTER (WHERE s.country != 'NL')  OVER (PARTITION BY s.sales_date), 0)
              )
              
        /* Divide fixed_cots equaly among countries in case of no sale*/      
        ELSE (f.fix_costs / extract(day FROM (date_trunc('month', sales_date + INTERVAL '1 month') - INTERVAL '1 day')))
              / SUM(CASE WHEN country <> 'NL' THEN 1 ELSE 0 END) OVER (PARTITION by sales_date)
              
        END) AS imputed_fix_costs
        
FROM sales s

JOIN

  (SELECT 
  (date_trunc('MONTH', sales_date)) AS month_ld,
  SUM(fix_costs) AS fix_costs
  FROM sales
  WHERE country = 'None'
  GROUP BY month_ld) f ON f.month_ld = date_trunc('month', s.sales_date)

WHERE country NOT IN ('None')
GROUP BY 1,2,3,4
ORDER BY 1;
Michi
  • 4,663
  • 6
  • 33
  • 83