I have a table in a Postgres database with monthly columns from 2012 to the end of 2018:
create table sales_data (
part_number text not null,
customer text not null,
qty_2012_01 numeric,
qty_2012_02 numeric,
qty_2012_03 numeric,
...
qty_2018_10 numeric,
qty_2018_11 numeric,
qty_2018_12 numeric,
constraint sales_data_pk primary key (part_number, customer)
);
The data is populated from a large function that pulls data from an extremely wide variety of sources. It involves many left joins -- for example, in combining history with future data, where a single item may have history but not future demand or vice versa. Or, certain customers may not have data as far back or forward as we want.
The problem I'm running into is that, due to the left joins (and the nature of the data I'm pulling), a significant number of the values I am pulling are null. I would like any null to simply be zero to simplify any queries against this table -- specifically arithmetic across the monthly columns, where 1 + null + 2 = null.
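To illustrate what I mean, summing across a few of the monthly columns wipes out the whole total whenever any one of them is null:

```sql
-- Any null in a cross-column expression nulls the entire result:
select part_number,
       customer,
       qty_2012_01 + qty_2012_02 + qty_2012_03 as q1_total
from sales_data;
-- rows where any of the three months is null come back with q1_total = null
```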
I could modify the function and add hundreds of coalesce calls. However, I was hoping there was another way around this, even if it means modifying the values after the fact. That said, this would mean adding 84 update statements at the end of the function:
update sales_data set qty_2012_01 = 0 where qty_2012_01 is null;
update sales_data set qty_2012_02 = 0 where qty_2012_02 is null;
update sales_data set qty_2012_03 = 0 where qty_2012_03 is null;
... 78 more like this...
update sales_data set qty_2018_10 = 0 where qty_2018_10 is null;
update sales_data set qty_2018_11 = 0 where qty_2018_11 is null;
update sales_data set qty_2018_12 = 0 where qty_2018_12 is null;
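For comparison, the coalesce route I'd rather avoid would mean wrapping every column in every select, something like the following (the `history`/`future` table names here are just stand-ins for illustration -- the real function joins many more sources):

```sql
-- Every monthly column needs its own coalesce on every joined source:
select h.part_number,
       h.customer,
       coalesce(h.qty_2012_01, 0) + coalesce(f.qty_2012_01, 0) as qty_2012_01,
       coalesce(h.qty_2012_02, 0) + coalesce(f.qty_2012_02, 0) as qty_2012_02
       -- ...82 more columns like this...
from history h
left join future f using (part_number, customer);
```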
I'm missing something, right? Is there an easier way?
I was hoping a default setting on the column would force a zero, but it doesn't fire when the function explicitly inserts a null. Likewise, if I make the column non-nullable, the insert just pukes -- I was hoping that might force the default to kick in instead.
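This is the behavior I ran into (the sample values are made up):

```sql
alter table sales_data alter column qty_2012_01 set default 0;

-- The default applies only when the column is omitted from the insert:
insert into sales_data (part_number, customer)
values ('P1', 'ACME');                 -- qty_2012_01 becomes 0

-- An explicit null overrides the default:
insert into sales_data (part_number, customer, qty_2012_01)
values ('P2', 'ACME', null);           -- qty_2012_01 stays null
```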
By the way, the insert-then-update strategy is one I chastise others for, so I understand this is less than ideal. This function is a bit of a beast, and it does require some occasional maintenance (long story). My primary goal is to keep the function as readable and maintainable as possible -- NOT to make the function uber-efficient. The table itself is not huge -- less than a million records after all is said and done -- and we run the function to populate it once or twice a month.