I need to create a table (PostgreSQL 9.1) and I am stuck. Could you possibly help?
The incoming data can take either of two formats:
- client id(int), shop id(int), asof(date), quantity
- client id(int), shop type(int), shop genre(varchar), asof(date), quantity
The given incoming CSV template is: {client id, shop id, shop type, shop genre, asof, quantity}
In the first case, the key is (client id, shop id, asof).
In the second case, the key is (client id, shop type, shop genre, asof).
I tried something like:
create table(
client_id int references...,
shop_id int references...,
shop_type int references...,
shop_genre varchar(30),
asof date,
quantity real,
primary key( client_id, shop_id, shop_type, shop_genre, asof )
);
But then I ran into a problem. When the data is in format 1, the inserts fail because shop_type and shop_genre are null, and primary key columns cannot be null.
The queries within a client can be either by shop id, or by a combination of shop type and genre. There are no use cases of partial or regex matches on genre.
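To make the two access patterns concrete, they would look roughly like the queries below. This is only a sketch: the table name client_shop_quantity and all literal values are hypothetical placeholders, since the table in the attempt above is unnamed.

```sql
-- Hypothetical table name and literal values, for illustration only.

-- Pattern 1: within a client, look up by shop id.
SELECT asof, quantity
FROM client_shop_quantity
WHERE client_id = 42
  AND shop_id = 7;

-- Pattern 2: within a client, look up by shop type and genre.
-- Genre is always matched exactly (no LIKE or regex).
SELECT asof, quantity
FROM client_shop_quantity
WHERE client_id = 42
  AND shop_type = 3
  AND shop_genre = 'grocery';
```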
What would be a suitable design? Must I split this into two tables and then take a union of the search results? Or is it customary to put 0s and blanks in place of missing values and move along?
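For comparison, a third possibility that needs neither placeholder values nor a union might be a single table with nullable columns, a CHECK constraint, and two partial unique indexes in place of the primary key. The following is only a sketch under assumptions: the table and index names are hypothetical, the foreign keys are left out because the referenced tables are not shown here, and it assumes a format 1 row never carries a shop type or genre.

```sql
-- Hypothetical names throughout; foreign keys omitted on purpose.
CREATE TABLE client_shop_quantity (
    client_id  int  NOT NULL,
    shop_id    int,            -- populated only for format 1 rows
    shop_type  int,            -- populated only for format 2 rows
    shop_genre varchar(30),    -- populated only for format 2 rows
    asof       date NOT NULL,
    quantity   real,
    -- Assumption: every row is exactly one of the two formats.
    CHECK (
        (shop_id IS NOT NULL AND shop_type IS NULL AND shop_genre IS NULL)
        OR
        (shop_id IS NULL AND shop_type IS NOT NULL AND shop_genre IS NOT NULL)
    )
);

-- Uniqueness for format 1: (client_id, shop_id, asof)
CREATE UNIQUE INDEX client_shop_quantity_by_shop
    ON client_shop_quantity (client_id, shop_id, asof)
    WHERE shop_id IS NOT NULL;

-- Uniqueness for format 2: (client_id, shop_type, shop_genre, asof)
CREATE UNIQUE INDEX client_shop_quantity_by_type_genre
    ON client_shop_quantity (client_id, shop_type, shop_genre, asof)
    WHERE shop_id IS NULL;
```

Each partial index covers only the rows of its own format, so it would enforce that format's key and could also serve the matching lookup. Whether this holds up at the expected row count is something I cannot judge.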
If it matters, the table is expected to be 100-500 million rows once all historic data is loaded.
Thanks.