We have a database that stores large amounts of measurement data in the following way:
- There's a table called "Instrument", where every instrument has a primary key of three columns [CustomerCode | LocationCode | InstrumentCode]. Those columns are of type VARCHAR(4), so a complete primary key looks like ABC | L001 | S001. Those codes are meaningful to us, so we can't simply swap them for integers. Other relations are defined on those columns as well, but they are out of scope for this question. This table holds about 200,000 rows, each representing a different measurement data point.
- There's a table called "InstrumentLoggings", where every InstrumentLogging has a primary key of four columns [CustomerCode | LocationCode | InstrumentCode | Timestamp]. The Timestamp column is of type DateTime. A foreign key relation is defined on the first three columns to the "Instrument" table. Then there's a fifth column of type VARCHAR(25) containing the value for that timestamp. This table holds about 5 billion records (is that outrageous, or not bad at all?).
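To make the setup concrete, here is a minimal sketch of the current schema as described above, expressed through Python's built-in sqlite3 module purely for illustration (the actual engine and exact type names are not stated in the question; the sample row uses the ABC | L001 | S001 key from above):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Instrument (
    CustomerCode   VARCHAR(4) NOT NULL,
    LocationCode   VARCHAR(4) NOT NULL,
    InstrumentCode VARCHAR(4) NOT NULL,
    PRIMARY KEY (CustomerCode, LocationCode, InstrumentCode)
);

CREATE TABLE InstrumentLoggings (
    CustomerCode   VARCHAR(4) NOT NULL,
    LocationCode   VARCHAR(4) NOT NULL,
    InstrumentCode VARCHAR(4) NOT NULL,
    Timestamp      DATETIME   NOT NULL,
    Value          VARCHAR(25),
    -- four-column primary key repeated in every logging row
    PRIMARY KEY (CustomerCode, LocationCode, InstrumentCode, Timestamp),
    FOREIGN KEY (CustomerCode, LocationCode, InstrumentCode)
        REFERENCES Instrument (CustomerCode, LocationCode, InstrumentCode)
);
""")

con.execute("INSERT INTO Instrument VALUES ('ABC', 'L001', 'S001')")
con.execute(
    "INSERT INTO InstrumentLoggings VALUES ('ABC', 'L001', 'S001', '2024-01-01 00:00:00', '42.7')"
)
print(con.execute("SELECT COUNT(*) FROM InstrumentLoggings").fetchone()[0])
```

The point to notice is that every one of the ~5 billion logging rows repeats the three VARCHAR(4) codes that already identify the instrument once in the Instrument table.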
Here's a short diagram of the current situation:
Our issue is that the InstrumentLoggings table has grown to over 200 GB, and performance is starting to degrade. Backing up and restoring is also becoming too time-consuming. We are looking for ways to collapse all those primary key columns into a single column in the InstrumentLoggings table.
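As a rough sense of what is at stake, here is a back-of-envelope estimate of the key-width savings. The byte sizes are assumptions (4 bytes per ASCII VARCHAR(4) code, a 4-byte integer surrogate, and no per-column or index overhead); real on-disk sizes depend entirely on the engine and its index layout:

```python
# Hypothetical per-row sizes; real storage varies by engine.
rows = 5_000_000_000
composite_key_bytes = 3 * 4   # CustomerCode + LocationCode + InstrumentCode
surrogate_key_bytes = 4       # a single integer InstrumentId
saved_per_row = composite_key_bytes - surrogate_key_bytes

saved_gb = rows * saved_per_row / 1024**3
print(f"~{saved_gb:.0f} GB saved in the base table alone")  # → ~37 GB
```

The same per-row saving is paid again in every secondary index that includes the key, so the real win is usually larger than the base-table figure.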
Can I simply put an extra column InstrumentId on the Instrument table, and create an InstrumentLoggings table with only three columns [InstrumentId | Timestamp | Value], where the primary key consists of the InstrumentId and Timestamp columns? Or is it better for performance to add an extra column InstrumentLoggingId on top of that idea?
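For clarity, this is what I mean by the first alternative, again sketched through sqlite3 for illustration only (the UNIQUE constraint is my assumption for how the meaningful codes would stay enforced after adding the surrogate key):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Instrument (
    InstrumentId   INTEGER PRIMARY KEY,   -- new surrogate key
    CustomerCode   VARCHAR(4) NOT NULL,
    LocationCode   VARCHAR(4) NOT NULL,
    InstrumentCode VARCHAR(4) NOT NULL,
    -- the meaningful natural key stays unique, just no longer the PK
    UNIQUE (CustomerCode, LocationCode, InstrumentCode)
);

CREATE TABLE InstrumentLoggings (
    InstrumentId INTEGER  NOT NULL REFERENCES Instrument (InstrumentId),
    Timestamp    DATETIME NOT NULL,
    Value        VARCHAR(25),
    PRIMARY KEY (InstrumentId, Timestamp)
);
""")

con.execute(
    "INSERT INTO Instrument (CustomerCode, LocationCode, InstrumentCode) "
    "VALUES ('ABC', 'L001', 'S001')"
)
iid = con.execute(
    "SELECT InstrumentId FROM Instrument WHERE CustomerCode = 'ABC'"
).fetchone()[0]
con.execute(
    "INSERT INTO InstrumentLoggings VALUES (?, '2024-01-01 00:00:00', '42.7')",
    (iid,),
)
print(iid)  # → 1
```

The second alternative would add one more column, InstrumentLoggingId, as the sole primary key of InstrumentLoggings, with (InstrumentId, Timestamp) demoted to a unique index.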
In the following image you can see the logging table as it is now, along with the two alternatives. I am very curious about your thoughts, and whether there are any alternatives I am not seeing right now...