Working on a piece of software for thermodynamic simulation (in C++), I need to store fluid properties at different temperatures. These data points are used to build a simple interpolation function, in order to evaluate the values of the properties also in temperatures for which we don't have full experimental data.
A fluid is simply identified by its name and concentration (%) (the latter often irrelevant). If you're curious, four properties are of interest: mass density, dynamic viscosity, specific heat capacity and thermal conductivity. For any other intent and purpose, these are just 4 numeric properties, call them A, B, C an D.
So the function for a specific fluid looks something like this: (A,B,C,D) = f(T)
, where T is the temperature.
Currently, it is a SQLite3 database and the table of fluids looks like this:
+----+-------+---------------+
| id | name | concentration |
+====+=======+===============+
| 1 | Water | 100 |
+----+-------+---------------+
| 2 | ..... | ... |
And there is the properties table:
+----------+-------------+---------+-----------+--------------+----------+
| fluid_id | temperature | density | viscosity | conductivity | capacity |
+==========+=============+=========+===========+==============+==========+
| 2 | 373.15 | 1045.48 | 0.412 | 1.415 | 0.845 |
| 3 | 273.15 | 1105.0 | 2.113 | 0.4688 | 0.849 |
| 3 | 283.15 | | 1.678 | 0.4859 | 0.8503 |
| 3 | 293.15 | 1098.0 | 1.353 | 0.5015 | 0.5833 |
| 3 | 303.15 | | 1.08 | 0.5164 | |
| 3 | 313.15 | 1090.0 | 0.893 | 0.532 | 0.8561 |
| 3 | 323.15 | | 0.748 | 0.5432 | |
| 3 | 333.15 | 1080.0 | 0.644 | 0.5543 | 0.8577 |
| 3 | 343.15 | | 0.563 | 0.564 | |
| 3 | 353.15 | 1068.0 | 0.499 | 0.5722 | 0.8612 |
| 3 | 363.15 | | 0.44 | 0.5796 | |
| 3 | 373.15 | 1054.0 | 0.39 | 0.5856 | |
+----------+-------------+---------+-----------+--------------+----------+
Inserting data manually for testing was OK. This would also be an intuitive display for a Fluid Editor GUI later on.
In code, however, the interpolation is done separately for each property. Moreover, since I can't use the NULL values, not all temperatures (rows) are relevant for all properties. To adapt things to the code's point of view, I created four identical views - one for each property. For example:
+----+-----------+---------------+-------------+-------+
| id | name | concentration | temperature | value |
+====+===========+===============+=============+=======+
| 2 | Sea Water | 22 | 373.15 | 0.412 |
| 3 | Sea Water | 14 | 273.15 | 2.113 |
| 3 | Sea Water | 14 | 283.15 | 1.678 |
| 3 | Sea Water | 14 | 293.15 | 1.353 |
| 3 | Sea Water | 14 | 303.15 | 1.08 |
| 3 | Sea Water | 14 | 313.15 | 0.893 |
| 3 | Sea Water | 14 | 323.15 | 0.748 |
| 3 | Sea Water | 14 | 333.15 | 0.644 |
| 3 | Sea Water | 14 | 343.15 | 0.563 |
| 3 | Sea Water | 14 | 353.15 | 0.499 |
| 3 | Sea Water | 14 | 363.15 | 0.44 |
| 3 | Sea Water | 14 | 373.15 | 0.39 |
+----+-----------+---------------+-------------+-------+
Now as I'm gradually moving over from prototyping to building the proper software, I'm trying to think how either of these approaches would fit into an ORM perspective. Would it be a model for each property (like my views) or a single model for all properties (like the table currently in use). A third alternative might be to leave the database as it is and build models on top of views (instead of real tables), but this isn't the way of ORMs.
I even considered moving this dataset to a NoSQL solution (e.g. MongoDb), but I couldn't think of a way to overcome the double-perspective issue.
I admit that there is neither runtime nor space performance issue here, and that the volume of data to be stored and processed is negligible anyway. There might be as low as two queries in a hour, each loading the dataset for a specific fluid into a application's memory and working with it there (interpolation and calculations based on evaluations). So I'd accept it if you think I'm over-stressing about this.
Otherwise, I'd like to hear your thoughts and consider any different approach you might offer. Am I missing something? What about the redundancy of keys (fluid and temperature) that splitting the table would incur? Also, this might be of interest to others who do have set constraints.