I would like to store a large amount of data in a relational database (postgresql on RDS in my case) but I am not sure what structure to use. I followed a database course including database normalization but I am still not sure about how to structure the database.
My data consists of tabular results of a climate model, aggregated by zones (water basins).
parameter in [temperature,precipitation .. etc.]
length = 11
year in [1960 : 2014]
month in [1:12]
zone (basin) [1 : 16000]
Option 1
My first option is to create a separate table for each indicator and store the data as follows:
| ID | basin_id | temperature | unit | year | month | temporal_resolution |
|-------|----------|-------------|------|------|-------|---------------------|
| 1 | 1 | 42.1 | k | 2000 | 1 | month |
| 2 | 2 | 1.87 | k | 2000 | 1 | month |
| .. | .. | .. | .. | .. | .. | .. |
| 11001 | 1 | 40.3 | m3 | 2000 | 2 | month |
| 11002 | 2 | 2.3 | m3 | 2000 | 2 | month |
Option 2
and the second option creates a vertical table:
| ID | basin_id | indicator | value | unit | year | month | temporal_resolution |
|-------|----------|-------------|-------|------|------|-------|---------------------|
| 1 | 1 | temperature | 42.1 | k | 2000 | 1 | month |
| 2 | 2 | temperature | 1.87 | k | 2000 | 1 | month |
| .. | .. | .. | .. | .. | .. | .. | .. |
| 11001 | 1 | precipitation | 40.3 | m3 | 2000 | 2 | month |
| 11002 | 2 | precipitation | 2.3 | m3 | 2000 | 2 | month |
My question is whether splitting up the indicator name and value is recommended or should not be used. If the data is stored vertically the total number of rows will be appr. 16000*11*12*55=116,160,000 which I am not sure is very manageable.
Option 3
EDIT: since the number of indicators is limited (12 or so), a vertical table structure is not required. A third option is merging the different indicator tables into something like:
| ID | basin_id | temperature_k | precipitation_m | … | riverdischarge_m3 | year | month | temporal_resolution |
|-------|----------|---------------|-----------------|----|-------------------|------|-------|---------------------|
| 1 | 1 | 42.1 | 42.1 | … | 42.1 | 2000 | 1 | month |
| 2 | 2 | 42.1 | 42.1 | | 42.1 | 2000 | 1 | month |
| .. | .. | .. | .. | .. | .. | .. | .. | .. |
| 11001 | 1 | 42.1 | 42.1 | .. | 42.1 | 2000 | 2 | month |
| 11002 | 2 | 42.1 | 42.1 | .. | 42.1 | 2000 | 2 | month |
This results in row_count of 16000 * 55 * 12 = 10,560,000