This is similar to this post, but different enough that I'm asking again. The proposed solutions would also be impractically slow in my case.
Note: I am also working with really large datasets in Redshift.
I am trying to calculate "scores" for a statistical model, which requires that I calculate a dot product for every row in a data set. In particular, I would like to create a new column in my table "SCORE" that multiplies the values for each column in the first table by a pre-specified value for that column.
For example, I have the following tables:
INITIAL
+--------+------+-------+
| NAME | COL0 | COL1 |
+--------+------+-------+
| AL | 1 | 4 |
| BILL | 2 | 5 |
| CATHY | 3 | 6 |
+--------+------+-------+
ESTIMATES
+--------+----------+
| NAME | ESTIMATE |
+--------+----------+
| COL0 | 5 |
| COL1 | 10 |
+--------+----------+
and I want a final table,
FINAL
+--------+------+-------+-------+
| NAME | COL0 | COL1 | SCORE |
+--------+------+-------+-------+
| AL | 1 | 4 | 45 |
| BILL | 2 | 5 | 60 |
| CATHY | 3 | 6 | 75 |
+--------+------+-------+-------+
For the final table, the values for each column in the INITIAL table is multiplied by a different number depending on the values listed in the ESTIMATES table For example, the SCORE for AL is derived from 1 * 5 + 4 * 10 = 45. Where the 5 and 10 come from the estimates table.
Currently, I am creating the SCORE column by writing SQL code that manually writes in the numbers from the estimates.