I have a project in postgresql to create a DWH from multiple schemas. The schemas have the same structure(tables, constrains). I am stuck on this problem : I have a table "employees" in all the schemas that i will use in DWH. The first entry in schema1 table employees is 1 John Dow, the first entry in schema2 table employees is 1 Mary Jane (the id column is auto increment). Is there a way to create a DWH with this structure ? Is there a way to partition the schemas data ? Thank you.
Asked
Active
Viewed 191 times
1
-
is there any reason why you can't simply combine all this employee data into one table, add a field that captures the name of the source schema (if you need it), and then assign a new Id to each record? – RADO Jun 05 '17 at 03:51
2 Answers
0
Recompute the primary key as
<id> * <# of schemas> + <schema number>
For example, if you have three schemas, number 7 from the first schema would become 7 * 3 + 1 = 22, and number 5 from the third schema would become 5 * 3 + 3 = 18.
That way you can easily translate all numbers in a collision-free fashion.

Laurenz Albe
- 209,280
- 17
- 206
- 263
-
Thank you for your response. My problem is that my schema number is dynamic. It's for an app and every time a client register's an account with us, the app generates a schema for that client. – k4br4s Jun 08 '17 at 07:35
0
The easiest way will probably be to assign a number of the most significant bits of the key to become a source ID, thereby making your keys unique across all of your source databases. E.g. Let's assume you will never have more than 100 databases that you're warehousing, so a 7 bit database identifier will be sufficient.
- Ensure all tables in your database that have sequential integer primary keys, or foreign keys referencing those, use the same data type. This data type must be large enough to hold values for all rows on any table without touching the most significant 8 bits. So
INTEGER
(32 bit) keys would leave 24 bits remaining allowing up to 16777215 insertions, andBIGINT
(64 bit) keys will leave 56 bits remaining, allowing up to 72057594037927935 insertions. (Beware thatALTER TABLE ... ALTER COLUMN SET DATA TYPE
requires a full table rewrite and can be very slow for high volume tables.) - Assign each DB a unique, 7 bit database identifier that will occupy the most significant 8 bits of each key. You only get 7 bits, not 8, because the top bit is the signing bit, and you can't touch that.
- For every integer primary key and foreign key referencing it in the database, update it, adding
DB_ID << 24
(for 32 bit keys) orDB_ID << 56
(for 64 bit keys) to it. E.g. For a DB with ID 13 and 32 bit keys, you would add 218103808 to every key value. - Optionally, add
CHECK CONSTRAINT
s to each table, ensuring their key is in the range[(DB_ID << 24) + 1, ((DB_ID+1) << 24) - 1]
. - Restart each table's PK sequence using:
ALTER SEQUENCE :seq_name MINVALUE :min_val MAXVALUE :max_val START WITH :start_val RESTART WITH :start_val
, wheremin_val = (DB_ID << 24) + 1
,max_val = ((DB_ID+1) << 24) - 1
, andstart_val = currval(SEQUENCE_NAME)
.

David Scarlett
- 3,171
- 2
- 12
- 28
-
Thank you for your response. I'm on my way to get this done. Thank you once again. Have a great day ! – k4br4s Jun 08 '17 at 07:32