For the project I'm working on, we need each customer's data in a database close to that customer. For this reason we have adopted Microsoft's new Elastic Scale solution: it takes away the complexity of sharding while still giving you the power to scale worldwide.
At the moment I'm facing a rather important problem: I need to migrate data from one shard to another. There is a sample application (Merge/Split) that does something along those lines, but it works with ranges (1..100, 101..400, etc.). The database I'm working on uses Guids as keys, so we can't use the sample code.
I've therefore created a Move/Merge management tool myself, but I'm running into an issue.
At first I wanted to insert all the objects and their dependencies through the ORM, but some circular key references make that hard to do, so I'm now generating a SQL script instead.
The SQL script is about 130 MB and contains only INSERT statements.
All of this has to happen within a single transaction, because you don't want the migration to end up half done: if there's an error, everything should be rolled back.
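For reference, the generated script is structured roughly like this (table, column and key values are illustrative, not the real schema):

    -- Abort and roll back the whole batch on any error.
    SET XACT_ABORT ON;

    BEGIN TRANSACTION;

    -- ~130 MB of generated statements like these:
    INSERT INTO dbo.Customers (CustomerId, Name)
    VALUES ('3f2504e0-4f89-11d3-9a0c-0305e82c3301', N'Contoso');

    INSERT INTO dbo.Orders (OrderId, CustomerId, Total)
    VALUES ('9c858901-8a57-4791-81fe-4c455b099bc9',
            '3f2504e0-4f89-11d3-9a0c-0305e82c3301', 42.50);

    -- ... hundreds of thousands more INSERTs ...

    COMMIT TRANSACTION;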
Running this 130 MB script gives me errors: both on my local development machine and on SQL Azure I'm running out of memory. On SQL Azure:
There is insufficient memory available in the buffer pool
and locally:
There is insufficient system memory in resource pool 'default' to run this query.
I've tried disabling the indexes, so they don't get updated on every INSERT, but that didn't fix anything.
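For completeness, that attempt looked roughly like this (index names are illustrative; the clustered indexes have to stay enabled, otherwise the tables can't be written to at all):

    -- Stop the nonclustered indexes from being maintained on every INSERT.
    ALTER INDEX IX_Orders_CustomerId ON dbo.Orders DISABLE;
    ALTER INDEX IX_Customers_Name ON dbo.Customers DISABLE;

    -- ... run the INSERT script ...

    -- Rebuild them once all the data is in.
    ALTER INDEX IX_Orders_CustomerId ON dbo.Orders REBUILD;
    ALTER INDEX IX_Customers_Name ON dbo.Customers REBUILD;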
Any suggestions on how to proceed? I can't really split the script into separate runs, because all of the data has to be inserted in one go. An SSIS package isn't an option either, I think.
Creating my own transaction/rollback system on top of the database looks like a lot of overkill and is error prone.
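(By "my own transaction system" I mean something along these lines, purely as a sketch with made-up table and column names: tag every migrated row with a batch id and run a compensating clean-up if anything fails halfway.)

    -- Every generated INSERT would carry an extra batch id column:
    INSERT INTO dbo.Orders (OrderId, CustomerId, Total, MigrationBatchId)
    VALUES ('9c858901-8a57-4791-81fe-4c455b099bc9',
            '3f2504e0-4f89-11d3-9a0c-0305e82c3301', 42.50,
            'b7a6c2d1-0000-0000-0000-000000000001');

    -- On failure, a clean-up script removes the half-migrated rows:
    DELETE FROM dbo.Orders
    WHERE MigrationBatchId = 'b7a6c2d1-0000-0000-0000-000000000001';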
Besides the INSERT script, I also need to execute a DELETE script on the 'old' shard/database, so I guess the solution has to work for that script as well. I would love to run the INSERT and DELETE scripts in one transaction, but that's not (yet) possible on SQL Azure (no distributed transactions).
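To make that last point concrete: without distributed transactions, the best I can do today is two completely independent transactions, roughly like this, which leaves a window where the data exists on both shards (or stays duplicated if the delete fails):

    -- On the NEW shard (connection 1):
    SET XACT_ABORT ON;
    BEGIN TRANSACTION;
    -- ... full INSERT script ...
    COMMIT TRANSACTION;

    -- On the OLD shard (connection 2, a separate transaction):
    SET XACT_ABORT ON;
    BEGIN TRANSACTION;
    -- ... full DELETE script ...
    COMMIT TRANSACTION;

    -- If the DELETE transaction fails after the INSERT one has committed,
    -- the data exists on both shards and has to be cleaned up manually.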