2

I have a database that will be used by multiple clients (local installs), the plan is to then copy the data to Azure to allow global reporting etc.

The database will be using GUIDs for it's primary keys.

What should I use for a clustered index on the tables, or does that not matter when adding data to Azure? Do I even really need a clustered index? Azure will have single copy of the database, with all client data in it if that makes a difference.

thanks all.

astaykov
  • 30,768
  • 3
  • 70
  • 86
Jay
  • 2,077
  • 5
  • 24
  • 39
  • thanks for the answers so far. Does anyone know what happens if there is a collision in the clustered index? Assuming I use an int Identity column as the index, collisions when data is uploaded to Azure are almost guaranteed. – Jay Sep 19 '12 at 09:09
  • As long as the clustered index is not also the primary then you shouldn't have any problems. – knightpfhor Sep 19 '12 at 09:33
  • A GUID index is going to fragment rapidly. For that reason I would not use GUID as a clustered index. Less overhead to fragment of a clustered versus non clustered. Consider Identity as PK clustered. And maybe not even index GUID. Yes an index will help select but it will be an expensive index to build. If you don't like Identity at least go with NEWSEQUENTIALID(). – paparazzo Sep 19 '12 at 21:33
  • Agreed, there are lots of problems with using GUID, but all of the problems caused by them are much easier to solve than reliably and efficiently creating sequential IDs in distributed databases. – knightpfhor Sep 23 '12 at 20:36
  • SQL Azure does NOT allow for NEWSEQUENTIALID(). I've seen many recommendations for using a datetime field, but I'm still in the air about that, as it seems rather pointless. – Mike Taber Oct 05 '12 at 02:19

2 Answers2

1

Although you are allowed to create (and have data in) a table without clustered index in SQL Server, this is not allowed in Windows Azure SQL Database ( WASD / SQL Azure). While you can have the table without clustered index in WASD as definition, no DML statement will be allowed to execute against such a table, i.e. you will not be able to do INSTERT/UPDATE/DELETE statements against table in WASD without clustered index. So, if by any chance data is going to the cloud, you should have a clustered index. For more info, check the Clustered Index Requirement in the Guildelines and limitations for Windows Azure SQL Database.

astaykov
  • 30,768
  • 3
  • 70
  • 86
1

Some of the recommendations here are incorrect.

  1. NEWSEQUENTIALID() is not allowed on SQL Azure.
  2. In SQL Azure, a clustered index is absolutely required. You can create a table without one, but you will not be able to add data to it until after you add the clustered index.

In Azure, the clustered index is used for their back-end replication. Reference: http://blogs.msdn.com/b/sqlazure/archive/2010/05/12/10011257.aspx

I think that your best bet is to use a column with an Identity element as the clustered index, along with a non-clustered index on your guid column. I ran into this exact same problem and after quite a bit of research, it's the solution I eventually came up with. It's a bit of a pain to put together, especially if you already have data in production on Azure, but it seems to be the one that addresses all of the issues.

I feel like it would be simplest to use a NEWSEQUENTIALID, but that isn't an option on Azure.

Mike Taber
  • 833
  • 6
  • 21