1

I have a newsequentialid() as a GUID Column and I wanted to know if I should index the column as Clustered or Non-Clustered Index as I will be using the GUID to query on.

I will also be inserting data into the table every week.

Jon Jaussi
  • 1,298
  • 3
  • 18
  • 36
Djbril
  • 745
  • 6
  • 26
  • 48
  • If you don't already have a clustered index then might as well use it up. – paparazzo Dec 29 '13 at 15:04
  • 1
    The answer given has lots of good info. I certainly also like to ask people if they're very sure they need a GUID... are you sure? :) They have a place, but it is uncommon. If you don't need an identifier unique around the planet, you might not need a GUID. – Mike M Dec 31 '13 at 19:28

1 Answers1

0

In general, yes put a clustered index on your table. But which columns to use as your clustered index? Please see this answer. Should point you in the right direction. Especially the link to GUIDs as PRIMARY KEYs and/or the clustering key. Much depends on the nature of your GUID. To paraphrase Kimberley Tripp's article: a GUID that is not sequential can be a bad choice. Examples:

  • Generated in the client (using .NET)
  • Generated by the newid() function

As there is quite a bit more to this decision than just that, I recommend you read the full answer provided by @marc_s and that entire article by Kimberley Tripp.

As far as the weekly data insert via SSIS consider these steps:

  • Drop your target table's indexes prior to the insert
  • Load your target table with a Data Flow
  • Configure your OLE DB Destination with "Data access mode:" = "Table or view - fast load"
  • Rebuild your target table's indexes after the insert

Depending on your specific situation what this can achieve is a dramatically faster insert and leave you with a clean index whose statistics are up to date and fragmentation is zero or near zero.

I realize your question states you are using newsequentialid(). While I was reading this article on TechNet the actual sequential nature of the values produced by this function comes with a caveat: After restarting Windows, the GUID can start again from a lower range, but is still globally unique. Would this impact your downstream usage of the GUID?

I guess I am wondering why you find it necessary to use a GUID and why that is a better choice in your situation as opposed to an integer based SEQUENCE which would make your primary key clustered index significantly smaller.

Community
  • 1
  • 1
Jon Jaussi
  • 1,298
  • 3
  • 18
  • 36