
I am migrating a MySQL database to MS SQL Server.
I have a MySQL column with datatype `varchar(8192)`.
SQL Server's `varchar(8000)` seems like the best choice in terms of performance, but there could be data between 8000 and 8192 characters long.
I am curious what the most performance-efficient SQL Server datatype would be for this column.
Is my only option to check whether I actually have data between 8000 and 8192 characters, and how often it is looked up?
If it's just a few rows, then I could use `varchar(max)`; but if there are a lot, what should I do?
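To get those numbers, a simple count on the MySQL side before migrating would tell me how many rows are actually affected (a sketch only; `my_table` and `my_col` are placeholder names):

```sql
-- Run against the source MySQL database before migrating.
-- `my_table` / `my_col` are placeholders for the real names.
SELECT
    COUNT(*)                        AS total_rows,
    SUM(CHAR_LENGTH(my_col) > 8000) AS rows_over_8000,
    MAX(CHAR_LENGTH(my_col))        AS longest_value
FROM my_table;
```

`CHAR_LENGTH` counts characters rather than bytes, which matters if the column holds multi-byte data.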

AFAIK, if I have a table where a `varchar` value is under 8000 bytes it is stored in-row, and any value over 8000 bytes is stored as a LOB?
Will SQL Server use different search strategies, e.g. first scan all rows where the size is under 8000 and then search the separate LOB storage for the rows whose data is over 8000?
If that's the case, suppose I have 1000 rows under 8000 bytes and 1 row over 8000 bytes that happens to be first in the table: wouldn't that be a bigger performance hit, because it would have to go through 1001 rows? Indexing should help; is it advised?
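For reference, here is a sketch of what the SQL Server side might look like (placeholder names again). By default, `varchar(max)` values that fit in the row (roughly up to 8000 bytes) are stored in-row, and only larger values spill to LOB pages; the `sp_tableoption` setting shown can force all of them out of row if that trade-off is preferred:

```sql
-- Placeholder target table; varchar(max) keeps small values in-row
-- and moves values too large for the row to LOB pages.
CREATE TABLE dbo.my_table (
    id     INT IDENTITY(1, 1) PRIMARY KEY,
    my_col VARCHAR(MAX) NOT NULL
);

-- Optionally store all varchar(max) values out of row:
EXEC sp_tableoption 'dbo.my_table', 'large value types out of row', 1;
```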

asked by kwm
  • What's wrong with `VARCHAR(max)`? – tadman Jan 31 '17 at 08:41
  • Possible duplicate of [SQL performance: Is there any performance hit using NVarchar(MAX) instead of NVarChar(200)](http://stackoverflow.com/questions/4378795/sql-performance-is-there-any-performance-hit-using-nvarcharmax-instead-of-nva) – e4c5 Jan 31 '17 at 09:17
  • 1
    @tadman Afaik, (n) is saved to a different location than (max) is. Found this article which kind of gives good clarification on what to expect. https://www.simple-talk.com/sql/database-administration/whats-the-point-of-using-varcharn-anymore/ – kwm Jan 31 '17 at 11:47
  • @e4c5 Thanks, did not see that one, but it kind of reaffirms what's in the link at simple-talk.com – kwm Jan 31 '17 at 11:49
  • If today you have between 8000 and 8192, tomorrow you will have more than 8192. Go with CLOB. – Rick James Jan 31 '17 at 18:47
  • @kwm It is odd that SQL server allows ranges from 1 to 8000, but beyond that it's billions. So much for control! It implies that `max` is stored like some kind of irregular blob, though, but if you've got >8000 chars of data to store you're stuck with it. – tadman Jan 31 '17 at 23:20
  • @Rick Data length will never exceed 8192; it will mostly be under 8000, but there are those cases where it is between 8000 and 8192. I decided to go with `varchar(MAX)`. – kwm Feb 01 '17 at 07:00
  • @tadman Yes, agreed, < 8000 is stored differently from > 8000, but `max` is way over and beyond 8000. I will use `varchar(max)`. I did wonder why and found this article which explains it better: http://www.teratrax.com/varchar-max-sql-server/ Not sure if it applies to later SQL versions, but I think it does, as I am using Microsoft SQL Server 2016 (RTM) - 13.0.1601.5 (X64). – kwm Feb 01 '17 at 07:07
  • I think `MAX` is fine: if the data is greater than 8000 it will store a pointer to the full data, so my searches shouldn't be any slower unless the row I need holds a pointer and the full data then has to be fetched. I have so few records between 8000 and 8192 that in my case I think it's perfectly fine. – kwm Feb 01 '17 at 07:22
  • If you have a solution you can confirm as working you may want to add an answer of your own here to show how you fixed it. – tadman Feb 01 '17 at 08:08

0 Answers