3

I am trying to estimate database size for SQL Server 2008 R2. I have a table with one INTEGER primary key and 39 text columns of type VARCHAR(MAX).

I have searched and found two statements.

  • A table can contain a maximum of 8,060 bytes per row.
  • Varchar(max) has a maximum storage capacity of 2 gigabytes.

I am confused to get estimate the size. How can I store 2 gigabytes in each column if there is a limit on row?

I am not database expert may be I am not getting it correctly.

Can anyone explain How to estimate it?

Thank you

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Shoaib Ijaz
  • 5,347
  • 12
  • 56
  • 84
  • 2
    Have you seen this link http://stackoverflow.com/questions/7892334/get-size-of-all-tables-in-database I hope that may can help you. – Timoshenko Apr 23 '16 at 00:58
  • Do you want to get the current size, or are you looking to find how large it could be given a certain number of rows added? – Solomon Rutzky Apr 23 '16 at 06:13
  • The "large" datatypes like `VARCHAR(MAX)` or `VARBINARY(MAX)` are handled differently from the other datatypes. Since they can have up to 2 GB of data, they might end up spanning over a whole lot of "overflow" pages. In that case, they will take up a certain amount of storage on the "base" data page (it's documented somewhere - don't know right now), while the other columns for `int`, `datetime` etc. take up a fixed amount of storage on the base page – marc_s Apr 23 '16 at 07:28
  • @srutzky I am trying to find out how large it could be. – Shoaib Ijaz Apr 23 '16 at 07:31
  • Depends what the columns are going to store. One row could store 140 bytes while another could store 78 gigabytes depending on the data. – Sean Pearce Apr 23 '16 at 08:28

2 Answers2

3

In Microsoft SQL Server, data (which includes indexes) are stored in one or more 8k (8192 bytes) "pages". There are different types of pages that can be used to handle various situations (e.g. Data, LOB, Index, AllocationMap, etc) . Each page has a header which is meta-data about that page and what it contains.

Most data is stored in the row itself, and one or more of these rows are in turn stored in a page for "in-row data". Due to the space taken by the row header, the largest a row can be (for "in-row" data) is 8060 bytes.

However, not all data is stored in the row. For certain datatypes, the data can actually be stored on a "LOB data" page while a pointer is left in the "in-row" data:

  • Legacy / deprecated LOB types that nobody should be using anymore (TEXT, NTEXT, and IMAGE), by default, always store their data on LOB pages and always use a 16-byte pointer to that LOB page.

  • The newer LOB types (VARCHAR(MAX), NVARCHAR(MAX), VARBINARY(MAX), and XML), by default, will attempt to fit the data directly in the row if it will fit. Else it will store the data on LOB pages and use a pointer of 24 - 72 bytes (depending on the size of the LOB data).

This is how you could store up to 78 GB + 4 bytes (can't forget about the INT Primary Key ;-) in a single row: the max row size will be between 940 bytes ((39 * 24) + 4) and 2812 bytes ((39 * 72) + 4). But again, that is just the maximum range; if the data in each of the 39 VARCHAR(MAX) fields is just 10 bytes, then all of the data will be stored in-row and the row size will be 394 bytes ((39 * 10) + 4).

Given that you have so many variable-length fields (whether they are MAX or not), the only way to estimate the size of future rows is to have a good idea about what data you will be storing in this table. Although, a table with all, or even mostly, MAX datatypes implies that nobody really has any idea what is going to be stored in this table.

Along those lines, it should be pointed out that this is a horribly modeled table / horrible use of MAX datatype fields, and should be refactored.

For more details about how data pages are structured, please see my answer to the following DBA.StackExchange question:

SUM of DATALENGTHs not matching table size from sys.allocation_units

Community
  • 1
  • 1
Solomon Rutzky
  • 46,688
  • 9
  • 128
  • 171
  • Soloman, Do you have a link that describes the 24-72 bytes of overhead that you speak of? I ask because I did and bit of a test and found that's very true and that it's not just the 16 byte pointer that gets stored. – Jeff Moden Jul 31 '19 at 15:11
  • Hi @JeffModen . Right, 16 byte pointers are mostly just for the deprecated LOB types (`IMAGE`, `TEXT`, and `NTEXT`), though that's not always true if using `sp_tableoption` to set `text in row` (not sure how many people even know of that option). Or, another infrequent scenario is using `sp_tableoption` to set `large values out of row` for the `MAX` types, which causes them to always use a 16-byte pointer. But please review link at the end of my answer as it points to a more detailed answer of mine that I just updated with more details on this, along with an external reference and test script. – Solomon Rutzky Jul 31 '19 at 21:00
  • Thanks Soloman. I found a link by Mark S. Rasmussen over http://improve.dk/what-is-the-size-of-the-lob-pointer-for-max-types-like-varchar-varbinary-etc/ – Jeff Moden Aug 01 '19 at 00:20
  • @JeffModen So, you didn't look at my other answer linked at the bottom of this answer? ;-) Not only is that same link in that other answer, but I have added some details not mentioned in Mark's blog post (and a link to a test script that walks you through most of it). Also, I found that the upper-bound of the Inline-Root was 42,000 while Mark's post states that the limit is only 40,000. This answer here was just a summary based on parts of that other answer. Please review that other answer (the "SUM of DATALENGTHs..." link). – Solomon Rutzky Aug 01 '19 at 01:34
  • No... I sure didn't. There was nothing in the title of the link you're referring to indicate there was anything I was looking for and the description above the link said that it was for more information on the structure of data pages, which I already know at the page level. I just needed a quick answer to the 24-72 question. Since you knew where the 24-72 number came from, it would have been nice if you posted a direct link on this thread. Now looking at that post, nice article, Solomon. – Jeff Moden Aug 01 '19 at 23:01
  • @JeffModen Sorry, wasn't meaning to be difficult. I just figured that there was too much info to put into a comment here, and that the link, while a great resource, didn't cover everything that I mentioned in that other answer. And I think that my test script covered a scenario or two that were not covered in Mark's post. And thanks, hopefully you got the info you were looking for. – Solomon Rutzky Aug 02 '19 at 18:14
1

When you use Varchar(MAX), data can be stored within the row(called a page) (if contents are <8000 bytes). If the contents are >8000 bytes, data is stored as a LOB ("off the page"), and only a reference to the actual location is stored within the page. I honestly don't know of any decent way to estimate the size of your entire database, considering the data may be any length in a Varchar(MAX) column.

Timothy Kanski
  • 1,861
  • 14
  • 20