1

I am trying to find which user records in a table are taking up the most space. To this end, I am using the DATALENGTH function in SqlServer.

SELECT 
    UserName, 
    SUM(
        ISNULL(DATALENGTH(columnA), 1) +
        ISNULL(DATALENGTH(columnB), 1) +
        ....
        ISNULL(DATALENGTH(columnZ), 1) +
        )/1000000 AS SizeInMegaBytes
FROM MyTable
GROUP BY UserName
ORDER BY SizeInMegaBytes DESC

Results:

+----------+-----------------+
| UserName | SizeInMegaBytes |
+----------+-----------------+
|   User1  |       1700      |
+----------+-----------------+
|   User2  |       1504      |
+----------+-----------------+
|   ....   |       ....      |
+----------+-----------------+
|  User75  |        20       |
+----------+-----------------+

Total Size = 16,523 MB

The only problem is that the results don't match up with the size of the table. I use the built-in stored procedure to get the size of the table

sp_spaceused [MyTable]

Results:

+---------+-------+-------------+-------------+------------+-------------+
| name    | rows  | reserved    | data        | index_size | unused      |
+---------+-------+-------------+-------------+------------+-------------+
| MyTable | 61477 | 59425416 KB | 42482152 KB | 62584 KB   | 16880680 KB |
+---------+-------+-------------+-------------+------------+-------------+

The stored procedure shows the total data size as 42 GB yet the query of all the columns shows 16 GB. What could be taking up the extra space if I have accounted for the size of all the columns?

EDIT - I don't think my issue is the same as the duplicate mentioned because here I am taking the SUM of all the grouped records while the previous question did not. There seems to be such a large disparity between the SUM of the DataLength function and the results of sp_spaceused (29 GB) I don't think it could be accounted for by indexes or header information alone.

webworm
  • 10,587
  • 33
  • 120
  • 217
  • 3
    Possible duplicate of [Difference between sp\_spaceused and DataLength SQL Server](https://stackoverflow.com/questions/768320/difference-between-sp-spaceused-and-datalength-sql-server) – MarkD May 24 '17 at 16:08

1 Answers1

0

First; your math is suspect; 1MB = (1KB * 1KB) = 1024B * 1024B

Second; there could be metadata associated with table and records. Inspecting the table definition can give insight here.

Calvin Taylor
  • 664
  • 4
  • 15
  • I see what you are saying about the math, however, if I change the query to `/ 1048576` is still doesn't account for the difference between 16 GB and 42 GB. Where is the extra 26 GB coming from? – webworm May 24 '17 at 16:19
  • If I am remembering correctly, space used is a combination of multiple fields. You may want to check the documentation [here](https://learn.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-spaceused-transact-sql) – Jacob H May 24 '17 at 16:22
  • @Jacob - I think you are right. That is why I included all columns in the SELECT statement – webworm May 24 '17 at 16:29
  • 26G is a lot of missing data. I'm not sure indexes would account for it. This thread has a script one user wrote as well to do what you're trying to do, it might be worth checking out. https://social.msdn.microsoft.com/Forums/sqlserver/en-US/4ce0aaac-11a1-407b-b2ee-fdd588b52e37/calculating-table-size?forum=sqltools – Calvin Taylor May 25 '17 at 14:08