
I have a very large read-only DB with ~30 tables. The total size of the DB is around 13TB, with the largest table around 4.5 TB. (There are 10 or so tables with size 1TB+, and then several smaller tables.) Currently the DB is split up into 8 data files, all in the PRIMARY filegroup.

I've applied PAGE-level data compression on some of the large tables, which has reduced the size of the DB to 10 TB or so; however, I'd really like to reclaim some of that space on disk. (This data set is READ-ONLY - it will never grow.)

I realize that shrinking the files will cause tons of fragmentation, which could be fixed by rebuilding all the indexes, but rebuilding the indexes will probably cause the files to grow all over again...argh!

Which leads to my question(s) about how to reclaim disk space after compression:

  • Is my only solution to copy all the tables / data into new filegroups with smaller files, drop the original tables, and then empty / drop or shrink the original files?
  • Is anyone aware of any script or tool that will help me decide the optimum file sizes I need?
  • Would the best practice be to (rough T-SQL sketch below):
    1. Create a new table on the new filegroup with a clustered index + PAGE compression
    2. INSERT … SELECT from the original table into the new table (with TF 610 and TABLOCK)
    3. Drop the original table
    4. Create the non-clustered indexes on the new filegroup
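
Roughly, this is the kind of thing I have in mind (the table, column, and filegroup names below are just placeholders):

```sql
-- 1. Create the new table on the new filegroup and cluster it with PAGE compression
CREATE TABLE dbo.BigTable_New
(
    Id        bigint       NOT NULL,
    SomeValue varchar(100) NULL
) ON [FG_Compressed];

CREATE UNIQUE CLUSTERED INDEX CIX_BigTable_New
    ON dbo.BigTable_New (Id)
    WITH (DATA_COMPRESSION = PAGE)
    ON [FG_Compressed];

-- 2. Copy the data; TABLOCK (plus TF 610, where it applies) to get minimal logging
INSERT INTO dbo.BigTable_New WITH (TABLOCK) (Id, SomeValue)
SELECT Id, SomeValue
FROM dbo.BigTable;

-- 3. Drop the original table
DROP TABLE dbo.BigTable;

-- 4. Re-create the non-clustered indexes on the new filegroup
CREATE NONCLUSTERED INDEX IX_BigTable_New_SomeValue
    ON dbo.BigTable_New (SomeValue)
    WITH (DATA_COMPRESSION = PAGE)
    ON [FG_Compressed];
```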

This seems like a big undertaking that will take a long time because I'm going to have to basically re-create my entire database...again. Is there a simpler solution that I'm missing?

capnsue
  • Also, is there a performance difference between creating the new clustered index in the new FG with DROP EXISTING, or by just doing INSERT..SELECT after creating the CLI on the new FG? – capnsue Feb 13 '17 at 18:36

1 Answer


Everything has been covered in this whitepaper: Data Compression: Strategy, Capacity Planning and Best Practices

After data compression has completed, the space saved is released to the respective data file(s). However, the space is not released to the filesystem, because the file size doesn’t reduce automatically as part of data compression.
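
For instance, you can check how much free space is sitting inside each data file (allocated to the file, but not yet returned to the OS) with a query along these lines:

```sql
-- Free space inside each data file of the current database (size columns are 8 KB pages)
SELECT  name AS logical_file_name,
        physical_name,
        size / 128 AS file_size_mb,
        CAST(FILEPROPERTY(name, 'SpaceUsed') AS int) / 128 AS space_used_mb,
        (size - CAST(FILEPROPERTY(name, 'SpaceUsed') AS int)) / 128 AS free_space_mb
FROM    sys.database_files
WHERE   type_desc = 'ROWS';
```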

There are several options to release the space to the filesystem by reducing the size of the file(s):

DBCC SHRINKFILE (or DBCC SHRINKDATABASE):
After the shrink, index fragmentation will increase. Use ALTER INDEX … REORGANIZE to address it, not REBUILD, because a rebuild needs free space to work in and will grow the files again.
Also be aware that DBCC SHRINKFILE is single-threaded and may take a long time to complete.
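
A minimal sketch of that approach (the logical file name, target size, and table name are placeholders to adjust per file):

```sql
-- Shrink one data file down to a target size in MB (single-threaded, can run for a long time)
DBCC SHRINKFILE (N'MyDatabase_Data1', 500000);   -- ~500 GB target, purely illustrative
GO

-- Then deal with the fragmentation the shrink introduced, table by table,
-- using REORGANIZE rather than REBUILD so the files do not grow again
ALTER INDEX ALL ON dbo.BigTable REORGANIZE;
GO
```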

If you are compressing all the tables in a filegroup:
- Create a new filegroup.
- Move tables and indexes to the new filegroup while compressing.
After all the tables and indexes from the old filegroup have been compressed and moved to the new filegroup, the old filegroup and its file(s) can be removed to release space to the filesystem.
Be aware of a caveat in this method. If the table has a LOB_DATA allocation unit in the same filegroup, then this method will not move the LOB_DATA to the new filegroup (only the IN_ROW_DATA and ROW_OVERFLOW_DATA allocation units are moved when a clustered index is re-created in a different filegroup). So the old filegroup will not be completely empty, and hence cannot be dropped.
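
One way to do the "move while compressing" step for an existing table is to rebuild its clustered index onto the new filegroup with DROP_EXISTING (index, table, and filegroup names below are placeholders, and this is exactly where the LOB_DATA caveat above applies):

```sql
-- Re-create the existing clustered index on the new filegroup with PAGE compression.
-- This moves the IN_ROW_DATA and ROW_OVERFLOW_DATA allocation units, but NOT LOB_DATA.
CREATE UNIQUE CLUSTERED INDEX CIX_BigTable
    ON dbo.BigTable (Id)
    WITH (DROP_EXISTING = ON, DATA_COMPRESSION = PAGE)
    ON [FG_Compressed];

-- Each non-clustered index has to be moved and compressed the same way
CREATE NONCLUSTERED INDEX IX_BigTable_SomeValue
    ON dbo.BigTable (SomeValue)
    WITH (DROP_EXISTING = ON, DATA_COMPRESSION = PAGE)
    ON [FG_Compressed];
```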

One more option, again if you are compressing all the tables in a filegroup: create an empty table in the new filegroup, compress it, and then copy the data over to the new table using INSERT … SELECT.
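
Once the data has been copied over (the steps in your question cover that part), you can verify that nothing is left on the old filegroup and then release its space. A rough sketch using placeholder database/file/filegroup names; note that the PRIMARY filegroup itself and the primary data file can never be removed, so for PRIMARY you would shrink the remaining file instead of dropping the filegroup:

```sql
-- What is still allocated on the old filegroup (including any LOB_DATA left behind)?
SELECT  OBJECT_SCHEMA_NAME(p.object_id) AS schema_name,
        OBJECT_NAME(p.object_id)        AS table_name,
        au.type_desc                    AS allocation_unit_type,
        SUM(au.total_pages) / 128       AS size_mb
FROM    sys.allocation_units AS au
JOIN    sys.data_spaces      AS ds
        ON ds.data_space_id = au.data_space_id
JOIN    sys.partitions       AS p
        ON au.container_id = CASE au.type WHEN 2 THEN p.partition_id ELSE p.hobt_id END
WHERE   ds.name = N'FG_Old'                              -- old filegroup name (placeholder)
  AND   OBJECTPROPERTY(p.object_id, 'IsUserTable') = 1
GROUP BY p.object_id, au.type_desc;

-- Once nothing is left, empty and remove the old file(s);
-- a non-PRIMARY filegroup can then be dropped
DBCC SHRINKFILE (N'MyDatabase_Data2', EMPTYFILE);
ALTER DATABASE MyDatabase REMOVE FILE MyDatabase_Data2;
-- ALTER DATABASE MyDatabase REMOVE FILEGROUP FG_Old;
```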

TheGameiswar
  • Thanks, I've read that whitepaper, should have mentioned that. I was wondering if anyone had written scripts or something to automate this process - choosing the correct file sizes, etc. - or had some solution that didn't require moving all this data around again... – capnsue Feb 13 '17 at 18:32