
The company I work for is building a data mart that needs to retain 7 years of data. Unfortunately, one table is already well over 1 billion records.

My question is this: what would be the best way to keep this table current? (Daily update or quicker)

I know the MERGE statement is quite beneficial for this, but I'm hoping not to have to scan through 1 billion records for each MERGE. Table partitioning is out, as we do not have the Enterprise edition of SQL Server.

Any direction would be greatly appreciated :)

  • [Table partition](https://learn.microsoft.com/en-us/sql/relational-databases/partitions/create-partitioned-tables-and-indexes?view=sql-server-2017). Partition the table based on age of the data. You can even store each partition on its own storage volume. – Code Different Jun 05 '18 at 21:00
  • Table Partitioning, Compression, and Columnstore indexes are available in all editions of SQL Server 2016 SP1 and later. – David Browne - Microsoft Jun 05 '18 at 21:16
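
A minimal sketch of the partitioning approach suggested in the comments above, assuming SQL Server 2016 SP1 or later and made-up object names (`dbo.FactSales`, `SaleDate`), with one partition per year:

```sql
-- One partition per year on a date column. All names and boundary values here
-- are assumptions; each range could be mapped to its own filegroup/volume.
CREATE PARTITION FUNCTION pf_FactByYear (date)
    AS RANGE RIGHT FOR VALUES ('2012-01-01', '2013-01-01', '2014-01-01',
                               '2015-01-01', '2016-01-01', '2017-01-01', '2018-01-01');

CREATE PARTITION SCHEME ps_FactByYear
    AS PARTITION pf_FactByYear ALL TO ([PRIMARY]);  -- or map each partition to its own filegroup

CREATE TABLE dbo.FactSales
(
    SaleID     bigint        NOT NULL,
    SaleDate   date          NOT NULL,
    CustomerID int           NOT NULL,
    Amount     decimal(18,2) NOT NULL
) ON ps_FactByYear (SaleDate);   -- rows land in the partition matching their SaleDate
```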

2 Answers


There are several options here; two are explained in the comments above. The answer depends on which operations you want to perform on the records. If you only need to modify recent records, the best approach is to keep those in an active table and move everything else to an archive table. For that you need a scheduled job that moves the no-longer-needed records into the archive table.
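
A minimal sketch of such a scheduled archive job, assuming hypothetical `dbo.FactSales` / `dbo.FactSalesArchive` tables with identical column layouts and a `SaleDate` column to age records on:

```sql
-- Move rows older than the retention cutoff from the active table to the archive
-- in small batches, so the job never holds one huge transaction.
-- All table/column names and the 90-day cutoff are assumptions.
DECLARE @cutoff date = DATEADD(DAY, -90, CAST(GETDATE() AS date));

WHILE 1 = 1
BEGIN
    DELETE TOP (50000) FROM dbo.FactSales
    OUTPUT deleted.* INTO dbo.FactSalesArchive   -- archive table must match the column layout
    WHERE SaleDate < @cutoff;

    IF @@ROWCOUNT = 0 BREAK;   -- nothing left to move
END;
```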

If you also want a reporting module, you may need an additional summary table that holds pre-aggregated data, so that you can extract the reports you want without querying the full table.
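
For example, a nightly rollup into a hypothetical summary table might look like this (the table names, columns, and daily grain are all assumptions):

```sql
-- Rebuild a small pre-aggregated table that reports can hit instead of the
-- billion-row fact table.
TRUNCATE TABLE dbo.FactSalesDailySummary;

INSERT INTO dbo.FactSalesDailySummary (SaleDate, TotalAmount, SaleCount)
SELECT SaleDate, SUM(Amount), COUNT(*)
FROM dbo.FactSales
GROUP BY SaleDate;
```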

Afshin Amiri

You will want to seriously consider splitting the table. For example, see the operational versus archive paradigm.

The first step in splitting up such a massive table is to identify the clustered index (if it has one) and all the other indexes as well, because you will want to avoid operations that cause big rebuilds and data shifting.
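
One way to take that inventory is to query the catalog views; `dbo.FactSales` below is a placeholder for your actual table name:

```sql
-- List every index on the big table with its key columns in order, so you know
-- what a split or reload would have to preserve.
SELECT  i.name         AS index_name,
        i.type_desc,                      -- CLUSTERED / NONCLUSTERED
        i.is_unique,
        c.name         AS column_name,
        ic.key_ordinal,
        ic.is_included_column
FROM sys.indexes        AS i
JOIN sys.index_columns  AS ic ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns        AS c  ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'dbo.FactSales')
ORDER BY i.index_id, ic.key_ordinal;
```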

Otherwise, if you need to keep the status quo, then with good indexes (bite the bullet and define one if the table has somehow survived this long without any) you can rely on the query optimizer to do a quick index seek, which is much better than a scan, especially the table scan you seem to be worried about. So just write your MERGE statement, make sure the ON clause matches indexed columns, and avoid using functions on indexed columns at all costs.
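
A hedged sketch of such a MERGE, assuming a staging table of daily changes and an indexed key column; every object and column name here is a placeholder:

```sql
-- The ON clause compares raw indexed columns on both sides (no functions, no
-- expressions), so the optimizer can seek into the billion-row target rather
-- than scan it.
MERGE dbo.FactSales AS tgt
USING staging.FactSalesDelta AS src
      ON tgt.SaleID = src.SaleID
WHEN MATCHED THEN
    UPDATE SET tgt.SaleDate   = src.SaleDate,
               tgt.CustomerID = src.CustomerID,
               tgt.Amount     = src.Amount
WHEN NOT MATCHED BY TARGET THEN
    INSERT (SaleID, SaleDate, CustomerID, Amount)
    VALUES (src.SaleID, src.SaleDate, src.CustomerID, src.Amount);
```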

Elaskanator